Focal Point
[Solved] Building SCD Type 2 Using Histroic Data

This topic can be found at:
https://forums.informationbuilders.com/eve/forums/a/tpc/f/1381057331/m/9487021236

July 11, 2013, 09:55 AM
ddornan
[Solved] Building SCD Type 2 Using Histroic Data
I'm trying to figure out how best to use DMC built-in SCD Type 2 dimension features using historic data i.e. I want to capture the change that has occured over the last 5 years.

I'm having difficulty using the SCD feature as the begin date defaults to the date and time of the dimension load and unlike the end date, there doesn't appear to be options to define this date using your source data. Has anyone else had this need?

Secondly, it looks like you need to select either select tracking type 2 SCD using begin/end dates or by using an activation flag. I like to use both date ranges and a current flag in my Type 2 SCD. Has anyone been able to incorporate both of these using the built-in SCD functionality.

Thanks,

David

This message has been edited. Last edited by: Kathleen Butler,


Webfocus 8.0
HTML
July 11, 2013, 11:05 AM
Clif
Answering your questions in reverse order.

Actually you can select BOTH activation flag and begin/end dates. I would recommend doing just that.

You can specify alternate values for SCD dates as global variables. Search online help for "Slowly Changing Dimension Variables" and you'll see that you can change the Run Date (defaults to today) as well as the end date for Type II active (defaults to NULL, many users prefer a day far in the future) and inactive (defaults to today, many users prefer yesterday).

Thus if you have a timestamped source for historic data you can run your dimension load repeatedly for all the dates in the last five years on which there were changes. Use DataMigrator's iterator capability to run for multiple run dates.


N/A
July 11, 2013, 12:16 PM
ddornan
Thanks for the quick reply Clif.

I have both the activation flag and begin/end date populating correctly. I also have been able to set the modify the start date by setting the &&CM__SCDATE in a procedure prior to the data flow as you suggested . . . Thanks.

However, if I'm trying to track something like changes to employee appointments and have historic data similiar to below:

DATE EMP POSITION SALARY
01-JUL-2010 TED SECURITY GUARD 30,000
12-DEC-2012 TED PORTER 35,000
01-OCT-2012 TED IT ANALYST 50,000
01-FEB-2012 TED MGR IT 80,000
01-AUG-2009 RALPH MGR IT 81,000
20-JAN-2012 RALPH CIO 120,000

How can I use the date field to modify the &&CM__SCDATE during the data flow. Your suggestion of running the job iteratively works when you have a series of snapshots . . . however, in this situation we would have to run the job for every unique date value . . . Is this the best way.


Webfocus 8.0
HTML
July 12, 2013, 10:15 AM
Clif
Yes, although not every day for the past five years, just those on which there were changes.

So run SELECT DISTINCT DATE FROM SOURCE to create a parameter file to use with the iterator.

That's the best I've come up with. Presumably you only getting the historic data once to get up-to-date. Then you are just processing new changes.


N/A
July 12, 2013, 12:11 PM
ddornan
Thanks for the help Clif


Webfocus 8.0
HTML
July 12, 2013, 01:26 PM
ddornan
Clif,

Is there restriction on the data type used for the dimension start and end dates . . . the reason I'm asking, is that I created a test dimension load yesterday that included setting the &&CM__SCDATE variable and it worked. Now I just ran my dimension loads none of them were able to reference the &&CM__SCDATE value that I set in a pre-flow procedure.

Here's the contents of my pre-flow procedure:

-SET &&CM__SCDATE=20100501;
-SET &&CM__SCDEADT=30000101;

Any ideas?


Webfocus 8.0
HTML
July 12, 2013, 02:02 PM
ddornan
Ignore the last post.

I figured out what I did differently today and yesterday. Yesterday I put the pre-flow process within same data flow file. Today, I created a master flow process and tried to set the &&CM__SCDATE in a procedure and then called another data flow.


Webfocus 8.0
HTML
July 31, 2014, 06:35 AM
VS4
Hi,
I have implemented SCD2 Type using in built IWAY functionality. I am using all SCD fields which are current flag,begin date , end date.
but my problem is after the first run for any subsequnt change to the record I want the startdate for the new record to be the end date of the previos record.

Please suggest how can I implement this ?

Regards
Venkat

This message has been edited. Last edited by: VS4,


WebFOCUS8,IWAY DM
Windows, All Outputs
July 31, 2014, 11:21 AM
Clif
That's the default behaviour. When a Type II change is detected the end date of the then current record is changed from NULL (by defualt) to TODAY (by defualt); the new record is inserted with begin date of TODAY and end date of NULL. It's now the current (active) record.


N/A
July 31, 2014, 12:19 PM
VS4
Yes thanks Clif.It was my mistake earlier while passing the varibles , its working fine now.

Regards
Venkat


WebFOCUS8,IWAY DM
Windows, All Outputs