Informatica Cloud: How To Implement Type 1 SCD
Implementing slowly changing dimension with Informatica Cloud requires a little bit of extra effort compared to DataStage or any other ETL tools that have a change capture stage or SCD stage. This example uses hashed values to find out which records are updated, inserted or deleted. We used the CRC32 encoding which creates 10-digit integer values for the selected columns. The dimension table needs to store this value.
To compare hash values between the source and dimension tables, we used Joiner Transformation. The alternative is to use unconnected lookup to compare the value. We used AWS Redshift as source and target. Unconnected Lookup does not work on Redshift. It only works on flat file or relational connection (ODBC, MySQL, Oracle, SqlServer and MS_ACCESS).
In this example, we will create the employee_dim table from the employee and department table. Any update, insert, or delete will be reflected in the dimension table as it is Type 1.
Employee
Department
Employee_Dim
Steps
(1) Join Employee and Department in the source stage by using a custom query
(2) Configure Source to read the employee dimension table.
(3) Use CRC32 to create Hash_Value with all the relevant columns.
(4) Join Source and Target Table by the business key.
(5) Compare Hash_Value and create Change_Capture values.
(6) Use Router Transformation to separate the output into upsert and delete
(7) Map fields for upsert as below.
(8) Map field for delete as below. Only primary key is required for deletion.
Outcome
After initial run to create the dimension table, we ran update, insert and delete statements below on the source table.
Update Employee Set extension = '0000', dept_id = 'A3', effective_date = GetDate() Where empl_id = 102; |
Delete From Employee where empl_id = 109; |
Insert Into Employee Values (210, 'Blah B', '1111', 'blah@email.com', 'A1', 'Reporting', GetDate()); |
Source after running queries
Dimension Updated
0 Comments