Thursday, August 8, 2013

Slowly Changing Dimension

Slowly Changing Dimension

  • Often called as SCD, where the data of a dimension is changing slowly
  • Available in different types (SCD Type 0, Type 1, Type 2, Type 3, Type 6)
  • 1, 2 & 6 are most common

Type - 0

  • Where dimension changes are not considered
  • Even if the value gets changed in real scenario. 
  • It does not carry to the dimension and dimension still holds the old data

Type - 1

  • Where history is not maintained and table always show the latest data
  • Always updated with recent values
  • e.g. Before Change

Supplier_Key
Supplier_Code
Supplier_Name
Supplier_State
123
ABC
Acme Supply Co
CA

After Change

Supplier_Key
Supplier_Code
Supplier_Name
Supplier_State
123
ABC
Acme Supply Co
IL


Type - 2

  • Tracks historical updates by creating separate rows
  • Tracking can be done in 2 ways
  • 1st method is via Version

Supplier_Key
Supplier_Code
Supplier_Name
Supplier_State
Version.
123
ABC
Acme Supply Co
CA
0
124
ABC
Acme Supply Co
IL
1

So latest version # is the latest row, and that's how we preserve the history

  • 2nd method is via StartDate & EndDate

Supplier_Key
Supplier_Code
Supplier_Name
Supplier_State
Start_Date
End_Date
123
ABC
Acme Supply Co
CA
01-Jan-2000
21-Dec-2004
124
ABC
Acme Supply Co
IL
22-Dec-2004
NULL

Row with NULL EndDate is the latest row.

Type - 3

  • Tracks historical updates by creating separate columns
  • Type-2 is horizontal growth and Type-3 is vertical growth

Supplier_Key
Supplier_Code
Supplier_Name
Original_Supplier_State
Current_Supplier_State
123
ABC
Acme Supply Co
CA
IL

  • Due to difficult maintenance of vertical growth, only 1 level history is being tracked generally.
  • Let's say now state gets changed from IL to WA, then this is how the data will look

Supplier_Key
Supplier_Code
Supplier_Name
Original_Supplier_State
Current_Supplier_State
123
ABC
Acme Supply Co
IL
WA

So we lose the prior history.

Type - 4

  • Tracks history in a separate table
  • "Supplier" table
Supplier_key
Supplier_Code
Supplier_Name
Supplier_State
123
ABC
Acme Supply Co
IL
  • "SupplierHistory" table
Supplier_key
Supplier_Code
Supplier_Name
Supplier_State
Create_Date
123
ABC
Acme Supply Co
CA
22-Dec-2004

  • One table will maintain the latest data 
  • Other will keep on maintaining historical updates with CreationDate

Type -6

  • It is hybrid of Type-1,2 & 3
Supplier_Key
Supplier_Code
Supplier_Name
Current_State
Historical_State
Start_Date
End_Date
Current_Flag
123
ABC
Acme Supply Co
IL
CA
01-Jan-2000
21-Dec-2004
N
124
ABC
Acme Supply Co
IL
IL
22-Dec-2004
31-Dec-9999
Y

  • StartDate, EndDate will identify during which period that record was active
  • CurrentFlag will identify which is the latest record and which is the historical record
  • CurrentXYZ column will always hold the latest value for all the records

No comments:

Post a Comment