Thursday, August 8, 2013

Rapidly Changing Dimension

  • Dimension where data changes are happening very rapidly
  • "Slowly Changing Dimension" blog tells how we can maintain the history of changes
  • But if we really implement the same technic for RCD, it will end up creating N number of Type-2 records for same entity and hence we need to come up with a smart design
  • We can handle this by creating Junk Dimension
  • Let's take an example of Customer Dimension, assume we have following columns
    • CUSTOMER_KEY
    • CUSTOMER_NAME
    • CUSTOMER_GENDER
    • CUSTOMER_MARITAL_STATUS
    • CUSTOMER_TIER
    • CUSTOMER_STATUS
  • Now, Name, Gender, MaritalStatus gets a change very rare, once in a while but Tier & Status get the change very frequently, assume that we are not interested in historical changes of Name, Gender & Marital Status, holding only latest value should be fine, but we need to track the history changes of Tier & Status
  • Now create a junk dimension by removing Tier & Status from Original Dimension
  • So our Original Customer Dimension will be like this
    • CUSTOMER_KEY
    • CUSTOMER_NAME
    • CUSTOMER_GENDER
    • CUSTOMER_MARITAL_STATUS
  • Create a new junk dimension for Tier, Status and create another mapping dimension, so overall structure will look like this



  • By doing so, new type-2 record generation will happen only in a mapping table and not in the main table, so that way we can have easy mainteanance.
  • Only disadvantage is - it will introduce extra joins to get the full picture of history changes of a particular customer
  • But maintenance and storage vise, we will get benefits

No comments:

Post a Comment