Slowly Changing Dimensions (SCDs) refer to the concept of managing and tracking changes in dimension data over time within a data warehouse. A dimension is typically a descriptive attribute related to facts in a data warehouse, such as customer, product, or location information. Since this data can change over time (e.g., a customer changes their address), it's crucial to handle these changes appropriately in the data warehouse to maintain accurate historical data.
Types of Slowly Changing Dimensions:-
There are primarily three types of Slowly Changing Dimensions:
1. Type 1 (Overwrite):-
- Definition: In Type 1 SCD, when a change occurs in the dimension data, the old value is overwritten with the new one. This method does not maintain any historical data.
- Use: Type 1 is used when historical accuracy is not important or when only the current value of the data is needed.
- Example:
- Scenario: A customer’s last name is spelled incorrectly in the database.
- Action: When the error is corrected (e.g., "Jonhson" corrected to "Johnson"), the old value "Jonhson" is overwritten with the correct value "Johnson", and no history of the old value is kept.
2. Type 2 (Add New Row):-
- Definition: In Type 2 SCD, a new row is added to the dimension table each time a change occurs. This method maintains a full history of all changes.
- Use: Type 2 is used when it is important to keep track of historical data, such as tracking the history of a customer’s address over time.
- Example:
- Scenario: A customer moves from one city to another.
- Action: A new row is added to the dimension table with the new address, while the old address is retained in the existing row. Additional columns, such as Start_Date and End_Date, may be used to indicate the period during which each address was valid.
3. Type 3 (Add New Column):-
- Definition: In Type 3 SCD, a new column is added to the dimension table to store the old value when a change occurs. This method typically tracks limited historical information, often only the previous value.
- Use: Type 3 is used when changes are infrequent, and only the most recent change needs to be tracked along with the current value.
- Example:
- Scenario: A product changes its category from "Electronics" to "Home Appliances."
- Action: A new column is added to the dimension table (e.g., Previous_Category), which stores the old value Electronics, while the Current_Category column holds the new value Home Appliances.
Real-Life Examples of Slowly Changing Dimensions:-
Example 1: Customer Address Change (Type 2):-
- Scenario: A bank maintains customer information, including addresses. Over time, customers move to new locations.
- Implementation: Using Type 2 SCD, each time a customer changes their address, a new record is added to the `Customer` dimension table with the updated address. This approach ensures that the bank can analyze data based on the customer's location at any given time in the past.
Example 2: Employee Role Change (Type 3):-
- Scenario: An HR system tracks employee job titles. Occasionally, employees get promoted or move to different roles within the company.
- Implementation: Using Type 3 SCD, when an employee’s job title changes, the old title is stored in a new column (e.g., Previous_Job_Title), while the current title is stored in the existing Current_Job_Title column. This allows HR to easily see both the current and previous roles of an employee.
Example 3: Product Price Update (Type 1):-
- Scenario: A retail company updates the prices of its products periodically.
- Implementation: Using Type 1 SCD, whenever the price of a product is updated, the old price is overwritten with the new price in the `Product` dimension table. Historical pricing information is not retained since only the current price is relevant for day-to-day operations.
Use of Slowly Changing Dimensions:-
- Historical Data Tracking: SCDs are crucial for maintaining accurate historical data, which is vital for trend analysis, auditing, and reporting.
- Accurate Reporting: They enable accurate and consistent reporting, ensuring that business decisions are based on complete historical data.
- Compliance and Auditing: Certain industries require tracking changes over time to comply with regulatory standards, making SCDs essential for compliance.