Welcome to plsql4all.blogspot.com SQL, MYSQL, ORACLE, TERADATA, MONGODB, MARIADB, GREENPLUM, DB2, POSTGRESQL.

Tuesday, 28 May 2024

Snowflake Schema

A Snowflake Schema is a type of database schema that is an extension of the star schema. It organizes data into a central fact table connected to multiple dimension tables, which are further normalized into multiple related tables, creating a snowflake-like structure. This normalization eliminates redundancy and reduces the disk storage requirements.


Key Characteristics:-


1. Normalized Dimension Tables:

   - Unlike the star schema, where dimension tables are typically denormalized, in the snowflake schema, dimension tables are normalized into multiple related tables.

   

2. Hierarchical Structure:

   - The snowflake schema has a more complex, hierarchical structure due to the normalization of dimension tables.


3. Improved Data Integrity:

   - Normalization reduces redundancy, leading to improved data integrity and consistency.


4. Complex Queries:

   - The snowflake schema can handle complex queries, though they may be slower due to the need for multiple joins.


5. Reduced Storage:

   - By normalizing the tables, the snowflake schema uses less disk space compared to a star schema.


Example:-


Consider a sales data warehouse. In a star schema, you might have a single dimension table for each dimension. In a snowflake schema, these dimension tables are further normalized.


Star Schema:

- Fact Table: Sales (sales_id, product_id, customer_id, time_id, amount)

- Dimension Tables:

  - Product (product_id, product_name, product_category)

  - Customer (customer_id, customer_name, customer_address)

  - Time (time_id, date, month, year)


Snowflake Schema:

- Fact Table: Sales (sales_id, product_id, customer_id, time_id, amount)

- Dimension Tables:

  - Product:

    - Product (product_id, product_name, category_id)

    - Category (category_id, category_name)

  - Customer:

    - Customer (customer_id, customer_name, address_id)

    - Address (address_id, street, city, state)

  - Time:

    - Time (time_id, date, month_id, year)

    - Month (month_id, month_name, quarter_id)

    - Quarter (quarter_id, quarter_name, year)


Advantages:-


1. Reduced Redundancy: Normalization eliminates redundancy and leads to efficient storage.

2. Improved Data Integrity: Reducing data redundancy enhances data consistency and integrity.

3. Flexible Design: Better suited for handling complex queries and data models.


Disadvantages:-


1. Complexity: More complex than star schemas, making it harder to understand and manage.

2. Performance: Queries can be slower due to multiple joins required to retrieve data from normalized tables.

3. Maintenance: More tables mean higher maintenance effort, especially when updating or modifying the schema.


Use Cases:-

Snowflake schemas are typically used in data warehouses where the goal is to reduce storage costs and improve data integrity. They are suitable for complex queries that require detailed data analysis and reporting.


The snowflake schema is a powerful design for organizing and structuring data in a database, especially when data integrity and storage optimization are crucial. However, its complexity can make it challenging to implement and maintain compared to simpler schemas like the star schema.

No comments:

Post a Comment

Please provide your feedback in the comments section above. Please don't forget to follow.