Data modeling

Traditional modeling techniques / structured data

The traditional modelling techniques as outlined by Kimball and Inmon are focussed on modelling structured data.

Kimball / dimensional modeling

Kimball's approach is also known as dimensional modeling.
The key components of Kimball's dimensional modeling are fact tables and dimension tables.
A fact table stores numerical values (such as sales revenue). In SQL, these values are typically combined with aggregate functions such as SUM.
A dimension table stores the entities by which the numerical values are aggregated, using the SQL GROUP BY clause.
A fact table has foreign keys that reference the corresponding primary keys of the dimension tables.
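
To make the relationship between fact and dimension tables concrete, here is a minimal sketch using Python's built-in sqlite3 module. The table names (fact_sales, dim_product, dim_store), the columns and the sample rows are made up for illustration and do not come from Kimball.

```python
import sqlite3

# In-memory database; all table and column names are hypothetical.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    -- Dimension tables: the entities the facts are grouped by.
    CREATE TABLE dim_product (
        product_id   INTEGER PRIMARY KEY,
        product_name TEXT NOT NULL
    );
    CREATE TABLE dim_store (
        store_id   INTEGER PRIMARY KEY,
        store_city TEXT NOT NULL
    );
    -- Fact table: a numerical measure plus foreign keys to the dimensions.
    CREATE TABLE fact_sales (
        product_id INTEGER NOT NULL REFERENCES dim_product (product_id),
        store_id   INTEGER NOT NULL REFERENCES dim_store (store_id),
        revenue    REAL NOT NULL
    );
    INSERT INTO dim_product VALUES (1, 'Widget'), (2, 'Gadget');
    INSERT INTO dim_store VALUES (1, 'Bern'), (2, 'Basel');
    INSERT INTO fact_sales VALUES
        (1, 1, 10.0), (1, 2, 15.0), (2, 1, 7.5), (2, 2, 2.5), (1, 1, 5.0);
""")

# Aggregate the measure (SUM) and group it by a dimension attribute.
revenue_by_product = conn.execute("""
    SELECT p.product_name, SUM(f.revenue) AS total_revenue
    FROM fact_sales AS f
    JOIN dim_product AS p ON p.product_id = f.product_id
    GROUP BY p.product_name
    ORDER BY p.product_name
""").fetchall()

print(revenue_by_product)  # [('Gadget', 10.0), ('Widget', 30.0)]
```

Grouping by store_city instead only requires joining dim_store rather than dim_product; the fact table stays unchanged.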
There are three types of dimensional modeling:
  • Star model (which is really a special, denormalized case of the snowflake model)
  • Snowflake model (where a star model's dimensions are normalized into multiple related tables)
  • Multi-star model
Because the snowflake model is normalized, it tends to require less storage space, but it requires ETL processes that make sure the data is loaded adequately.
Compared to the snowflake model, the star model aims to make queries faster and joining tables easier.
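
The difference in join effort can be sketched as follows, again with sqlite3. The schema is hypothetical: a product dimension that either carries the category name itself (star) or points to a separate category table (snowflake). Both variants share one fact table to keep the sketch short.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    -- Star variant: category_name is stored redundantly in the dimension.
    CREATE TABLE star_dim_product (
        product_id    INTEGER PRIMARY KEY,
        product_name  TEXT NOT NULL,
        category_name TEXT NOT NULL
    );
    -- Snowflake variant: the category is normalized into its own table.
    CREATE TABLE snow_dim_category (
        category_id   INTEGER PRIMARY KEY,
        category_name TEXT NOT NULL
    );
    CREATE TABLE snow_dim_product (
        product_id   INTEGER PRIMARY KEY,
        product_name TEXT NOT NULL,
        category_id  INTEGER NOT NULL REFERENCES snow_dim_category (category_id)
    );
    CREATE TABLE fact_sales (
        product_id INTEGER NOT NULL,
        revenue    REAL NOT NULL
    );
    INSERT INTO star_dim_product VALUES
        (1, 'Widget', 'Hardware'), (2, 'Gadget', 'Hardware'), (3, 'Manual', 'Books');
    INSERT INTO snow_dim_category VALUES (1, 'Hardware'), (2, 'Books');
    INSERT INTO snow_dim_product VALUES
        (1, 'Widget', 1), (2, 'Gadget', 1), (3, 'Manual', 2);
    INSERT INTO fact_sales VALUES (1, 10.0), (2, 20.0), (3, 5.0);
""")

# Star: one join from the fact table to the dimension.
star_result = conn.execute("""
    SELECT p.category_name, SUM(f.revenue)
    FROM fact_sales AS f
    JOIN star_dim_product AS p ON p.product_id = f.product_id
    GROUP BY p.category_name
    ORDER BY p.category_name
""").fetchall()

# Snowflake: a second join is needed to reach the category name.
snowflake_result = conn.execute("""
    SELECT c.category_name, SUM(f.revenue)
    FROM fact_sales AS f
    JOIN snow_dim_product AS p ON p.product_id = f.product_id
    JOIN snow_dim_category AS c ON c.category_id = p.category_id
    GROUP BY c.category_name
    ORDER BY c.category_name
""").fetchall()

print(star_result == snowflake_result)  # True
```

Both queries return the same totals. The star variant repeats 'Hardware' in every product row, while the snowflake variant stores it once at the cost of an extra join.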

Two-layered data load process

Kimball proposes a two-layered data load process consisting of

Challenges

According to a 2011 study by Gartner Group, 80% of the data at that time was unstructured (cited in "An Advanced Unstructured Data Repository", X. Liu, B. Lang et al.).
With the rise of unstructured and semi-structured data (text documents, images, video, IoT etc.) and the general increase in stored data volume, these traditional modelling techniques face growing pressure.

Models for unstructured data

Models that are able to deal with unstructured data include

Conceptual vs. physical vs. logical data models

Found at https://www.1keydata.com/datawarehousing/data-modeling-levels.html
Feature Conceptual Logical Physical
Entity names
Entity relationships
Attributes
Primary keys
Foreign keys
Table names
Column names
Column data types
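
A rough sketch of how a logical model might be turned into a physical one. It assumes the common reading in which the logical level describes entities, attributes and keys, and the physical level adds table names and column data types. The classes Attribute and Entity, the function to_ddl and all example names are hypothetical.

```python
from dataclasses import dataclass
from typing import Dict, List, Optional


@dataclass
class Attribute:
    """Logical level: an attribute, optionally a primary or foreign key."""
    name: str
    primary_key: bool = False
    references: Optional[str] = None  # name of the referenced entity, if any


@dataclass
class Entity:
    """Logical level: an entity with its attributes."""
    name: str
    attributes: List[Attribute]


def to_ddl(entity: Entity, table_names: Dict[str, str],
           column_types: Dict[str, str]) -> str:
    """Physical level: add table names and column data types to an entity."""
    columns = []
    for attr in entity.attributes:
        parts = [attr.name, column_types[attr.name]]
        if attr.primary_key:
            parts.append("PRIMARY KEY")
        if attr.references is not None:
            parts.append(f"REFERENCES {table_names[attr.references]}")
        columns.append(" ".join(parts))
    return f"CREATE TABLE {table_names[entity.name]} ({', '.join(columns)})"


# Logical model: entities, attributes, keys (no tables or types yet).
customer = Entity("Customer", [Attribute("customer_id", primary_key=True),
                               Attribute("name")])
order = Entity("Order", [Attribute("order_id", primary_key=True),
                         Attribute("customer_id", references="Customer")])

# Physical decisions: table names and column data types.
table_names = {"Customer": "dim_customer", "Order": "fact_order"}
column_types = {"customer_id": "INTEGER", "name": "TEXT", "order_id": "INTEGER"}

customer_ddl = to_ddl(customer, table_names, column_types)
order_ddl = to_ddl(order, table_names, column_types)
print(customer_ddl)
print(order_ddl)
```

The same logical model could be mapped to different physical models, for example with other table names or data types for another database system.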

See also

DWH modelling