Kimball / dimensional modeling
Kimball's approach is also known as dimensional modeling.
The key components of Kimball's dimensional modeling are fact tables and dimension tables.
A fact table stores numerical values (such as sales revenue). In SQL, these values are typically aggregated with aggregate functions such as SUM or AVG.
A dimension table stores the entities by which the numerical values are aggregated, typically with the SQL GROUP BY clause.
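As a minimal sketch of how a fact table and a dimension table work together, the following Python snippet runs the SQL against an in-memory database using the standard-library sqlite3 module. The table names (fact_sales, dim_product), the column names and the sample rows are hypothetical and chosen only for illustration.

```python
import sqlite3

# In-memory database; all table names, column names and rows are
# illustrative assumptions, not part of Kimball's definition.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE dim_product (
        product_id INTEGER PRIMARY KEY,
        name       TEXT,
        category   TEXT
    );
    CREATE TABLE fact_sales (
        sale_id    INTEGER PRIMARY KEY,
        product_id INTEGER REFERENCES dim_product(product_id),
        revenue    REAL
    );
    INSERT INTO dim_product VALUES (1, 'Laptop', 'Hardware'),
                                   (2, 'Mouse',  'Hardware'),
                                   (3, 'Editor', 'Software');
    INSERT INTO fact_sales VALUES (1, 1, 1000.0),
                                  (2, 2, 25.0),
                                  (3, 3, 50.0),
                                  (4, 3, 50.0);
""")

# Aggregate the numeric fact (revenue) per dimension attribute (category).
revenue_by_category = conn.execute("""
    SELECT d.category, SUM(f.revenue) AS total_revenue
    FROM fact_sales AS f
    JOIN dim_product AS d ON d.product_id = f.product_id
    GROUP BY d.category
    ORDER BY d.category
""").fetchall()

print(revenue_by_category)
```

The fact table only holds the measure and a key; everything that describes the product lives in the dimension table, which is why the grouping column comes from dim_product.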
There are three types of dimensional modeling:
- Star model (a special, denormalized case of the snowflake model)
- Snowflake model (where a star model's dimensions are normalized into multiple related tables)
- Multi-star model
Because the snowflake model is normalized, it tends to require less storage space, but it relies on ETL processes to make sure that the data is adequately loaded.
The goal of the star model, compared to the snowflake model, is to make queries faster and joins between tables simpler.
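The difference between the two layouts can be sketched with the same product dimension modeled both ways. This is only an illustration using Python's standard-library sqlite3 module; all table names, column names and rows (star_dim_product, snow_dim_product, snow_dim_category, fact_sales) are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    -- Star variant: the category name is stored redundantly in the dimension.
    CREATE TABLE star_dim_product (
        product_id    INTEGER PRIMARY KEY,
        name          TEXT,
        category_name TEXT
    );
    -- Snowflake variant: the category is normalized into its own table.
    CREATE TABLE snow_dim_category (
        category_id   INTEGER PRIMARY KEY,
        category_name TEXT
    );
    CREATE TABLE snow_dim_product (
        product_id  INTEGER PRIMARY KEY,
        name        TEXT,
        category_id INTEGER REFERENCES snow_dim_category(category_id)
    );
    -- The fact table is identical in both variants.
    CREATE TABLE fact_sales (
        product_id INTEGER,
        revenue    REAL
    );
    INSERT INTO star_dim_product VALUES (1, 'Laptop', 'Hardware'),
                                        (2, 'Mouse',  'Hardware'),
                                        (3, 'Editor', 'Software');
    INSERT INTO snow_dim_category VALUES (10, 'Hardware'), (20, 'Software');
    INSERT INTO snow_dim_product VALUES (1, 'Laptop', 10),
                                        (2, 'Mouse',  10),
                                        (3, 'Editor', 20);
    INSERT INTO fact_sales VALUES (1, 1000.0), (2, 25.0), (3, 50.0);
""")

# Star: one join from the fact table to the denormalized dimension.
star_rows = conn.execute("""
    SELECT p.category_name, SUM(f.revenue)
    FROM fact_sales AS f
    JOIN star_dim_product AS p ON p.product_id = f.product_id
    GROUP BY p.category_name
    ORDER BY p.category_name
""").fetchall()

# Snowflake: an extra join is needed to reach the normalized category table.
snow_rows = conn.execute("""
    SELECT c.category_name, SUM(f.revenue)
    FROM fact_sales AS f
    JOIN snow_dim_product  AS p ON p.product_id  = f.product_id
    JOIN snow_dim_category AS c ON c.category_id = p.category_id
    GROUP BY c.category_name
    ORDER BY c.category_name
""").fetchall()

print(star_rows)
print(snow_rows)
```

Both queries return the same totals. The star variant repeats 'Hardware' in every matching product row but needs one join fewer, while the snowflake variant stores each category name once at the cost of an additional join.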
Two-layered data load process
Kimball proposes a two-layered data load process consisting of
Challenges
According to a study by Gartner Group in 2011, 80% of data was unstructured at that time (An Advanced Unstructured Data Repository, X. Liu, B. Lang et al.).
With the rise of unstructured and semi-structured data (text documents, images, video, IoT data etc.), along with the general increase in stored data volume, these traditional modeling techniques face growing pressure.