I won’t go into this loads since it’s a fairly well established Kimball modelling approach for cubes…
Essentially what do you do if the grain of a dimension is lower than the fact grain when denormalised directly to the fact table? Since if you design it that way the resulting model will double count the measures without slow, unsuable and complex adjustments to query patterns. That’s the generic definition of the modelling problem or pattern.
This requires a many-to-many dimension model using a factless fact or a bridge table. Essentially we identify the other dimension entity that bridges the lower grain dimension onto the fact, create a bridge table and join it to the fact through the bridge table. We can then hide the bridge table and the users are completely unaware of this complexity and the cube engine takes care of the measure aggregation automatically.
All well and good if you have a good grasp of this design pattern. A classic use case that is more intuitive to think of is Customers having many Accounts and Accounts having many customers. Transactions are at an Account level but customers relate to accounts at lower level. However we don’t want to double count transactions when we aggregate a total.
So in cubes:
- SSAS Multi-Dimensional – has built in dimension relationship type for this feature
- SSAS Tabular – you had to get around it using DAX
- SSAS Tabular 2016, 2017 & Azure – Now also has a built-in relationship feature that handles this automatically – You don’t need to use DAX anymore or create several role playing customer dimensions.
Finally note this is also in Power BI; but not in Excel which is on a slower release cycle! You still have to use DAX last I checked in Excel. Hopefully Excel Power Pivot Models will catch-up soon.
Here’s a simple example of how it’s modelled:
And here it is in Excel; see that accounts total correctly.
When we break it out by customer we see that it aggregates correctly at the lower grain for individual customers but the total does not double count. E.g. Both Shaun and Sarah have 15 in account 001 but the total is 15 for account 001 because they share the same account and it doesn’t double count at the account level. Finally the grand total is 75 again because the engine doesn’t double count the measures for the total.
Again just by customer we see that it aggregates properly at the lower grain but the total is 75 not 90 which would be wrong. i.e. it doesn’t not double count the facts at customer level and shows the total correctly at account level.