Tuesday 31 December 2013

Dimensional Modelling Notes 2


How is a confirmed dimension where both the facts it is connected to are at different granularity.

Ans-- in this case it has to be a snowflake dimension. Otherwise there is no chance of joining two facts which are at different granularity. ( you can go for min function on the the hierachy then it can be done but tricky)
Example - consider one fact is at month level and other is at date level and they are connected by common date dimension

How can one time dimension be connected to two facts one at month and one at day level

Concept of bridge tables .. Why and when
In one line
single fact measurement is associated with multiple occurrences of a dimension, such as multiple customers associated with a single bank account balance

Bridge tables are used in two more complicated scenarios. The first is where a many-to-many relationship can’t be resolved in the fact table itself (where M:M relationships are normally handled) because a single fact measurement is associated with multiple occurrences of a dimension, such as multiple customers associated with a single bank account balance. Placing a customer dimension key in the fact table would require the unnatural and unreasonable divvying of the balance amongst multiple customers, so a bridge table with dual keys to capture the many-to-many relationship between customers and accounts is used in conjunction with the measurement fact table. Bridge tables are also used to represent a ragged or variable depth hierarchical relationship which cannot be reasonably forced into a simpler fixed depth hierarchy of many-to-one attributes in a dimension table.


•Many employees can have multiple jobs and each job can be performed by multiple employees
•Many patients can have multiple diagnosis and each diagnosis can be 'assigned' to many patients
•Many calls can have multiple call ticket types and each ticket type can belong to multiple calls

In the below diagram the money is given to the group of employees doing the job. It is not divided between individual employees in fact table so how will you store the relationship. If the money was distributed to individual employee then we would have hierarchy in employee table called employee group which can be drilled to employee
( What happens when a employee belongs to more than one hierachy ??? )

Very good article below on bridge tables


Notice that the payment is made for group of employees and not for individual employee so actually 100 is for both emp 1 and emp2 but when we add up across fact it will add up incorrectly.

So we need to create bridge table between employee and job type and it will have weighting factor also and we need to remove the join between join type and fact. This gets rid of many to many relationships

We cannot create bridge table between employee and fact because emp group will repeat as multiple employees are there in a group and one row for each employee so while joining with fact you will not have a primary key to join with

 

Bridges -- What is the cost of this bridge table to the entire query ???



How can you rollup two fact which are at different grains, stiched queries
Ans - Full outer joins

What are the performance implication of joining two facts at different grains


What is the impact of joining snapshot table and regular fact
Creating DM on DW using views to access DM beats purpose as you can never see history data
SCD type1 , type2 , type3 and hybrid models
When to use SCD type3 example with query
Bridge table and many to many relationships
Can two dimensions be combined what are the things to consider
When do you want to create a table holding dimension keys and to bypass fact
Junk dimension - in retails case study -- credit or cash
If you have 2 columns as junk and each of them takes 2 values then there are 4 rows junk dim will store all rows combination so the question is whether you want to store all the combination in advance  or insert rows for them as they come

What is the difference between degenerate and junk -- junk dimension are left in fact and junk dimension have their own dimension . Junk dimension values are usually low cardinaltiy yes or no

No comments:

Post a Comment