Challenge
When
we group by Doctor to get the total
amount and Total patient count which are under
the doctor the total amount is correct because there is a entry in fact
table for this. Consider the case of total patient under doctor that is
incorrect because not all patients have a bill amount. For this we need to
design a factless fact table
Factless facts
are of two
types -- EVENT TRACKING and COVERAGE
a event tacking only says who took part in the event while a coverage
allows you to answer who did not take part
Answer is
Simple -- Create a Factless fact table which has entry for each
doctor and patient . Sum up the values for Provider you will get number of
patients . Then sum up the billing fact details based on provider . Then join
both facts based on provider keys and then join them with dimension.
The
reason for joining with dimensions later is size of the tables for joining
would increase and it would require additional hash joins. Hash joins are CPU
intensive and will slow down your system also consider your dimension is 1GB
with 50000 doctors and 200 million records for fact you are repeating the
doctor name for each of 200 million records and doctor name is 100 KB field
then you are adding that much size so your temp size the amount of memory for
PGA everything will increase this will cause a slowdown.
No comments:
Post a Comment