Wednesday, 10 July 2013

Sql tuning with Bitmap indexes and Star schema transformation

Hi Guys

This Article is relavent only if you are using OLAP (Star schema/Snowflake) .Not if you are using OLTP systems(Online transcation processing) 

General note on Indexes -- B tree index (ones that we use on primary keys) works best when you have unique values. like emp_no . When you dont have unique values they still work good if your query is such that it retrieves less than 20% of data. (eg select * from emp where gender = Male and 20% of your staff is Male so only 20% of rows are retrieved )

If you query retrieves more than 20% data then oracle decides to use full scan instead of indexes. Bitmap indexes are slightly better because oracle still decides to use them even if your query retrieves 40% of total number of rows.

If you query is such that it retrieves more than 40% of rows then oracle will skip bitmap and go for full scan. The advantage bitmap have is rowid are sorted so it know which blocks of data to easily pick


You should remember that bitmap stores only bit value with rowid and the rowid are sorted
  
Rowid
Value of row
Rowid 1
Rowid 2
Rowid 3
Rowid 4
Rowid  5
male
1
1
0
1
0
Female
0
0
0
0
1

When to use B tree or Bitmap indexes

The comparison is not straight forward and it depends on the distribution of data obviously . Like a table which is having emp id (which is unique) distributed randomly wil perform poorly with a B tree index for range scan because of clustering factor . Because of random distribution of data it has to fetch multiple blocks .

Note - the word range scan is used because for equality predicate anyway it has to fetch a single block so it does not make a difference whether bitmap or b tree index

So here is one more question so why this random distribution of data does not apply to bitmap index and it works well . Even bitmap has to get rowid and then go to table to fetch blocks with those row id what is different.

Contrary to popular belief consider a table with 1 million entries and a column for male and female with bitmap index on it. 1/2 million male  and 1/2 million female and you filter by = male will oracle use bitmap index. Most probably it will go for  full scan because even with bitmap it has to get rowid and then fetch those blocks from table

For B tree index oracle uses 5-20% rule that is the data retrieved is within 20% it will go for index otherwiser full table scan is good . Same is case for bitmap but the range is more may be 50%. But consider out of 1 million 800 thousand are males then oracle will prefer full scan instead of bitmap.

So even the male female example for bitmap is not 100% true. You need to understand the data.


Bitmap index are even applied for null values whereas b tree indexes are not applied


If there are two bitmap indexes they can be combined to reduce number of records and then the records can be fetched this is the concept of star transformation

Found some interesting articles on Oracle capabilities for your datawarehouse .How to improve performance

Oracle document on Datawarehouse query improvement

Will write later after implementation.Planning to implement solution

Good Article on Bitmap and B-Tree indexes


http://www.oracle.com/technetwork/articles/sharma-indexes-093638.html

Hints Like /*+ STAR TRANFORMATION */  has huge impact 

Point to be remembered -- Bitmap indexes work together in finding the row required. You cannot have the same columsn with bitmap index and same column with B tree index and compare performance. 

If you have bitmap index on all your dimension seq key in your fact then when you filtered by dimension it can combine those bitmap indexes in finding the correct row. In case of B tree indexes it cannot combine them to get correct row. The more bitmap indexes you have and the more filters you apply on those column result will be faster.

It combines result of each index for zooming to output. 

 
 

No comments:

Post a Comment