Sunday, November 28, 2010

Bitmap Indexes



Database Version: Oracle 10g
Operating System: Windows XP

I was working on tuning a procedure in my current company, when I found something interesting with my colleague. He has created a table with a bitmap index for one of the columns. The insert into the table was found to be very slow and when I checked in the process, he is actually doing a single row operation (of around 50K every day) which is actually taking couple of minutes to insert. There was actually an argument with him and he claims that the slow in insert operation is not due to bitmap index and it is because of something else. Further he claimed that BITMAP index is similar in B*Tree index. Not only him, many could not understand when we should create BITMAP INDEX and what actually it is.

In this article, I will explain what is BITMAP Index, their operational difference (or comparison) and when do we need to create BITMAP indexes. As always said, please do provide your comments.

What is BITMAP index?

Oracle created BITMAP index to cater the need for data warehousing and adhoc query mechanism. In general, usage of bitmap index in OLTP is not a highly advisable solution. In a bitmap index, a bitmap for each key value replaces a list of rowids. Each bit in the bitmap corresponds to a possible rowid, and if the bit is set, it means that the row with the corresponding rowid contains the key value. Oracle has created a proprietary mapping function that converts the bit position to an actual rowid, so that the bitmap index provides the same functionality as a regular index. Bitmap indexes store the bitmaps in a compressed way.

Bitmap indexes are most appropriate on low distinct cardinality data (i.e., data with relatively few discrete values when compared to the cardinality of the entire set. It is difficult to define what low distinct cardinality is?. This is data where the number of distinct items in the set of rows divided by the number of rows is a small number (near zero). For example, in a table of 10000 rows, 1 or 2 distinct valued column would be appropriate for a bitmap index. However, in a table of 1000000 rows, 100 or 200 distinct valued column would be appropriate for a bitmap index. I have seen people telling me that columns with more than 5 or 10 distinct columns cannot participate in a bitmap index, which is actually wrong.

In general, when the data is packed for bitmap index, it is not packed like b*tree index. In case of b*tree index, the leaf node will have a list of rowids. In case of bitmap index, the leaf node contains bitmap, with the value set if the key value is available and not set if not. Each leaf node contains the entry header, key, start rowid, end rowid and bitmap segment. A bitmap segment consisting of a string of bits. The bit is set when the corresponding row contains the key value and is unset when the row does not contain the key value. Entry header, which contains the information on the number of columns and lock information. The Oracle server uses a patented compression technique to store bitmap segments and also bitmap Index uses restricted rowid format

Difference between Bitmap and B*Tree Indexes

Bitmap indexes actually store pointers to many rows with a single index key entry, as compared to a B*Tree. In B*Tree index, we will be able to pair the index keys and the corresponding rows in a table. In a bitmap index, there will be a very small number of index entries, each of which points to many rows. In a conventional B*Tree, one index entry points to a single row.

When do we need BITMAP index?


We need BITMAP index for data warehousing application, Adhoc Query Processing, and when the data load is quiet large(and we have bulk processing). We should never use a BITMAP index for OLTP processing. We shall consider couple of scenarios where in we check on the usage of BITMAP index and study the response. On the first case, I will show what happens if we use BITMAP index for row-by-row insert and on the second case, we shall see the operational difference between b*tree index and Bitmap Index.

Part 2     Part 3   Part 4

No comments: