Thinking Out Loud

May 13, 2011

Index Not Being Used – Fragmentation

Filed under: oracle — mdinh @ 3:41 am

I was experiencing FTS on SELECT for 1 record.

SQL> select A.vk_accn_test_seq_id, count(*) FROM accn_phys A group by A.vk_accn_test_seq_id;

VK_ACCN_TEST_SEQ_ID   COUNT(*)
------------------- ----------
            2268482          1
            3765572          1
            1038777          1
            2053978          1
            3639618          1
            3940603          1
             628110          1 +++
            2639137          1
                  0     469892

9 rows selected.

SQL> SELECT pk_seq_id FROM accn_phys A WHERE A.vk_accn_test_seq_id=628110;

 PK_SEQ_ID
----------
     95100

Elapsed: 00:00:00.05

Execution Plan
----------------------------------------------------------
Plan hash value: 3167157579

-------------------------------------------------------------------------------
| Id  | Operation         | Name      | Rows  | Bytes | Cost (%CPU)| Time     |
-------------------------------------------------------------------------------
|   0 | SELECT STATEMENT  |           | 51408 |   401K|   704   (0)| 07:27:00 |
|*  1 |  TABLE ACCESS FULL| ACCN_PHYS | 51408 |   401K|   704   (0)| 07:27:00 |
-------------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------

   1 - filter("A"."VK_ACCN_TEST_SEQ_ID"=628110)

Statistics
----------------------------------------------------------
          0  recursive calls
          0  db block gets
       4662  consistent gets
          0  physical reads
          0  redo size
        528  bytes sent via SQL*Net to client
        519  bytes received via SQL*Net from client
          2  SQL*Net roundtrips to/from client
          0  sorts (memory)
          0  sorts (disk)
          1  rows processed

SQL> select column_name, num_distinct, num_nulls, num_buckets, density, histogram
     from user_tab_col_statistics
     where table_name = 'ACCN_PHYS' and column_name='VK_ACCN_TEST_SEQ_ID'
     order by column_name;  2    3    4    5    6    7    8    9

COLUMN_NAME                    NUM_DISTINCT  NUM_NULLS NUM_BUCKETS    DENSITY HISTOGRAM
------------------------------ ------------ ---------- ----------- ---------- ---------------
VK_ACCN_TEST_SEQ_ID                       9          0           1 .111111111 NONE

SQL> select index_name from user_ind_columns where table_name='ACCN_PHYS' and column_name='VK_ACCN_TEST_SEQ_ID';

INDEX_NAME
------------------------------
I_ACCN_PHYS_ACCNTESTSEQID

SQL> analyze index I_ACCN_PHYS_ACCNTESTSEQID validate structure;

Index analyzed.

SQL> select height, blocks, lf_blks, lf_rows, br_blks, br_rows, del_lf_rows from index_stats;

    HEIGHT     BLOCKS    LF_BLKS    LF_ROWS    BR_BLKS    BR_ROWS DEL_LF_ROWS
---------- ---------- ---------- ---------- ---------- ---------- -----------
         3       2176       2100     471150          8       2099        1248

SQL>

NOTICE: the large number of DEL_LF_ROWS

On QA system: REBUILD Index

SQL> alter index I_ACCN_PHYS_ACCNTESTSEQID rebuild;

Index altered.

SQL> analyze index I_ACCN_PHYS_ACCNTESTSEQID validate structure;

Index analyzed.

SQL> select height, blocks, lf_blks, lf_rows, br_blks, br_rows, del_lf_rows from index_stats;

    HEIGHT     BLOCKS    LF_BLKS    LF_ROWS    BR_BLKS    BR_ROWS DEL_LF_ROWS
---------- ---------- ---------- ---------- ---------- ---------- -----------
         3        832        805     444305          3        804           0

SQL> SELECT pk_seq_id FROM accn_phys A WHERE A.vk_accn_test_seq_id=628110;

 PK_SEQ_ID
----------
     95100

Elapsed: 00:00:00.03

Execution Plan
----------------------------------------------------------
Plan hash value: 234933029

---------------------------------------------------------------------------------------------------------
| Id  | Operation                   | Name                      | Rows  | Bytes | Cost (%CPU)| Time     |
---------------------------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT            |                           | 63472 |   495K|   735   (1)| 00:00:09 |
|   1 |  TABLE ACCESS BY INDEX ROWID| ACCN_PHYS                 | 63472 |   495K|   735   (1)| 00:00:09 |
|*  2 |   INDEX RANGE SCAN          | I_ACCN_PHYS_ACCNTESTSEQID | 63472 |       |   120   (2)| 00:00:02 |
---------------------------------------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------

   2 - access("A"."VK_ACCN_TEST_SEQ_ID"=628110)

Statistics
----------------------------------------------------------
          1  recursive calls
          0  db block gets
          5  consistent gets
          2  physical reads
          0  redo size
        528  bytes sent via SQL*Net to client
        519  bytes received via SQL*Net from client
          2  SQL*Net roundtrips to/from client
          0  sorts (memory)
          0  sorts (disk)
          1  rows processed

SQL>
Advertisements

5 Comments »

  1. Given 471,150 LF_ROWS, I wouldn’t say that 1248 DEL_LF_ROWS is particularly large. About a quarter of a percent.
    But with around 15 bytes needed for each index leaf entry (the sequence number and rowid), you should be able to fit 500 in a block. 500,000 would fit in about a thousand blocks. So the 2100 LF_BLKS before the rebuild is more notable. The 800 after the rebuild is considerably more compact.

    By removing half the leaf blocks, the cost of an index range scan is halved.
    Interestingly, the 735 cost for the query after the rebuild is higher than the 704 cost for the full scan before the rebuild.

    Personally I would have gone for getting histograms on the index stats rather than a rebuild. The optimizer estimates are still way off based on the column skew.

    Comment by Gary — May 13, 2011 @ 4:25 am | Reply

  2. Hi

    It’s also somewhat interesting to me that before the rebuild the index had got the 471150 rows that Gary refers to above, but afterwards it has lost 26845 rows entirely ( 5% of the total rows in the index.) If this is a production system perhaps something else is going on at the same time? If the distribution of values shown is typical (most rows have 0 and only a few have a non zero value for the test_id column then it would be a classic candidate for histograms.

    Niall

    Comment by Niall Litchfield — May 13, 2011 @ 5:33 am | Reply

  3. Just a quick further point.

    The numbers for leaf rows and costs are exactly consistent with the table having the 444305 rows after the validate structure divided amongst only 7 distinct values, rather than the 9 before. That’s a fairly significant change with data as skewed as this (and explains the cost difference Gary observed)

    Comment by Niall Litchfield — May 13, 2011 @ 5:39 am | Reply

  4. Histogram is one option, function-based index which doesn’t index 0 another. Dynamic sampling yet another.

    Comment by Dom Brooks — May 13, 2011 @ 8:05 am | Reply

  5. Thanks everyone! I remembered trying histogram, but must have remembered wrong. Going back adding histogram with 9 buckets, the SQL does use index.

    Comment by mdinh — May 14, 2011 @ 12:18 am | Reply


RSS feed for comments on this post. TrackBack URI

Leave a Reply

Fill in your details below or click an icon to log in:

WordPress.com Logo

You are commenting using your WordPress.com account. Log Out / Change )

Twitter picture

You are commenting using your Twitter account. Log Out / Change )

Facebook photo

You are commenting using your Facebook account. Log Out / Change )

Google+ photo

You are commenting using your Google+ account. Log Out / Change )

Connecting to %s

Blog at WordPress.com.

%d bloggers like this: