Null - sentenced to full scan?


PL

Null - some try to avoid it, some use it without fear. Null is a permanent element of the database landscape.

Is null causing problems? Sometimes yes. But if treated properly, it will valuable part of the system.

One of the more common problems is the inability to index null. Sometimes we would like to search for records with a null field. Of course, we can do it but we have to wait a bit longer - Oracle will search all table records (full scan).

Let's look at the null on the example of the HR scheme.

In the EMPLOYEES table we have the MANAGER_ID column pointing to the manager ID. The president obviously has no superior - so this column is empty (null) for him.

We start with creating index on the column MANGER_ID  (this index is created by default in the HR schema).

create index EMP_MANAGER_IX on employees( manager_id);

EXPLAIN PLAN FOR select * from employees where manager_id = 148;

select * from table(dbms_xplan.display);

------------------------------------------------------------------------------------------------------
| Id  | Operation                           | Name           | Rows  | Bytes | Cost (%CPU)| Time     |
------------------------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT                    |                |     6 |   456 |     2   (0)| 00:00:01 |
|   1 |  TABLE ACCESS BY INDEX ROWID BATCHED| EMPLOYEES      |     6 |   456 |     2   (0)| 00:00:01 |
|*  2 |   INDEX RANGE SCAN                  | EMP_MANAGER_IX |     6 |       |     1   (0)| 00:00:01 |
------------------------------------------------------------------------------------------------------


The specific value of the MANAGER_ID column was found using the index. What if we want to search for a president who has null in the MANAGER_ID field?

EXPLAIN PLAN FOR select * from employees where manager_id is null;

select * from table(dbms_xplan.display);

-------------------------------------------------------------------------------
| Id  | Operation         | Name      | Rows  | Bytes | Cost (%CPU)| Time     |
-------------------------------------------------------------------------------
|   0 | SELECT STATEMENT  |           |     1 |    76 |     3   (0)| 00:00:01 |
|*  1 |  TABLE ACCESS FULL| EMPLOYEES |     1 |    76 |     3   (0)| 00:00:01 |
-------------------------------------------------------------------------------


Unfortunately - null values ​​are not indexed, so each Null search will end up searching the entire table - full scan.

However, don't tear your clothes and don't remove every null from your system! Just because something is impossible doesn't mean it can't be done! There are, of course, several ways!

Method 1
When creating an index - add a constant as an additional indexed value. This constant will guarantee that all rows in the table will be indexed, including those that are null.

create index mgr_id_1_idx on employees( manager_id, 1);

EXPLAIN PLAN FOR select * from employees where manager_id is null;

select * from table(dbms_xplan.display);

----------------------------------------------------------------------------------------------------
| Id  | Operation                           | Name         | Rows  | Bytes | Cost (%CPU)| Time     |
----------------------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT                    |              |     1 |    76 |     2   (0)| 00:00:01 |
|   1 |  TABLE ACCESS BY INDEX ROWID BATCHED| EMPLOYEES    |     1 |    76 |     2   (0)| 00:00:01 |
|*  2 |   INDEX RANGE SCAN                  | MGR_ID_1_IDX |     1 |       |     1   (0)| 00:00:01 |
----------------------------------------------------------------------------------------------------



Well, we forced the database to use the index when searching for null! No more full scan!

Method 2
You can also use the function index to index null values ​​and simply replace the null with a specific value.

WARNING! When searching, use the value exactly!

create index mgr_id_nvl_idx on employees (nvl2(manager_id, 1, 0)) ;

EXPLAIN PLAN FOR select * from employees where manager_id is null;
select * from table(dbms_xplan.display);

-------------------------------------------------------------------------------
| Id  | Operation         | Name      | Rows  | Bytes | Cost (%CPU)| Time     |
-------------------------------------------------------------------------------
|   0 | SELECT STATEMENT  |           |     1 |    76 |     3   (0)| 00:00:01 |
|*  1 |  TABLE ACCESS FULL| EMPLOYEES |     1 |    76 |     3   (0)| 00:00:01 |
-------------------------------------------------------------------------------

EXPLAIN PLAN FOR select * from employees where (nvl2(manager_id, 1, 0)) = 0;
select * from table(dbms_xplan.display);

------------------------------------------------------------------------------------------------------
| Id  | Operation                           | Name           | Rows  | Bytes | Cost (%CPU)| Time     |
------------------------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT                    |                |     1 |    76 |     2   (0)| 00:00:01 |
|   1 |  TABLE ACCESS BY INDEX ROWID BATCHED| EMPLOYEES      |     1 |    76 |     2   (0)| 00:00:01 |
|*  2 |   INDEX RANGE SCAN                  | MGR_ID_NVL_IDX |     1 |       |     1   (0)| 00:00:01 |
------------------------------------------------------------------------------------------------------
Voila - null defeated:)

Method 3
The third method, although it works, will not be suitable for every system. The method uses a bitmap index. The bitmap index is less fussy than B * Tree and indexes everything as it goes, including nulls. 
create bitmap index mgr_id_btm_idx on employees (manager_id);

EXPLAIN PLAN FOR select * from employees where manager_id is null;

select * from table(dbms_xplan.display);

------------------------------------------------------------------------------------------------------
| Id  | Operation                           | Name           | Rows  | Bytes | Cost (%CPU)| Time     |
------------------------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT                    |                |     1 |    76 |     1   (0)| 00:00:01 |
|   1 |  TABLE ACCESS BY INDEX ROWID BATCHED| EMPLOYEES      |     1 |    76 |     1   (0)| 00:00:01 |
|   2 |   BITMAP CONVERSION TO ROWIDS       |                |       |       |            |          |
|*  3 |    BITMAP INDEX SINGLE VALUE        | MGR_ID_BTM_IDX |       |       |            |          |


Also here we see the use of the index for searching.
Null is considered as  embarrassing, inefficient. But in the skilled hands will behave meekly and will not cause any problems!

Komentarze