How many indexes you have - indexes and performance.


Do you know how many indexes are in your system?
You can check easily using a dictionary ALL_INDEXES.
select table_name, count(*)  
from all_indexes where owner = 'HR' 
group by table_name order by count(*);
There are usually a few indexes on the table. Sometimes a dozen or so. One system, however, went on record. The tables had several dozen indexes, 30-40 indexes for every table! The record was 44. On one table. Not very big anyway, something like 40 columns. As it turned out, most of the tables in this system had one-column indexes on every column. Plus several multi-column indexes.

What is more, it was a back-office system - these tables were involved in heavy processing. And indexes are not very good for performance! It is no wonder then that the processing took hours - with relatively few records processed! Removing redundant indexes could bring significant benefits!

Unfortunately, the person responsible for this system briefly remarked on my findings:
- Indexes do not affect performance.

Every index must be updated after DML statement. Which logically takes time. On the other hand - I have never actually tested it myself! I have no conclusive proof!

Well, I have to check it, haven't I?

How much the number of indexes affects the time of inserting records to the table?
For the test, I will use the standard HR schema and Oracle Database 18c Express Edition Release 18.0.0.0.0 - Production.

I created the table EMP_IDX_0 as a copy of the EMPLOYEES table. Then I increased the number of columns (I have to create these indexes on something) as a duplicate of the original columns with suffix _1 and updated those columns with data from the original columns. It is my data source for tests.

Then I created two empty tables as a copy of EMP_IDX_0.
EMP_IDX_4 with 4 indexes
EMP_IDX_25 with 25 indexes

For the purposes of the test, I will insert 107,000 records to every table. I think this amount will illustrate the trend.
I inserted data using  BULK and FORALL.

It took eight times longer to load 107,000 records for a table with 25 indexes!
So I finally have my proof :)

To make it more interesting, the system I mentioned did not use BULK COLLECT for processing. The rows where processed ROW BY ROW. Therefore, I also conducted a test where data was inserter row by row.
Let's look.
Twice slower than the BULK COLLECT method. And just like in the first test - more indexes slowed data loading almost five times.

Sum up:

Regardless of the method, a large number of indexes on the table significantly reduces data loading performance. Indexes play an important role in systems, but too many of them can negatively impact processing performance. Of course, it all depends on the system and process architecture, but it's worth remembering when adding another one.

All scripts below:
create table emp_idx_0 as select * from employees;

alter table emp add      
   (EMPLOYEE_ID_1 NUMBER(6,0), 
    FIRST_NAME_1 VARCHAR2(20 BYTE), 
    LAST_NAME_1 VARCHAR2(25 BYTE) ,
    EMAIL_1 VARCHAR2(25 BYTE), 
    PHONE_NUMBER_1 VARCHAR2(20 BYTE), 
    HIRE_DATE_1 DATE, 
    JOB_ID_1 VARCHAR2(10 BYTE) , 
    SALARY_1 NUMBER(8,2), 
    COMMISSION_PCT_1 NUMBER(2,2), 
    MANAGER_ID_1 NUMBER(6,0), 
    DEPARTMENT_ID_1 NUMBER(4,0), 
    SALARY_NEW_1 NUMBER, 
    CHRIST_BONUS_1 NUMBER);

    update emp set 
    EMPLOYEE_ID_1 = EMPLOYEE_ID,
    FIRST_NAME_1 = FIRST_NAME,
    LAST_NAME_1 = LAST_NAME,
    EMAIL_1 = EMAIL,
    PHONE_NUMBER_1 = PHONE_NUMBER,
    HIRE_DATE_1 = HIRE_DATE,
    JOB_ID_1 = JOB_ID,
    SALARY_1 = SALARY,
    COMMISSION_PCT_1 = COMMISSION_PCT,
    MANAGER_ID_1 = MANAGER_ID,
    DEPARTMENT_ID_1 = DEPARTMENT_ID,
    SALARY_NEW_1 = SALARY_NEW,
    CHRIST_BONUS_1 = CHRIST_BONUS;

EXEC DBMS_STATS.gather_table_stats('HR', 'EMP_IDX_0');
create table emp_idx_4  as select * from emp_idx_0 where 1=2  ;

 CREATE UNIQUE INDEX hr.EMP_emp_idx_4_ID_PK ON hr.emp_idx_4 (EMPLOYEE_ID) ;
 CREATE INDEX hr.emp_idx_4_DEPARTMENT_IX ON hr.emp_idx_4 (DEPARTMENT_ID) ;
 CREATE INDEX hr.emp_idx_4_JOB_IX ON hr.emp_idx_4 (JOB_ID) ;
 CREATE INDEX hr.emp_idx_4_NAME_IX ON hr.emp_idx_4 (LAST_NAME, FIRST_NAME) ;
 
 EXEC DBMS_STATS.gather_table_stats('HR', 'EMP_IDX_4');
create table emp_idx_25  as select * from EMP_IDX_0 where 1=2  ;

CREATE UNIQUE INDEX hr.EMP_emp_idx_25_ID_PK ON hr.emp_idx_25 (EMPLOYEE_ID) ;
CREATE INDEX hr.emp_idx_25_FIRST_NAME  ON hr.emp_idx_25 ('FIRST_NAME') ;
CREATE INDEX hr.emp_idx_25_LAST_NAME  ON hr.emp_idx_25 ('LAST_NAME') ;
CREATE INDEX hr.emp_idx_25_EMAIL  ON hr.emp_idx_25 ('EMAIL') ;
CREATE INDEX hr.emp_idx_25_PHONE_NUMBER  ON hr.emp_idx_25 ('PHONE_NUMBER') ;
CREATE INDEX hr.emp_idx_25_HIRE_DATE  ON hr.emp_idx_25 ('HIRE_DATE') ;
CREATE INDEX hr.emp_idx_25_JOB_ID  ON hr.emp_idx_25 ('JOB_ID') ;
CREATE INDEX hr.emp_idx_25_SALARY  ON hr.emp_idx_25 ('SALARY') ;
CREATE INDEX hr.emp_idx_25_COMMISSION_PCT  ON hr.emp_idx_25 ('COMMISSION_PCT') ;
CREATE INDEX hr.emp_idx_25_MANAGER_ID  ON hr.emp_idx_25 ('MANAGER_ID') ;
CREATE INDEX hr.emp_idx_25_DEPARTMENT_ID  ON hr.emp_idx_25 ('DEPARTMENT_ID') ;
CREATE INDEX hr.emp_idx_25_SALARY_NEW  ON hr.emp_idx_25 ('SALARY_NEW') ;
CREATE INDEX hr.emp_idx_25_CHRIST_BONUS  ON hr.emp_idx_25 ('CHRIST_BONUS') ;
CREATE INDEX hr.emp_idx_25_EMPLOYEE_ID_1  ON hr.emp_idx_25 ('EMPLOYEE_ID_1') ;
CREATE INDEX hr.emp_idx_25_FIRST_NAME_1  ON hr.emp_idx_25 ('FIRST_NAME_1') ;
CREATE INDEX hr.emp_idx_25_LAST_NAME_1  ON hr.emp_idx_25 ('LAST_NAME_1') ;
CREATE INDEX hr.emp_idx_25_EMAIL_1  ON hr.emp_idx_25 ('EMAIL_1') ;
CREATE INDEX hr.emp_idx_25_HIRE_DATE_1  ON hr.emp_idx_25 ('HIRE_DATE_1') ;
CREATE INDEX hr.emp_idx_25_JOB_ID_1  ON hr.emp_idx_25 ('JOB_ID_1') ;
CREATE INDEX hr.emp_idx_25_SALARY_1  ON hr.emp_idx_25 ('SALARY_1') ;
CREATE INDEX hr.emp_idx_25_COMMISSION_PCT_1  ON hr.emp_idx_25 ('COMMISSION_PCT_1') ;
CREATE INDEX hr.emp_idx_25_MANAGER_ID_1  ON hr.emp_idx_25 ('MANAGER_ID_1') ;
CREATE INDEX hr.emp_idx_25_DEPARTMENT_ID_1  ON hr.emp_idx_25 ('DEPARTMENT_ID_1') ;
CREATE INDEX hr.emp_idx_25_SALARY_NEW_1  ON hr.emp_idx_25 ('SALARY_NEW_1') ;
CREATE INDEX hr.emp_idx_25_CHRIST_BONUS_1  ON hr.emp_idx_25 ('CHRIST_BONUS_1') ;
CREATE INDEX hr.emp_idx_25_PHONE_NUMBER_1  ON hr.emp_idx_25 ('PHONE_NUMBER_1') ;
 
EXEC DBMS_STATS.gather_table_stats('HR', 'EMP_IDX_25');
set serveroutput on ;
declare
dStart date;
dSTop date;

bulk_errors       exception;
     
pragma exception_init(bulk_errors, -24381); 

cursor cur is 
with data as (select level lvl from dual connect by level <= 1000)
select rownum,
    FIRST_NAME,
    LAST_NAME,
    EMAIL,
    PHONE_NUMBER,
    HIRE_DATE,
    JOB_ID,
    SALARY,
    COMMISSION_PCT,
    MANAGER_ID,
    DEPARTMENT_ID,
    SALARY_NEW,
    CHRIST_BONUS,
    EMPLOYEE_ID_1,
    FIRST_NAME_1,
    LAST_NAME_1,
    EMAIL_1,
    HIRE_DATE_1,
    JOB_ID_1,
    SALARY_1,
    COMMISSION_PCT_1,
    MANAGER_ID_1,
    DEPARTMENT_ID_1,
    SALARY_NEW_1,
    CHRIST_BONUS_1,
    PHONE_NUMBER_1
    from EMP_IDX_0, data ;

type tab is table of cur%rowtype;
t tab;
begin 
dSTart := sysdate;

   open cur;
   fetch cur bulk collect into t;
   close cur;

   forall i in 1..t.count save exceptions 
      insert into emp_idx_4 values t(i);
        
dStop := sysdate;

dbms_output.put_line(round((dstop - dstart) * 24*60*60,2));

exception when bulk_errors then    
    dbms_output.put_line('ERROR');
    dbms_output.put_line(round((nvl(dstop,sysdate) - dstart) * 24*60*60,2));
end;

/
set serveroutput on ;
declare
dStart date;
dSTop date;

bulk_errors       exception;
     
pragma exception_init(bulk_errors, -24381); 

cursor cur is 
with data as (select level lvl from dual connect by level <= 1000)
select rownum,
    FIRST_NAME,
    LAST_NAME,
    EMAIL,
    PHONE_NUMBER,
    HIRE_DATE,
    JOB_ID,
    SALARY,
    COMMISSION_PCT,
    MANAGER_ID,
    DEPARTMENT_ID,
    SALARY_NEW,
    CHRIST_BONUS,
    EMPLOYEE_ID_1,
    FIRST_NAME_1,
    LAST_NAME_1,
    EMAIL_1,
    HIRE_DATE_1,
    JOB_ID_1,
    SALARY_1,
    COMMISSION_PCT_1,
    MANAGER_ID_1,
    DEPARTMENT_ID_1,
    SALARY_NEW_1,
    CHRIST_BONUS_1,
    PHONE_NUMBER_1
    from EMP_IDX_0, data ;

type tab is table of cur%rowtype;
t tab;
begin 
dSTart := sysdate;


   open cur;
   fetch cur bulk collect into t;
   close cur;

   forall i in 1..t.count save exceptions 
      insert into emp_idx_25 values t(i);
        
dStop := sysdate;

dbms_output.put_line(round((dstop - dstart) * 24*60*60,2));

exception when bulk_errors then    
    dbms_output.put_line('ERROR');
    dbms_output.put_line(round((nvl(dstop,sysdate) - dstart) * 24*60*60,2));
end;

/
set serveroutput on ;
declare
dStart date;
dSTop date;


begin 
dSTart := sysdate;

FOR Z IN (with data as (select level lvl from dual connect by level <= 1000)
select rownum,
    FIRST_NAME,
    LAST_NAME,
    EMAIL,
    PHONE_NUMBER,
    HIRE_DATE,
    JOB_ID,
    SALARY,
    COMMISSION_PCT,
    MANAGER_ID,
    DEPARTMENT_ID,
    SALARY_NEW,
    CHRIST_BONUS,
    EMPLOYEE_ID_1,
    FIRST_NAME_1,
    LAST_NAME_1,
    EMAIL_1,
    HIRE_DATE_1,
    JOB_ID_1,
    SALARY_1,
    COMMISSION_PCT_1,
    MANAGER_ID_1,
    DEPARTMENT_ID_1,
    SALARY_NEW_1,
    CHRIST_BONUS_1,
    PHONE_NUMBER_1
    from EMP_IDX_0, data )  loop
       
    insert into emp_idx_4 values z;
end loop;

dStop := sysdate;

dbms_output.put_line(round((dstop - dstart) * 24*60*60,2));

end;
/
set serveroutput on ;
declare
dStart date;
dSTop date;


begin 
dSTart := sysdate;

FOR Z IN (with data as (select level lvl from dual connect by level <= 1000)
select rownum,
    FIRST_NAME,
    LAST_NAME,
    EMAIL,
    PHONE_NUMBER,
    HIRE_DATE,
    JOB_ID,
    SALARY,
    COMMISSION_PCT,
    MANAGER_ID,
    DEPARTMENT_ID,
    SALARY_NEW,
    CHRIST_BONUS,
    EMPLOYEE_ID_1,
    FIRST_NAME_1,
    LAST_NAME_1,
    EMAIL_1,
    HIRE_DATE_1,
    JOB_ID_1,
    SALARY_1,
    COMMISSION_PCT_1,
    MANAGER_ID_1,
    DEPARTMENT_ID_1,
    SALARY_NEW_1,
    CHRIST_BONUS_1,
    PHONE_NUMBER_1
    from EMP_IDX_0, data )  loop
    
    insert into emp_idx_25 values z;
end loop;

dStop := sysdate;

dbms_output.put_line(round((dstop - dstart) * 24*60*60,2));

end;
/

Komentarze