Same SQL - different explain plan? - solution



One query - two different execution plans? How it is even possible?
One query was run in an anonymous block, the other directly from SQL Developer.


I conducted the tests on the basis of  Oracle Database 18c Express Edition Release 18.0.0.0.0 - Production.

EMP_PUZZLE table - the copy of the EMPLOYEES table from HR schema. 

Test 1: 
EXPLAIN PLAN FOR 
    SELECT * FROM emp_puzzle 
    WHERE employee_id = :Emp_Id;

SELECT * FROM table(DBMS_XPLAN.DISPLAY);
EXPLAIN PLAN:
Plan hash value: 2996013062
 
---------------------------------------------------------------------------------------------------------
| Id  | Operation                   | Name                      | Rows  | Bytes | Cost (%CPU)| Time     |
---------------------------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT            |                           |     1 |    72 |     1   (0)| 00:00:01 |
|   1 |  TABLE ACCESS BY INDEX ROWID| EMP_PUZZLE                |     1 |    72 |     1   (0)| 00:00:01 |
|*  2 |   INDEX UNIQUE SCAN         | EMPLOYEE_ID_EMP_PUZZLE_PK |     1 |       |     0   (0)| 00:00:01 |
---------------------------------------------------------------------------------------------------------
 
Predicate Information (identified by operation id):
---------------------------------------------------
 
   2 - access("EMPLOYEE_ID"=:EMP_ID)
DECLARE
nEmp_id number;
recEmp_puzzle emp_puzzle%rowtype ;
begin
 
    nEmp_id := 107;
    SELECT * into recEmp_puzzle FROM emp_puzzle where employee_id = nEmp_id;
end;
/
Fragment of the trace:

SQL ID: cpzb79fuy3a0s Plan Hash: 1175338861 SELECT * FROM EMP_PUZZLE WHERE EMPLOYEE_ID = :B1 call count cpu elapsed disk query current rows ------- ------ -------- ---------- ---------- ---------- ---------- ---------- Parse 1 0.00 0.00 0 0 0 0 Execute 1 0.00 0.00 0 0 0 0 Fetch 1 0.00 0.00 0 6 0 1 ------- ------ -------- ---------- ---------- ---------- ---------- ---------- total 3 0.00 0.00 0 6 0 1 Misses in library cache during parse: 0 Optimizer mode: ALL_ROWS Parsing user id: 102 (HR) (recursive depth: 1) Number of plan statistics captured: 1 Rows (1st) Rows (avg) Rows (max) Row Source Operation ---------- ---------- ---------- --------------------------------------------------- 1 1 1 TABLE ACCESS FULL EMP_PUZZLE (cr=6 pr=0 pw=0 time=78 us starts=1 cost=3 size=72 card=1)

* * * SOLUTION* * *

I was led to find the solution to the puzzle by launching and analyzing another version of the query.

 Test 3:

EXPLAIN PLAN FOR 
    SELECT * FROM emp_puzzle 
    WHERE employee_id = '107';

SELECT * FROM table(DBMS_XPLAN.DISPLAY);
Plan hash value: 2996013062
 
---------------------------------------------------------------------------------------------------------
| Id  | Operation                   | Name                      | Rows  | Bytes | Cost (%CPU)| Time     |
---------------------------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT            |                           |     1 |    72 |     1   (0)| 00:00:01 |
|   1 |  TABLE ACCESS BY INDEX ROWID| EMP_PUZZLE                |     1 |    72 |     1   (0)| 00:00:01 |
|*  2 |   INDEX UNIQUE SCAN         | EMPLOYEE_ID_EMP_PUZZLE_PK |     1 |       |     0   (0)| 00:00:01 |
---------------------------------------------------------------------------------------------------------
 
Predicate Information (identified by operation id):
---------------------------------------------------
 
   2 - access("EMPLOYEE_ID"='107')

This result surprised me completely. Why does the query use the index when searching for a VARCHAR2 value, and not for a variable of the NUMBER type? After all, the EMPLOYEE_ID column on which the main key (PRIMARY KEY EMPLOYEE_ID_EMP_PUZZLE_PK) was created is the NUMBER type? Like all other tables on the system? RIght?
Right???
RIGHT??????

 CREATE TABLE HR.EMP_PUZZLE
   (EMPLOYEE_ID VARCHAR2(10 BYTE), 
	FIRST_NAME VARCHAR2(20 BYTE), 
	LAST_NAME VARCHAR2(25 BYTE) NOT NULL ENABLE, 
	EMAIL VARCHAR2(25 BYTE) NOT NULL ENABLE, 
	PHONE_NUMBER VARCHAR2(20 BYTE), 
	HIRE_DATE DATE NOT NULL ENABLE, 
	JOB_ID VARCHAR2(10 BYTE) NOT NULL ENABLE, 
	SALARY NUMBER(8,2), 
	COMMISSION_PCT NUMBER(2,2), 
	MANAGER_ID NUMBER(6,0), 
	DEPARTMENT_ID NUMBER(4,0), 
	 CONSTRAINT EMPLOYEE_ID_EMP_PUZZLE_PK PRIMARY KEY (EMPLOYEE_ID)
	 )
 ;

Even though the EMPLOYEE_ID column was NUMBER throughout the schema, someone in this one table probably unintentionally set the column type to VARCHAR2. PL / SQL procedures used the column as if the column had NUMBER values. This resulted in an implicit conversion and huge performance hassle! Additionally, this bug was very difficult to find. Everyone was convinced that the column was of the NUMBER type (as it was in the whole system), that it was not even verified. What is more the EXPLAIN PLAN showed that the query uses the index (TEST 1). So they searched for something more complex....

TEST 3, however, led me to the hint. It turned out that EXPLAIN PLAN, if it has a variable given instead of a literal, creates a query plan assuming that the variable is of the "correct" type, which in this case is VARCHAR2 (TEST 1). Only after substituting true values ​​for the query (107, '107') TEST 2 and TEST 3 - the real cause of the error was found.
The problem was diagnosed (who did it? :D) and fixed. Processing time, estimated over 50 hours - was drastically reduced and processing was completed within approximately 3 hours of the bug being fixed.

Sometimes big performance issues aren't caused by complex database issues. Sometimes big performance problems are caused by simple, even banal errors: wrong column type, implicit conversion and many more .....


Scripts:
EMP_PUZZLE_DDL.SQL

Komentarze