LATERAL: hide and seek




Schemat: HR
Wersja:Oracle Database 18c Express Edition Release 18.0.0.0.0 - Production

Lateral – a specific join introduced in Oracle 12c.
This join allows combining two tables in a correlated way – meaning that the subquery on the right-hand side of the join is executed for each row of the table on the left-hand side.

Let’s write a query using JOIN LATERAL that returns the department name and the last name of the employee earning the highest salary in that department.
(Yes, I know this query can be written in many different ways without using LATERAL. But still – let’s try this approach.)

Query 1.

select department_name, q.last_name
	from   departments d 
       join lateral (select last_name, email
                     from   employees e 
                     where  e.department_id = d.department_id 
                     order by salary asc 
                     fetch first rows with ties
                     ) q
       on  1=1 
order by 1, 2;

DEPARTMENT_NAME                LAST_NAME                
------------------------------ -------------------------
Accounting                     Gietz                    
Administration                 Whalen                   
Executive                      De Haan                  
Executive                      Kochhar                  
Finance                        Popp                     
Human Resources                Mavris                   
IT                             Lorentz                  
Marketing                      Fay                      
Public Relations               Baer                     
Purchasing                     Colmenares               
Sales                          Kumar                    
Shipping                       Olson      


Thanks to the LATERAL join, for each row from the DEPARTMENTS table we executed a top-N query that finds the employee earning the highest salary.
The highest? But how much is “the highest”?
Let’s add the SALARY column to the query.

Query 2.

select department_name, q.last_name, salary
	from   departments d 
       join lateral (select last_name, email
                     from   employees e 
                     where  e.department_id = d.department_id 
                     order by salary asc 
                     fetch first rows with ties
                     ) q
       on  1=1 
order by 1, 2;

ORA-00904: "SALARY": invalid identifier https://docs.oracle.com/error-help/db/ora-00904/
The identifier or column name entered was invalid
Error at Line: 1 Column: 36

Of course – an error.
In the subquery, we didn’t include the SALARY column, so obviously we cannot display this column in the main query.

Let’s add the column to the subquery:

Query 3.

select department_name, q.last_name, salary
	from   departments d 
       join lateral (select last_name, email, salary
                     from   employees e 
                     where  e.department_id = d.department_id 
                     order by salary asc 
                     fetch first rows with ties
                     ) q
       on  1=1 
order by 1, 2;

DEPARTMENT_NAME                LAST_NAME                     SALARY
------------------------------ ------------------------- ----------
Accounting                     Gietz                           8300
Administration                 Whalen                          4400
Executive                      De Haan                        17000
Executive                      Kochhar                        17000
Finance                        Popp                            6900
Human Resources                Mavris                          6500
IT                             Lorentz                         4200
Marketing                      Fay                             6000
Public Relations               Baer                           10000
Purchasing                     Colmenares                      2500
Sales                          Kumar                           6100
Shipping                       Olson                           2100

Bravo! 🎉

Now let’s change the subquery to something else.

Let’s find the departments and the employees in those departments who earn more than 15,000.

Query 4.

select department_name, q.last_name  
	from   departments d 
       join lateral (select last_name, email, salary
                     from   employees e 
                     where  e.department_id = d.department_id 
                     and salary > 15000
                     order by salary asc 
                     ) q
       on  1=1 
order by 1, 2;

DEPARTMENT_NAME                LAST_NAME                
------------------------------ -------------------------
Executive                      De Haan                  
Executive                      King                     
Executive                      Kochhar                 

Oops, we forgot the SALARY column.

Query 5.

select department_name, q.last_name, salary
	from   departments d 
       join lateral (select last_name, email
                     from   employees e 
                     where  e.department_id = d.department_id 
                     and salary > 15000
                     order by salary asc 
                   ) q
       on  1=1 
order by 1, 2;

DEPARTMENT_NAME                LAST_NAME                     SALARY
------------------------------ ------------------------- ----------
Executive                      De Haan                        17000
Executive                      King                           24000
Executive                      Kochhar                        17000


It worked!
But wait... we didn’t add the SALARY column to the subquery on the EMPLOYEES table. So where did the query get the data for the SALARY column? After all, neither the subquery nor the DEPARTMENTS table contains a SALARY column!!!???


👇

👇

👇

👇

👇

👇

👇

👇

👇

👇

👇

👇

👇

👇

The whole mystery is explained by the optimizer and its ability to transform an SQL query.
We can view the query after transformation in trace 10053.
Let’s take a look at the trace for the first query:

begin

        SYS.DBMS_SQLDIAG.DUMP_TRACE(p_sql_id =>  'c5vgpaj0zrdcp',
                p_child_number => 0, 
                p_component => 'Compiler', 
                p_file_id => 'sql1');
        
end;
/


SELECT "D"."DEPARTMENT_NAME" "DEPARTMENT_NAME",
       "VW_LAT_2D0B8FC8"."LAST_NAME_0" "LAST_NAME" 
FROM "HR"."DEPARTMENTS" "D", 
LATERAL( (SELECT "from$_subquery$_004"."LAST_NAME" "LAST_NAME_0" 
        FROM  
            (SELECT "E"."LAST_NAME" "LAST_NAME",
                    "E"."EMAIL" "EMAIL",
                     "E"."SALARY" "rowlimit_$_0",RANK() OVER 
             ( ORDER BY "E"."SALARY") "rowlimit_$$_rank" 
        FROM "HR"."EMPLOYEES" "E" 
        WHERE "E"."DEPARTMENT_ID"="D"."DEPARTMENT_ID") "from$_subquery$_004"
 WHERE "from$_subquery$_004"."rowlimit_$$_rank"<=1)) "VW_LAT_2D0B8FC8" 
ORDER BY "D"."DEPARTMENT_NAME","VW_LAT_2D0B8FC8"."LAST_NAME_0";

If we analyze the query, we see that to display SALARY in the main query, we must also add this column to the SELECT list in the subquery fetching data from the EMPLOYEES table.

Now let’s take a look at the trace for Query 5:


begin

        SYS.DBMS_SQLDIAG.DUMP_TRACE(p_sql_id =>  'fm2dqgt82jtv8',
                p_child_number => 0, 
                p_component => 'Compiler', 
                p_file_id => 'sql5');
        
end;
/

SELECT "D"."DEPARTMENT_NAME" "DEPARTMENT_NAME",
        "E"."LAST_NAME" "LAST_NAME",
        "E"."SALARY" "SALARY" 
FROM "HR"."DEPARTMENTS" "D","HR"."EMPLOYEES" "E" 
WHERE "E"."DEPARTMENT_ID"="D"."DEPARTMENT_ID" 
        AND "E"."SALARY">15000 
ORDER BY "D"."DEPARTMENT_NAME","E"."LAST_NAME";

Oracle sneakily transformed the query with the LATERAL join into a regular JOIN between the two tables.
With this form, we can indeed display the SALARY column from the EMPLOYEES table, because it is no longer in the subquery but in the JOIN, giving us direct access to all tables involved.





Komentarze