How to check if a row is a continuation of the previous one - part 2.


Some time ago, in 2019, I wrote a post with a query that solved a quite common problem - how to recognize in an SQL query whether rows in a table are a continuation of the previous row.

What is it actually about?

For example, let's look at an employee with employee_id = 101 and check his employment history (all examples use the HR scheme).
 SELECT * FROM hr.job_history   
 WHERE  employee_id in (101) ;   

 EMPLOYEE_ID START_DA END_DATE  JOB_ID       DEPARTMENT_ID  
 ----------- -------- --------  ----------  -------------  
     101     97/09/21 01/10/27  AC_ACCOUNT  110  
     101     01/10/28 05/03/15  AC_MGR      110  
 
In 2001 Ms. Kochhar changed her position from AC_ACCOUNT to AC_MGR. Nevertheless, she was still employed. We would like to find the dates of employment of employees from - to, regardless of whether they changed positions or departments in the meantime or not.

In 2019, I proposed a way, on an older version of Oracle, how to solve this problem using analytical functions. The post can be found HERE

However, in Oracle 12c there is something that can significantly simplify this type of analysis - MATCH_RECOGNIZE. I decided to refresh that entry and try to solve this problem using the MATCH_RECOGNIZE function.

Examples based on Oracle Database 18c Express Edition Release 18.0.0.0.0 - Production. I am using the sample HR diagram.

So get to work!

To begin with, I'll add one more record to JOBS_HISTORY. This time Ms. Kochhar will change departments.

 insert into hr.job_history values(101, to_date('20050316', 'yyyymmdd'), sysdate, 'IT_PROG', 60);
 SELECT * FROM hr.job_history 
 where employee_id in (101) ;  
 EMPLOYEE_ID START_DA END_DATE JOB_ID     DEPARTMENT_ID
----------- -------- -------- ---------- -------------
        101 97/09/21 01/10/27 AC_ACCOUNT           110
        101 01/10/28 05/03/15 AC_MGR               110
        101 05/03/16 23/01/16 IT_PROG               60
Now let's see what possibilities  MATCH_RECOGNIZE gives us.
 SELECT employee_id, start_date, end_date,department_id, cls, mno  
  FROM hr.job_history   
   MATCH_RECOGNIZE (  
      PARTITION BY employee_id   
      ORDER BY start_date   
      MEASURES  
        MATCH_NUMBER() AS mno,  
        CLASSIFIER() AS cls  
  ALL ROWS PER MATCH        
  PATTERN (str e+)    
  DEFINE   
   e AS start_date = prev(end_date) +1)  
  order by employee_id, start_date, mno;  
  /  
 
EMPLOYEE_ID START_DA END_DATE DEPARTMENT_ID CLS     MNO
----------- -------- -------- ------------- ----- -----
        101 97/09/21 01/10/27           110 STR       1
        101 01/10/28 05/03/15           110 E         1
        101 05/03/16 23/01/16            60 E         1
        176 06/03/24 06/12/31            80 STR       1
        176 07/01/01 07/12/31            80 E         1
Wow, it works :)

In the query above, we showed only those employees who changed either the position or the department (they have rows with continuation).
But we can also show all rows from the JOBS_HISTORY table, indicating which records have a continuation and which ones don't.

 SELECT employee_id, start_date, end_date,department_id, cls, mno  
  FROM hr.job_history   
   MATCH_RECOGNIZE (  
      PARTITION BY employee_id   
      ORDER BY start_date   
      MEASURES  
        MATCH_NUMBER() AS mno,  
        CLASSIFIER() AS cls  
  ALL ROWS PER MATCH WITH UNMATCHED ROWS   
  PATTERN (str e+)    
  DEFINE   
   e AS start_date = prev(end_date) +1)  
  order by employee_id, start_date, mno;  
  /  
EMPLOYEE_ID START_DA END_DATE DEPARTMENT_ID CLS     MNO
----------- -------- -------- ------------- ----- -----
        101 97/09/21 01/10/27           110 STR       1
        101 01/10/28 05/03/15           110 E         1
        101 05/03/16 23/01/16            60 E         1
        102 01/01/13 06/07/24            60            
        114 06/03/24 07/12/31            50            
        122 07/01/01 07/12/31            50            
        176 06/03/24 06/12/31            80 STR       1
        176 07/01/01 07/12/31            80 E         1
        200 95/09/17 01/06/17            90            
        200 02/07/01 06/12/31            90            
        201 04/02/17 07/12/19            20            

11 rows selected. 
And we can also show results aggregated to one row per employee.

 SELECT employee_id, start_date, end_date,department_id  
  FROM hr.job_history   
   MATCH_RECOGNIZE (  
      PARTITION BY employee_id   
      ORDER BY start_date   
      MEASURES  
        start_date as start_date,  
        end_Date as end_date,   
        department_id as department_id,  
        MATCH_NUMBER() AS mno,  
        CLASSIFIER() AS cls  
  ONE ROW PER MATCH   
  PATTERN (str e+)    
  DEFINE   
   e AS start_date = prev(end_date) +1)  
  order by employee_id, start_date, mno;  
  /  
 EMPLOYEE_ID START_DA END_DATE DEPARTMENT_ID
----------- -------- -------- -------------
        101 05/03/16 23/01/16            60
        176 07/01/01 07/12/31            80  
The MATCH_RECOGNIZE functionality gives amazing possibilities! I admit that the beginning can be difficult. But it's enough to make a few examples - at some point, everything will fall into place in your head into a sensible whole and then you will open up new, huge possibilities for analyzing data sets in an almost unlimited way and without resorting to SQL "tricks"!

How to start?
Take a look at the sources below to get started:


Powodzenia :)

Komentarze