Loading data - SQL vs FORALL perf tests


The last few posts focused on loading data using SQL with LOG ERRORS.


However, is it worth loading data in this way - will it be faster? How  the LOG ERRORS clause affects the execution time of INSERTS? How does it compare to PL/SQL FORALL?

To satisfy curiosity (mainly mine :) I performed 10 tests. I compared loading different volumes of data with different methods: IAS, IAS + LOG ERRORS, FORALL. In addition, I checked loading all correct records and incorect ones.

I conducted tests on the Oracle Database 18c Express Edition Release 18.0.0.0.0 - Production on a home laptop: D

Test details:
Tests were run on the HR scheme.

The tests consisted of loading the multiplied data from the HR.EMPLOYEES table into the newly created myEmp_<test_number> table, structure identical to HR.EMPLOYEES (except for the employee_id field, where I changed the field type from number(6) to number.

An ERR$_<table name> table was created for each table using the dbms_errlog.create_error_log function.

All newly created tables have statistics calculated.

Error Simulation:
For the purposes of error simulation, the length of the last_name field has been reduced to two characters in appropriate tests.

Detailed description of the tests:

TEST 1: IAS, OK rows.
Loading data using the Insert as select method. All records loaded were correct and were loaded into the target table.

TEST 2: IAS + log errors, OK rows
Loading data using the Insert as select method with the LOG ERRORS clause. All records were correct and were loaded into the target table.

TEST 3: IAS + log errors, NOK rows
Loading data using the Insert as select method with the LOG ERRORS clause. All records loaded were incorrect and were registered in the ERR$_<table name> table.

TEST 4: FORALL + save exeptions, ok rows
Loading data using the  BULK COLLECT  i FORALL with z save exceptions method. All records loaded were correct and were loaded into the target table.

TEST 5: IAS + log errors counting rows (PLSQL variable), OK rows
Loading data using the Insert as select method with the LOG ERRORS clause. Additionally, records were counted using global PL/SQL variables (Loading data - SQL: is there any error? [var].All records were correct and were loaded into the target table.

TEST 6: FORALL + save exeptions, nok rows
Loading data using the  BULK COLLECT  i FORALL with z save exceptions method. All records loaded wer e incorrect and were registered in the ERR$_<table name> table.

TEST 7: IAS+LOG errors , counting rows (triggers), ok rows
Loading data using the Insert as select method with the LOG ERRORS clause. Additionally, records were counted using global PL/SQL variables with triggers (described on Jacek's blog https://www.oraclethoughts.com/sql/insert-log-errors-and-sqlrowcount/). All records were correct and were loaded into the target table.

TEST 8: IAS + log errors counting rows (triggers), NOK rows
Loading data using the Insert as select method with the LOG ERRORS clause. Additionally, records were counted using global PL/SQL variables with triggers (described on Jacek's blog https://www.oraclethoughts.com/sql/insert-log-errors-and-sqlrowcount/). All records loaded wer e incorrect and were registered in the .

TEST 9: IAS + log error, counting rows (plsql variable) , NOK rows
Loading data using the Insert as select method with the LOG ERRORS clause. Additionally, records were counted using global PL/SQL variables (Loading data - SQL: is there any error? [var]. All records loaded wer e incorrect and were registered in the ERR$_<table name> table.

TEST 10: IAS + log error, counting rows (gtt) , NOK rows
Loading data using the Insert as select method with the LOG ERRORS clause. Additionally, records were counted using global Global Temporary Table  (Loading data - SQL: is there any error? [GTT, tab]. All records loaded wer e incorrect and were registered in the ERR$_<table name> table.

Uf, this has gathered a little bit. But if we want to have a clear picture, we need to check everything. To be honest - the test results surprised me. I knew they could surprise me - and surprised me - but from a completely different angle!

Curious?

The first table shows data loading times for tests 1,2,4,5 and 7 - that is, all tests with correct data.

Green - the fastest test
Red - the slowest test
The % column shows how many times slower the test was compared to the fastest test (green),

Of course, the leader is plain IAS, not worrying about any mistakes at all.
Adding the LOG ERRORS clause to IAS - causes a three times slower.
IAS + LOG ERRORS and FORALL -> amazingly close together - but almost 3 times slower than without logging errors.
The rate closes with IAS + LOG ERRORS tests with additional counting of erroneous records - adding such a function almost double the time!

And what will the situation look like if all records are incorrect?

Let's look:


The fastest - significantly - FORALL !!!!
Other methods quite close together, IAS + LOG ERRORS with record count slightly slower than the version without- however, all tests are almost twice as slow as FORALL!

It is also interesting how much the impact of a large number of incorrect records affects performance.


In the case of IAS + LOG ERRORS - a large number (in this case all) of errors increases the processing time more than eight times!


FORALL not much better - a sixfold slowdown.

Let's look at the comparison of IAS + LOG ERRORS and FORALL for the correct records:

IAS is 20% slower. Is that a lot?

Summary:
Which method is better? The one that matches your data. For me, tests gave such results, but will yours be the same? 

However, if we wanted to draw conclusions based on the above tests, one could risk that if we are sure that the data are correct, the common IAS will be by far the fastest. And if errors can occur - then it is worth choosing FORALL, which is not much slower than IAS, but with a large number of errors is more efficient, and naturally we have more control over error processing and logging.

Are you surprised by the test results?
Me very much. I expected that adding error logging would slow down data loading. But I did not think for a second that FORALL would be so close to the IAS results - and even in the extreme case of all erroneous records - even much faster! Until now, I followed the principle - what can you do in SQL - do it in SQL. But now I would change it to - what can you do in SQL - check the times first : D

code:

Komentarze