How to compare/range-sort text values?




Comparing numerical data is usually simple and intuitive. It’s well-known—10 is always greater than 1 (although since I’ve been interested in quantum mechanics, I’m not so sure anymore). However, in the world of databases, when it comes to comparing text values, things can get complicated.

Comparing Text Values

Let’s consider how to properly compare text. For example, is the letter "Ą" from the Polish alphabet between "A" and "C"? (For those unfamiliar with the Polish alphabet—spoiler: yes, it is! 😊)

I prepared the demo based on:

Oracle Database 23ai Free Release 23.0.0.0.0 - Develop, Learn, and Run for Free

Without the appropriate settings, comparing texts can lead to unexpected results. For instance, some characters may be sorted in the wrong order, causing database queries to return results that are not as expected.

DECLARE
v1 varchar2(2) := 'Ą';

v2 varchar2(1) := 'A';
v3 varchar2(1) := 'C';

begin

if v1 between v2 and v3   then 
    DBMS_OUTPUT.PUT_LINE('ok');
else 
 DBMS_OUTPUT.PUT_LINE('nok');
end if;

end;
/

NOK

It turns out that with the default settings, Oracle sorts in a slightly different way than we would expect.

Locale Settings

To ensure that the comparison of text values follows local rules, you need to set the appropriate locale parameters. For example, to correctly compare Polish characters, you can use the following SQL commands:

alter session set nls_language = 'POLISH';

NOK

Unfortunately, after running our test, the result is still NOK. Setting the language is therefore not the solution we are looking for. 

Sort parameters

To achieve correct text sorting, you need to set the NLS_SORT and NLS_COMP parameters:

alter session set nls_sort = 'POLISH'; 
alter session set nls_comp = 'LINGUISTIC';

OK - Success! We finally achieve the sorting we were aiming for :)

Another solution is to specify the NLS_SORT settings directly in the code:

DECLARE
v1 varchar2(2) := 'Ą';

v2 varchar2(1) := 'A';
v3 varchar2(1) := 'C';

begin

  if NLSSORT(v1, 'NLS_SORT=POLISH')  between NLSSORT(v2, 'NLS_SORT=POLISH')  and NLSSORT(v3, 'NLS_SORT=POLISH')  then 
    DBMS_OUTPUT.PUT_LINE('ok');
else 
 DBMS_OUTPUT.PUT_LINE('nok');
end if;

end;
/

We can check the current parameter settings using the query below:

select * from nls_session_parameters
where parameter in ('NLS_SORT', 'NLS_COMP', 'NLS_LANGUAGE');

PARAMETER                      VALUE                                                           
--------------------------- -------------------
NLS_LANGUAGE                   POLISH                                                          
NLS_SORT                       POLISH                                                          
NLS_COMP                       LINGUISTIC

Parameters: NLS_COMP: defines the collation behavior

  • binary - binary sort [default]
  • linguistic - linguistic sort
  • ansi - backward compatibility

NLS_SORT: defines the sort order [binary/linguistic]


Komentarze