PostgreSQL Antipatterns: calculating conditions in SQL

SQL is not C ++, nor JavaScript. Therefore, the calculation of logical expressions is different, and this is not the same thing:

WHERE fncondX() AND fncondY()

= fncondX() && fncondY()

While optimizing your PostgreSQL query execution plan can arbitrarily “rearrange” equivalent conditions, do not calculate any of them for individual records, relate to the condition of the index used … In short, it is easiest to assume that you are in advance can’t control in what order will (and whether at all) be calculated equitable terms.

Therefore, if you still want to manage the priority, you need to structurally make these conditions unequal using conditional expressions and operators.

Data and working with them is the basis our VLSI complex, therefore, it is very important for us that operations on them be performed not only correctly, but also efficiently. Let’s look at specific examples where expression computation errors can be made, and where their efficiency is worth improving.

# 0: RTFM

Starting example from the documentation:

When the calculation order is important, it can be fixed using the construction CASE. For example, such a way to avoid dividing by zero in a sentence WHERE unreliable:

SELECT ... WHERE x > 0 AND y/x > 1.5;

Safe option:

SELECT ... WHERE CASE WHEN x > 0 THEN y/x > 1.5 ELSE false END;

The design used so CASE protects the expression from optimization, so you need to use it only if necessary.

# 1: condition in trigger

BEGIN
  IF cond(NEW.fld) AND EXISTS(SELECT ...) THEN
    ...
  END IF;
  RETURN NEW;
END;

It seems that everything looks good, but … No one promises that invested SELECT will not be executed if the first condition is false. Correct with nested IF:

BEGIN
  IF cond(NEW.fld) THEN
    IF EXISTS(SELECT ...) THEN
      ...
    END IF;
  END IF;
  RETURN NEW;
END;

Now let’s look carefully – the whole body of the trigger function turned out to be “wrapped” in IF. And this means that nothing prevents us from removing this condition from the procedure using WHEN-terms:

BEGIN
  IF EXISTS(SELECT ...) THEN
    ...
  END IF;
  RETURN NEW;
END;
...
CREATE TRIGGER ...
  WHEN cond(NEW.fld);

This approach allows guaranteed saving of server resources under false conditions.

# 2: OR / AND chain

SELECT ... WHERE EXISTS(... A) OR EXISTS(... B)

Otherwise, you can get that both EXISTS will be “true” but both will be executed.

But if we know for sure that one of them is “true” much more often (or “false” – for AND-chains) – is it possible to somehow “increase its priority” so that the second is not performed once again?

It turns out you can – algorithmically, the approach is close to the topic of the PostgreSQL Antipatterns article: a rare record will reach the middle of the JOIN.

Let’s just “put under CASE” both of these conditions:

SELECT ...
WHERE
  CASE
    WHEN EXISTS(... A) THEN TRUE
    WHEN EXISTS(... B) THEN TRUE
  END

In this case, we did not determine ELSE-value, that is, if both conditions are false CASE will return NULLthat is interpreted as FALSE in WHERE-conditions.

This example can be combined in another way – to taste and color:

SELECT ...
WHERE
  CASE
    WHEN NOT EXISTS(... A) THEN EXISTS(... B)
    ELSE TRUE
  END

# 3: how [не] need to write conditions

We spent two days analyzing the reasons for the “strange” trigger of this trigger — let’s see why.

Source:

IF( NEW."Документ_" is null or NEW."Документ_" = (select '"Комплект"'::regclass::oid) or NEW."Документ_" = (select to_regclass('"ДокументПоЗарплате"')::oid)
     AND (   OLD."ДокументНашаОрганизация" <> NEW."ДокументНашаОрганизация"
          OR OLD."Удален" <> NEW."Удален"
          OR OLD."Дата" <> NEW."Дата"
          OR OLD."Время" <> NEW."Время"
          OR OLD."ЛицоСоздал" <> NEW."ЛицоСоздал" ) ) THEN ...

Problem # 1: Inequality Does Not Consider NULL

Imagine that all OLD-fields matter NULL. What will happen?

SELECT NULL <> 1 OR NULL <> 2;
-- NULL

And from the point of view of working out the conditions NULL equivalent to FALSEas mentioned above.

Decision: use the operator IS DISTINCT FROM from ROW-operator comparing whole records at once:

SELECT (NULL, NULL) IS DISTINCT FROM (1, 2);
-- TRUE

Problem number 2: different implementation of the same functionality

Compare:

NEW."Документ_" = (select '"Комплект"'::regclass::oid)
NEW."Документ_" = (select to_regclass('"ДокументПоЗарплате"')::oid)

Why are there extra nesting SELECT? A function to_regclass? And in different ways, why? ..

Fix:

NEW."Документ_" = '"Комплект"'::regclass::oid
NEW."Документ_" = '"ДокументПоЗарплате"'::regclass::oid

Problem # 3: bool operations priority

Format the source:

{... IS NULL} OR
{... Комплект} OR
{... ДокументПоЗарплате} AND
( {... неравенства} )

Oops … In fact, it turned out that in the case of the truth of any of the first two conditions, the whole condition entirely turns into TRUE, without inequalities. And this is not at all what we wanted.

Fix:

(
  {... IS NULL} OR
  {... Комплект} OR
  {... ДокументПоЗарплате}
) AND
( {... неравенства} )

Problem 4 (small): complex OR condition for one field

Actually, problems in No. 3 arose precisely because there were three conditions. But instead of them, you can do one, using the mechanism coalesce ... IN:

coalesce(NEW."Документ_"::text, '') IN ('', '"Комплект"', '"ДокументПоЗарплате"')

So we and NULL “Catch”, and complex OR with brackets do not have to fence.

Total

We fix what we got:

IF (
  coalesce(NEW."Документ_"::text, '') IN ('', '"Комплект"', '"ДокументПоЗарплате"') AND
  (
    OLD."ДокументНашаОрганизация"
  , OLD."Удален"
  , OLD."Дата"
  , OLD."Время"
  , OLD."ЛицоСоздал"
  ) IS DISTINCT FROM (
    NEW."ДокументНашаОрганизация"
  , NEW."Удален"
  , NEW."Дата"
  , NEW."Время"
  , NEW."ЛицоСоздал"
  )
) THEN ...

And given that this trigger function can only be used in UPDATEtrigger due to availability OLD/NEW in the condition of the upper level, then this condition can generally be taken out in WHEN-condition, as was shown in # 1 …

Similar Posts

Leave a Reply

Your email address will not be published. Required fields are marked *