Who will win: man – the crown of creation or backslash?

Automatic conversion tools are often responsible for the bulk of the migration of information systems from Oracle to PostgreSQL. But among the gigantic amount of code covered by such solutions, there are also exceptional stories that you have to improvise with. In such cases, the first step, of course, is to localize the problem, establish its cause and, in order to find the right solution, think about whether there are any similar real or fictional situations that could have a similar nature. After that, you usually have to refactor the original Oracle code, modify the conversion processes and grammar, or implement the unparalleled Oracle functionality in PostgreSQL. Once we were challenged by a seemingly primitive request with an error, which required an entire investigation to solve.

Screensaver

The story begins rather trivially – the problem arises that no data is printed in the report at all. We reproduce the problematic test case, we get the following error:

ERROR:  syntax error at or near ":"
LINE X:  WHERE strpos((concat(concat(';', (:ID)::varchar), ';'...
                                           ^

Level 0

Let’s take a closer look at the part of the request with an error. Circumflex clearly indicates :ID, but intuition suggests that this is still a consequence, not the cause of the error – for a start it is better to look around. Construction with instrturned into a structure with strpos, occurs on the project quite often, and in itself should not cause problems. FROM concat, perhaps, everything should also be in order – the function is in both considered DBMS, and is used instead of the operator ||because it does an excellent job with NULL-values. It is very likely that the developer, when implementing this particular report, made some kind of mistake when binding variables or in the request itself. But even there, surprisingly, everything turned out to be smooth. Then we will look for the problematic part in the simplest way – we delete one by one the parts of the request and the bind variables, gradually narrowing the circle of “suspects”. After several iterations, the shortened problematic version of the original Oracle query looked something like this:

select '' || t.SOME_FIELD || '' SLASHED_FIELD
  from SOME_TABLE t
 where instr(';' || :ID || ';', ';' || t.ID || ';') > 0

It is converted into the following query for PostgreSQL:

SELECT concat(concat('', t.some_field), '')  "slashed_field"
  FROM some_table t
 WHERE strpos((concat(concat(';', (:ID)::varchar), ';')), (concat(concat(';', (t.id)::varchar), ';'))))::numeric > 0

There were no problems at the level of the reporting form, both requests are valid for the respective DBMS. The error is not repeated when executing a request from popular IDEs or using psqlif you set a valid value for the variable :ID… But why is there a syntax error, and not, for example, a warning about the wrong number / type of bind variables? Through much experimentation, it turns out that a special combination of a backslash string literal, concatenation, and a bind variable is needed to reproduce the error.

ChangeResult
move the expression with the bind variable to different parts selectmistake
rename the bind variablemistake
trying different values ​​of the bind variablemistake
“Collapse” nested calls of the concatenation function into one call with all the arguments at oncemistake
instead of the concatenation function, we try to use the concatenation operatormistake
instead of the concatenation function, we try to use other string functionsthe request works, but the result is not what we want
change literal content to random text, double / quad backslash, escaped control constructsthe request works, but the result is not what we want
change the order of the arguments in the concatenation functionthe request works, but the result is not what we want

After these observations, we get a really minimal problematic example:

select concat('', concat(:ID::varchar, '')) SLASHED_ID

At this stage, intuition suggests that, most likely, the matter is in the wrong escaping or, conversely, its absence somewhere on the lower layers of abstraction, as a result of which an invalid request is sent to the DBMS for execution. Let’s check this by looking at the PostgreSQL logs.

# Логи запроса, прошедшего по оригинальному пути
ERROR:  syntax error at or near ":" at character 20
STATEMENT:  select concat('', :ID::varchar, '') SLASHED_ID;

# Логи того же запроса при вызове из psql с предварительным set
LOG:  statement: select concat('', 12345678::varchar, '') SLASHED_ID;
LOG:  duration: 0.936 ms

When a request is called from an application, the value of the binding variable is not substituted for some reason. It’s time to go down to the next level of abstraction, what if the problem is in the application core?

Level 1

The framework is written in pure PHP, it has its own facade for working with a DBMS. The under-the-hood version for Oracle uses oci8, and for PostgreSQL – quite familiar PDO… A typical scenario for working with a facade class (let’s call it Query) looks simple:

  1. the request text is passed to the constructor, and its preparation is performed there (PDO::prepare());
  2. for each bind variable the method is called Query->Bind()which stores the names, values ​​and types of variables into a private array for substitution just before execution;
  3. function is called Query->Execute()which performs pre-expansion of values ​​with PDOStatement::bindParam()and then runs PDOStatement::execute()
  4. the returned tuples are processed using different methods if necessary Query->Fetch*()

Debag very quickly showed that everything is fine with the first two steps, but on the third, where the substitution occurs, the following situation arises. Function PDOStatement::bindParam() returns a boolean value as a substitution result, in our case it is FALSE… IN documentation and comments there was no information to her on this matter, only stingy Возвращает TRUE в случае успешного завершения или FALSE в случае возникновения ошибки… Established funds PDO to diagnose errors, they also do not want to share information about why the variable could not be substituted. IN PDO::errorInfo() information about the error already known to us:

array(3) {
  [0]=>
  string(5) "42601"
  [1]=>
  int(7)
  [2]=>
  string(136) "ERROR:  syntax error at or near ":"
LINE 1: select concat('', concat(:ID::varchar, '')) SLASHED_ID
                                  ^"
}

IN PDO::debugDumpParams() nothing new too – just the actual request text and the number of parameters:

SQL: [56] select concat('', concat(:ID::varchar, '')) SLASHED_ID
Params:  0

Level 2

In order to with complete confidence exclude the influence of the environment and the framework, we go to the next level – we use our minimal request from pure PDO:

prepare("select concat('\', concat(:ID::varchar, '\')) SLASHED_ID");
    if (!$stmt) {
        print "Statement preparation has failedn";
    } else {
        $value="12345678";
        if ($stmt->bindParam(':ID', $value)) {
            print "Bound :ID with value of {$value}n";
        } else {
            print "Bind attempt for :ID with value of {$value} has failedn";
        }
        if ($stmt->execute()) {
            print "Query successfully executedn";
        } else {
            $info = $stmt->errorInfo();
            print "Query execution has failed, reason: {$info[2]}nDebug dump: ";
            $stmt->debugDumpParams();
            print "n";
        }
    }
} else {
    print "Connection attempt to {$connectionString} has failed, check the CONNECTION_STRING environment variablen";
}

As a result, we learn the only news – the framework, it seems, is not to blame. Otherwise, everything is the same: the value was not substituted and the request ended with a syntax error:

Bind attempt for :ID with value of 12345678 has failed
Query execution has failed, reason: ERROR:  syntax error at or near ":"
LINE 1: select concat('', concat(:ID::varchar, '')) SLASHED_ID
                                  ^
Debug dump: SQL: [56] select concat('', concat(:ID::varchar, '')) SLASHED_ID
Params:  0

IN PDO there are two modes for preparing expressions: native preparation in the driver and emulation in the PDO… The mode depends on the connection attribute named ATTR_EMULATE_PREPARES… IN documentation there is no information about the default value. You can already smell victory: there must be some kind of bug in the emulation mode, due to which there is no binding. Well, let’s try both explicitly and see if anything changes. By indicating FALSE, we get exactly the same result as in the example above, but with TRUE – something new:

Bound :ID with value of 12345678
Query execution has failed, reason: ERROR:  syntax error at or near ":"
LINE 1: select concat('', concat(:ID::varchar, '')) SLASHED_ID
                                  ^
Debug dump: SQL: [56] select concat('', concat(:ID::varchar, '')) SLASHED_ID
Params:  1
Key: Name: [3] :ID
paramno=-1
name=[3] ":ID"
is_param=1
param_type=2

The parameter was allegedly used, but still it was not substituted, and therefore, unfortunately, this did not help the request. Remembering how to bind variables in PDO and try them in turn:

WayResult
PDOStatement :: bindParam ()[42601]: syntax error at or near “:”
PDOStatement :: bindValue ()[42601]: syntax error at or near “:”
PDOStatement :: execute ($ parameters)[HY093]: Invalid parameter number

The result is slightly different, but with the same meaning: all 6 combinations of these methods with different modes ATTR_EMULATE_PREPARES fail. Explicit indication of the size and type of a variable in those methods that allow it also does not affect anything.

Level 3

Trying to execute the request using libpq directly, hoping to get more information about the error if it occurs. It is worth noting here that by itself libpq does not provide facilities for working with named parameters, only with positional ones. This means that if such an opportunity exists in your environment / language, you should thank the author for such a convenient tool.

#include 
#include 
#include 
#include 

static void graceful_failure(PGconn * conn) {
    PQfinish(conn);
    exit(1);
}

int main(int argc, char ** argv) {
    const char * conninfo;
    const char * stmtName = "TEST_STATEMENT";
    PGconn * conn;
    PGresult * res;

    if (argc > 1) {
        conninfo = argv[1];
    } else {
        fprintf(stderr, "Please provide a connection string as the first argument");
    }

    conn = PQconnectdb(conninfo);

    if (PQstatus(conn) != CONNECTION_OK) {
        fprintf(stderr, "Connection to database failed: %s",
            PQerrorMessage(conn));
        graceful_failure(conn);
    }

    res = PQprepare(
        conn,
        stmtName,
        "select concat('\', $1::varchar, '\') SLASHED_ID",
        1,
        NULL
    );

    if (PQresultStatus(res) != PGRES_COMMAND_OK) {
        fprintf(stderr, "Statement preparation has failed: %s", PQerrorMessage(conn));
        PQclear(res);
        graceful_failure(conn);
    }

    const char * paramValues[1];
    int paramLengths[1];

    paramValues[0] = "12345678";
    paramLengths[0] = strlen(paramValues[0]);

    res = PQexecPrepared(conn,
        stmtName,
        1,
        paramValues,
        paramLengths,
        NULL,
        0
    );

    if (PQresultStatus(res) != PGRES_TUPLES_OK) {
        fprintf(stderr, "Query execution has failed: %s", PQerrorMessage(conn));
        PQclear(res);
        graceful_failure(conn);
    } else {
        fprintf(stdout, "Got the following result: %s", PQgetvalue(res, 0, 0));
    }

    PQclear(res);
    PQfinish(conn);

    return 0;
}

We compile, run and realize that there is no error:

$ gcc libpqtest.c -I /usr/include/postgresql -o libpqtest -lpq && ./libpqtest "$CONNECTION_STRING"

Got the following result: 12345678

We can conclude that the problem is still somewhere in PDO

Calling a friend

There is another scripting high-level programming language that is always at hand. Why not check if such a query works in it out of interest? Suddenly the problem is not PDO, and in the version or implementation of the protocol that is provided libpq?

import os
import psycopg2
conn = psycopg2.connect(os.getenv("CONNECTION_STRING"))
cursor = conn.cursor()

cursor.execute("select concat('\', %(ID)s, '\') SLASHED_ID", {"ID": "12345678"})
for row in cursor:
    print(row)

cursor.close()
conn.close()

And here it worked too:

('\12345678\',)

And again level 2

In fact, it was possible to finish playing at the second level, but I realized this shortly before the article was published. To be sure that the problem is PDO, and not somewhere deeper, it was enough to use some alternative way to work with PostgreSQL from PHP, for example, the good old extension pgsql… It does not provide separate functions for binding parameters and only supports positional pseudo-variables:

In response, we will see a simple and clear message Query successfully executed, after which there will be no doubt about the location of the problem.

Final scene

Dealing with such problems is, of course, difficult and interesting, but the client needs a working report. Let's patch it up with a temporary solution: out of many options (for example, the obvious removal of the concatenation from the request to another layer of the application), we will choose the simplest one - we will replace the string literal with a call to a native function that will return the desired character. To do this, we find out the ASCII code of the desired character using the function of the same name, and we obtain the character from the code using the function chr:

select chr(92) || t.SOME_FIELD || chr(92) SLASHED_FIELD
  from SOME_TABLE t
 where instr(';' || :ID || ';', ';' || t.ID || ';') > 0

Hooray, everything works! But that's not all: you need to check if such problematic constructions are found somewhere else, although this is extremely unlikely. And fortunately, a careful search of the project did not reveal more potential candidates for a similar error.

Titles

We can say that friendship has won the fight against backslash. Observations will become a separate task in the backlog, which we will return to one day. Surely, the accumulated information is already enough to report a bug in PHP, but on https://bugs.php.net a lot of bugs in the Open status, and the solution will have to wait for a very long time, given that there are simple workarounds, and the conditions for playback are very rare and specific. Despite this, it is worth reporting there anyway.

To summarize, here are a couple of reminders about things that might have made this article absent:

  • Provide detailed error information corresponding to the set logging level - just FALSE as a result or message in the spirit "Произошла ошибка" by themselves will not help a person who has encountered an error;
  • Participating in open source projects is a great way to help thousands of colleagues and millions of users, as well as improve your skills, broaden your horizons, and make yourself known in the professional community. Now is the right time to start - it's already Preptemberfollowed by Hacktoberfest...

Information for those wishing to join

It is customary to attach information about the environment to such publications; in this case, two options were tried:

  • PHP 7.3.18, libpq-dev 10.12;
  • PHP 7.4.5, libpq-dev 11.7.

Similar Posts

Leave a Reply

Your email address will not be published. Required fields are marked *