PL language interpreter

Hello! My name is Alexey Kalinin. I'm Oracle PL/SQL-developer This is my first article. At work, 90% of the time is spent writing server code in the language PL/SQL. Good for everyone PL/SQL – simple syntax, implicit type conversion, and support for SQL instructions inside the code for processing data in the database. The only downside is that PL/SQL– the code can only be executed in the Oracle database. I would like there to be a language with a syntax similar to PL/SQLbut it was executed on the client (it was not tied to any database). Would be able to connect to various databases to perform operations with data.

What could this client PL/SQL be used for?

  • In order to be able to work with local files: load data from files directly into the database, upload data from the database to local files.

  • Open in one PL/SQL-code several connections to different databases (Oracle, PostgreSQL, MsSQL, MySQL, SQLLite, etc.) for reconciliation, data analysis, data reloading between different databases, etc.

The topic of parsing and interpreting program code has been of interest to me for a long time. And I decided to try to implement an interpreter for a language similar in syntax to PL/SQL. I chose Delphi as the interpreter implementation language.

Description of the invented programming language

The syntax of this language is very similar to Oracle. PL/SQL. Something was brought into him from PLpg/SQL for Postgres.

I will only describe what was implemented from the basic PL/SQL syntax, and what was changed and added.

1. Description of data types (what data types are supported).

  • VARCHAR2 – This data type stores a string. If the length of the string is specified, then the length will be checked in the code; if the length is not specified, then you can write data with a length of no more than 100000000.

  • INTEGER – This data type is used to work with fields that will only contain integer numeric data.

  • NUMBER – This data type is used to work with fields that will only contain fixed or floating point numeric data.

  • DATE – This data type stores values ​​in date format (dd.mm.yyyy hh24:mi:ss).

  • JSON – This data type is a format for storing and transmitting data in the form of objects, arrays and other values ​​that can be easily converted into a text string.

  • RECORD – This data type is similar to a string variable, but it does not have a predefined structure. They acquire their actual structure from the string that is assigned to them by the SELECT or FOR commands. The structure of a variable of type record can change each time a value is assigned.

  • BOOLEAN – This data type stores Boolean values ​​(true/false).

2. Cursors.

A cursor is a means of retrieving data from a database. Cursors contain definitions of columns and objects (tables, views, etc.) from which data will be retrieved, as well as a set of criteria that determine which rows should be selected.

The following methods are available to the user to perform a database query:

  • Implicit cursors – a simple SELECT … INTO statement retrieves one row of data directly into local program variables. This is a convenient (and often most efficient) way to access data, but may require recoding the SELECT statement (or similar statements) in multiple places in the program.

  • Explicit cursors – you can explicitly declare courses in the announcements section. In this case, the cursor can be opened and data retrieved in one or more programs, and the control options will be greater than with implicit cursors.

3. Operators FOR, WHILE, OPEN, IF.

In this implementation, you can use the following operators:

  • Loop operators: FOR, LOOP, WHILE.

  • Conditional operator: IF THEN ELSE.

  • Operator: OPEN FETCH.

4. Built-in procedures/functions have also been added.

  • dbms_output.put_line(text) – output a text message (as in Oracle).

  • upper(text) – converting text to upper case.

  • lower(text) – convert text to lower case.

  • rtrim(text) – cut off spaces on the right.

  • ltrim(text) – cut off spaces on the left.

  • trim(text) – cut off spaces on all sides.

  • replace(text, text_to_replace, replacement_text) – replacing text according to the sample.

  • substr(text, start_position, length) – extract a substring from a string.

  • instr(text, substring) – returns the nth occurrence of a substring in a string.

  • length(text) – returns the length of the string.

  • chr(number_code) – returns a character that is based on a numeric code.

  • host(command, work_dir) – execute the cmd command and return the result (work_dir – working directory (for example: c:\)).

Additional packages have also been implemented for working with local files, for sending http/https requests, and for working with json objects.

5. TEXT_IO package.

There is a built-in package for working with local files – TEXT_IO. (a similar package was once available in Oracle Forms)

It contains the following methods:

  • text_io.fopen( 'full path to file', 'key', encoding) – Opening a file. The key contains the following values ​​- 'r' – read-only file, 'w' – write-only to the file. Encoding – UTF8 or ANSI.

  • text_io.count_lines(file1) – Read the number of lines from a file.

  • text_io.get_line(file1, index) – Read a line from a file by index.

  • text_io.get_file_content(file1) – Read the entire contents of the file.

  • text_io.put_line(file1, 'line') – Write a string to a file.

  • text_io.fclose(file1) – Closing the file.

6. HTTP_IO package.

(Something similar also exists in Oracle)

To access data on the Internet via a protocol HTTP/HTTPS there is a built-in package – HTTP_IO. I use it to debug http/https requests to rest servers. It contains the following methods:

  • http_io.begin_request(url, method) – variable initialization.

  • http_io.set_header(http_req, 'key', 'value') – set title

  • http_io.set_resp_encoding(http_req, 'UTF8') – set the response encoding.

  • http_io.write_text(http_req, 'body of the message being sent') – set the body of the message to be sent.

  • http_io.get_response(http_req) – let's go to the URL and get the answer.

  • status_code – response code

  • response_text – response text

7. JSON_IO package.

There is a built-in package for working with JSON objects – JSON_IO. It contains the following methods:

  • json_io.add(js_obj1, 'key', 'value') – add a key-value pair to the json object.

  • json_io.format(js_obj1) – output json code in formatted form.

  • json_io.record_to_json(rec) – convert the value of a record type variable into json format.

  • json_io.count_keys(js_obj1) – display the number of key-value pairs.

  • json_io.count_rows(js_obj1) – display the number of values ​​inside the json array.

  • json_io.get_key_name(js_obj1, index) – display the name of the key by index.

  • json_io.get_key_value(js_obj1, index) – display the key value by index.

  • json_io.get_row(js_obj1, index) – display the json code of an array element by index.

8. RECORD_IO package.

There is a built-in package for working with record type variables – RECORD_IO. It contains the following methods:

  • record_io.add_field(rec1, [‘название_поля’, ‘тип_поля’]) – add a new field to a record type variable (new fields…)

  • record_io.print_record(rec1) – displaying the contents of a record type variable (rec1).

  • record_io.json_to_record(obj1, rec1) – convert the value of a json type variable (obj1) into a record type variable (rec1).

Since the interpreted code is executed on the client, it does not know in which database to execute SQL– instructions.

To achieve this, the following prerequisite has been added:

Inside the description any SQL-instructions (cursor, select/insert/update/delete-expressions) the label must be specified /*##db=dbname##*/which indicates the name of the database for which this construction will be executed. Descriptions of the parameters for connecting to the database are sent to the interpreter in advance.

Description of how the interpreter works

The implementation of the interpreter was divided into three stages:

At the first stage, I implemented a text parser for the executable program. The entire test was divided into tokens and placed in an array. There are a lot of examples of parser code on the Internet that divide the code into tokens of different types.

At the second stage, using the parsed tokens, I built an AST (abstract syntax tree) tree.

At the third stage, the interpreter itself is implemented, which traverses this tree and executes it. If the AST tree is built in sufficient detail, then the interpretation mechanism itself turns out to be simple.

Let's look at an example:

Interpretable (executable) code

declare 
  ind integer; 
  header_id_ number(10);
   
  cursor cur_main_1(arg_ varchar2) is 
      select * 
          from /*##db=db_name##*/ 
              tbl_name d 
          where d.status = 0 and 
              d.billhead_id = header_id_ and 
              (arg_ is null or d.name like '%'||arg_||'%') 
          order by d.name;
           
  cnt integer; 
  patt_name varchar2(255); 
begin 
  header_id_ := 911988; 
  patt_name := 'L-тироксин 150'; 
  dbms_output.put_line('---Test 1-----');
   
  -- Откроем курсор и внутри цикла for выполним курсор 
  for rec in cur_main_1(patt_name) loop 
      select count(*) 
          into cnt 
      from /*##db=db_name##*/ 
           tbl_name_2 mr 
      where mr.name = rec.name; 
      dbms_output.put_line(' Name="||rec.name||", cnt="||cnt); 
      dbms_output.put_line("rec.id='||rec.id); 
  end loop; 
end;

We divide the text into tokens (text parsing) and build an abstract syntax tree (AST). The more accurately the AST tree is written, the easier the further interpretation (execution) of the code is.

Constructed AST tree.

<Root xmlns="Root">
  <DECLARE value="DECLARE" line="1">
    <VARIABLE line="2" name="ind" type="integer" default_value="" type_size="" select="" is_null="Y"/>
    <VARIABLE line="3" name="header_id_" type="number" default_value="" type_size="10" select="" is_null="Y"/>
    <CURSOR line="1" name="cur_main_1" select="select * &#xA;          from /*##db=db_name##*/ &#xA;              tbl_name d &#xA;          where d.status = 0 and &#xA;              d.billhead_id = header_id_ and &#xA;              (arg_ is null or d.name like '%'||arg_||'%') &#xA;          order by d.name" conn_name="db_name">
      <ARGUMENTS>
        <ARGUMENT name="arg_" type="varchar2"/>
      </ARGUMENTS>
    </CURSOR>
    <VARIABLE line="14" name="cnt" type="integer" default_value="" type_size="" select="" is_null="Y"/>
    <VARIABLE line="15" name="patt_name" type="varchar2" default_value="" type_size="255" select="" is_null="Y"/>
  </DECLARE>
  <BEGIN value="BEGIN" line="16" exception="false">
    <STATEMENT line="17" tp="assignment">
      <IDENTIFIER val="header_id_"/>
      <ASSIGNMENT>
        <BLOCK_VAL>
          <CSTI_INTEGER val="911988"/>
        </BLOCK_VAL>
      </ASSIGNMENT>
    </STATEMENT>
    <STATEMENT line="18" tp="assignment">
      <IDENTIFIER val="patt_name"/>
      <ASSIGNMENT>
        <BLOCK_VAL>
          <CSTI_STRING val="L-тироксин 150"/>
        </BLOCK_VAL>
      </ASSIGNMENT>
    </STATEMENT>
    <STATEMENT line="19" tp="run">
      <IDENTIFIER val="dbms_output.put_line">
        <ARGUMENTS>
          <ARGUMENT>
            <CSTI_STRING val="---Test 1-----"/>
          </ARGUMENT>
        </ARGUMENTS>
      </IDENTIFIER>
    </STATEMENT>
    <FOR line="22">
      <PARAMS>
        <FOR_VAR var_name="rec"/>
        <FOR_CURSOR cur_name="cur_main_1">
          <ARGUMENTS>
            <ARGUMENT>
              <CSTI_IDENTIFIER val="patt_name"/>
            </ARGUMENT>
          </ARGUMENTS>
        </FOR_CURSOR>
      </PARAMS>
      <SELECT line="23" name="select_67484" select="select count(*) &#xA;          into cnt &#xA;      from /*##db=db_name##*/ &#xA;           tbl_name_2 mr &#xA;      where mr.name = rec.name" conn_name="db_name"/>
      <STATEMENT line="28" tp="run">
        <IDENTIFIER val="dbms_output.put_line">
          <ARGUMENTS>
            <ARGUMENT>
              <CSTI_STRING val=" Name="/>
              <CSTI_CONCATENATION val="||"/>
              <CSTI_IDENTIFIER val="rec.name"/>
              <CSTI_CONCATENATION val="||"/>
              <CSTI_STRING val=", cnt="/>
              <CSTI_CONCATENATION val="||"/>
              <CSTI_IDENTIFIER val="cnt"/>
            </ARGUMENT>
          </ARGUMENTS>
        </IDENTIFIER>
      </STATEMENT>
      <STATEMENT line="29" tp="run">
        <IDENTIFIER val="dbms_output.put_line">
          <ARGUMENTS>
            <ARGUMENT>
              <CSTI_STRING val="rec.id="/>
              <CSTI_CONCATENATION val="||"/>
              <CSTI_IDENTIFIER val="rec.id"/>
            </ARGUMENT>
          </ARGUMENTS>
        </IDENTIFIER>
      </STATEMENT>
    </FOR>
  </BEGIN>
</Root>

For some vertices within a section BEGIN line number is indicated. This is necessary so that the line number can be displayed if an error occurs while executing the program code.

I do not comment on the text of the XML tree, since it is easy to correlate with the executable code of the example.

At the next stage, the program code is interpreted (executed). The interpreter traverses the XML tree and executes the specified commands in it. Type variable values varchar2, integer, number, date And Boolean are stored as strings. For a variable of type json immediately created json-object. For a variable of type record When adding new fields, variables are created with the name – record_name.field_name.

When the interpreter reaches execution SQL-instructions (opening cursorsexecution select/insert/update/delete-expressions) then a connection to the specified database is created (if it has not yet been created). The name of the database must be indicated in the label /*##db=dbname##*/which must be present in every SQL statement. A general list of connection parameters to various databases can be transferred to the interpreter as a separate file.

After connecting to the database for SQL-instructions create a DataSource (DataSet) necessary to obtain data (select-expression) or execution insert/update/delete-expressions.

Some spelling errors (integrity of statements, absence of a semicolon after statements, etc.) in the executed code are caught at the construction stage AST-tree. Another part of the errors can be caught at the time of code interpretation (errors in the names of variables and internal built-in packages, etc.). In this case, an error is thrown indicating the line number.

Here are some code examples that are executed by the created interpreter.

Writing data to a file:

declare 
   cursor c is 
      select /*##db=db_name##*/ * from table; 
   res varchar2; 
   file1 text_io.file_type; 
   ind integer; 
begin 
   -- Все данные из курсора запишем в файл в виде JSON-массива 
   -- Откроем файл для записи 
   file1 := text_io.fopen('my_json_file.txt','w','UTF8'); 
   -- Запишем строку в файл 
   text_io.put_line( file1, '['); 
   ind := 0; 
   -- Откроем курсор 
   for rec in c loop 
      ind := ind + 1; 
      -- Преобразуем record в json 
      res := json_io.record_to_json(rec); 
      --Запишем данные в файл 
      if ind = 1 then 
       text_io.put_line(file1, res); 
      else 
       text_io.put_line(file1, ','||chr(10)||res); 
      end if; 
   end loop; 
   -- Запишем строку в файл 
   text_io.put_line(file1, ']'); 
   -- Закроем файл 
   text_io.fclose(file1); 
   dbms_output.put_line('ok'); 
end;

Sending a POST request:

declare 
    http_req http_io.req; 
    http_resp http_io.resp; 
begin 
    http_req := http_io.begin_request('http://localhost:1234', 'POST'); 
    http_io.set_header(http_req, 'Content-Type', 'application/json'); 
    http_io.set_resp_encoding(http_req, 'UTF8'); 
    http_io.write_text(http_req, '{"command":"FIND_INCOME_DOC", 
                                 "body":{"filter":{"start_date":"2024-06-26T00:00:00","end_date":"2024-06-26T23:59:59"},"start_from":0,"count":100} 
                                 }');     
    http_resp := http_io.get_response(http_req); 
    dbms_output.put_line('HTTP response status code: ' || http_resp.status_code); 
    dbms_output.put_line('HTTP response text: ' || http_resp.response_text); 
end;

Example of working with JSON

declare 
    obj json := '{"main_obj": {"1": 123, "2": 456}}'; 
    new_obj json; 
    arr json := '[]'; 
begin 
    dbms_output.put_line('1. obj = '||obj); 
    -- Добавим две пары ключ-значение 
    obj := json_io.add(obj, 'id', '1000', 'name', 'aaaaaa'); 
    dbms_output.put_line('2. obj = '||obj); 
    --Добавим новый подобъект 
    new_obj := '{"dddd": 890}'; 
    obj := json_io.add(obj, 'new_obj', new_obj); 
    dbms_output.put_line('3. obj = '||obj); 
    ------------ 
    --В массив arr добавим новый элемент 
    arr := json_io.add(arr, '', new_obj); 
    dbms_output.put_line('4. arr="||arr); 
    ------------ 
    --В массив arr добавим новый элемент 
    arr := json_io.add(arr, "', '{"aaaa": 111}'); 
    dbms_output.put_line('5. arr="||arr); 
    --Добавим массив в объект - obj 
    obj := json_io.add(obj, "rows', arr); 
    dbms_output.put_line('5. Результат: obj = '||json_io.format(obj)); 
end;

Opening cursors in different connections (Oracle and PostgreSQL) and running functions in them separately:

declare 
    -- 
    cursor cur_main_pg is 
        select * 
         from /*##db=db_PG##*/ 
             table_pg d 
         where d.billhead_id = 123456; 
    -- 
    cursor cur_main_ora is 
        select * 
         from /*##db=db_ORACLE##*/ 
             table_oracle d 
         where d.billhead_id = 123456; 
    -- 
    f_name varchar2; 
    num_val varchar2; 
begin 
    -- Открытие курсоров в разных соединениях и отдельный запуск функций в них 
    -- 
    dbms_output.put_line('--Start--PG'); 
    -- 
    for rec in cur_main_pg loop 
        -- Выполним функцию в БД и вернем ее результат 
        f_name := apt.n_get_reg_name/*##db=db_PG##*/(rec.registry_id); 
        -- Выполним функцию в БД и вернем ее результат 
        num_val := apt.apt_contracts__clear_numeric_val/*##db=db_PG##*/(' 123.5676'); 
        -- 
        dbms_output.put_line(' num_val="||num_val); 
        dbms_output.put_line("Name="||f_name||", meas_name="||apt.n_get_measure_name/*##db=db_PG##*/(rec.input_measure_id)); 
    end loop; 
    -- 
    dbms_output.put_line(" '); 
    dbms_output.put_line('--Start--Oracle'); 
    -- 
    for rec in cur_main_ora loop 
        -- Выполним функцию в БД и вернем ее результат 
        f_name := apt.n_get_reg_name/*##db=db_ORACLE##*/(rec.registry_id); 
        -- 
        dbms_output.put_line('Name="||f_name||", meas_name="||apt.n_get_measure_name/*##db=db_ORACLE##*/(rec.input_measure_id)); 
    end loop; 
end;

In conclusion, I would like to add that the implemented interpreter makes it possible to simplify operations for loading/unloading data from local files, comparing data obtained from different databases (Oracle, PostgreSQL, MsSQL, MySQL, SQLLite, etc.), direct import/export of data between different databases, debugging http/https requests, etc. Due to the fact that the invented language supports built-in SQL statements, less code is needed to implement all this than in other programming languages, such as Java, JavaScript, C#, Delphi, etc.

The interpreter can be designed as a separate console application (similar to python.exe, node.exe, etc.). In this case, the executable code must be saved as a file and will be passed as a separate parameter on the command line.

Similar Posts

Leave a Reply

Your email address will not be published. Required fields are marked *