Strengthening PostgreSQL with PL

To get started, you just need to install the extension in PostgreSQL:

CREATE EXTENSION plpython3u;

Fundamentals of Creating Functions in PL/Python

PL/Python is a procedural language inside PostgreSQL that allows you to write functions in Python. Essentially, it's like a regular SQL function, but instead of SQL queries, you write Python code.

Here's the basic design:

CREATE FUNCTION function_name(argument_list)
RETURNS return_type
AS $$
    # Здесь идет Python-код
$$ LANGUAGE plpythonu;

Everything in between $ And $is pure Python. PostgreSQL automatically passes parameters as global variables to the function, allowing you to work with them like regular Python variables.

Let's create a function that returns the larger of two numbers:

CREATE FUNCTION pymax(a integer, b integer)
RETURNS integer
AS $$
    if a > b:
        return a
    return b
$$ LANGUAGE plpythonu;

This function takes two integers and returns the larger one. As you can see, it is very similar to regular Python code, only it runs inside PostgreSQL.

If no value is returned, PostgreSQL will return NULL.

Now let's dig deeper. In PL/Python, arguments are passed as global variables, which opens up a lot of possibilities. But there is a caveat: if you try to reassign an argument inside a function, you may get an error.

CREATE FUNCTION pystrip(x text)
RETURNS text
AS $$
    x = x.strip()  # ошибка
    return x
$$ LANGUAGE plpythonu;

The error occurs because when a variable is assigned a new value, it becomes local. In this case, Python assumes that the variable has not yet been initialized and issues an error.

We need to explicitly indicate that we are using global variables:

CREATE FUNCTION pystrip(x text)
RETURNS text
AS $$
    global x
    x = x.strip()  # теперь всё работает
    return x
$$ LANGUAGE plpythonu;

Now everything works correctly and Python no longer complains.

It's a small thing, but a mistake like this can ruin your whole day.

PL/Python also has the ability to interact with tables directly through Python. For example, you can write a function that retrieves data from a table, processes it, and returns the result.

Let's create a function that retrieves the user's email by his ID:

CREATE FUNCTION get_user_email(user_id integer)
RETURNS text
AS $$
    query = plpy.execute(f"SELECT email FROM users WHERE id = {user_id}")
    if query:
        return query[0]['email']
    return None
$$ LANGUAGE plpythonu;

Here we use the built-in function plpy.execute() to execute an SQL query directly from Python. It returns the result as a list of dictionaries, which is convenient for further processing.

Integration with external libraries

PL/Python supports all Python libraries installed on the server running PostgreSQL.

Pandas example. Let's say you have sales data, and you need to quickly analyze it: the amount, average and median by month.

CREATE FUNCTION analyze_sales()
RETURNS table(month text, total_sales numeric, average_sales numeric, median_sales numeric)
AS $$
    import pandas as pd

    result = plpy.execute("SELECT month, sales FROM sales_data")
    df = pd.DataFrame(result)

    df_summary = df.groupby('month')['sales'].agg(['sum', 'mean', 'median']).reset_index()

    return df_summary.to_dict(orient="records")
$$ LANGUAGE plpythonu;

Here we are:

  1. Getting data from the table sales_data by using plpy.execute().

  2. Let's convert the result into a DataFrame via pandas for analysis.

  3. We group the data by month and calculate the sum, mean and median.

  4. We return the result back to the database.

Let's move on to data arrays. If you need to work with large arrays or matrices, numpy comes to the rescue.

Let's say you need to calculate the average and standard deviation of a data array:

CREATE FUNCTION calculate_statistics(arr double precision[])
RETURNS table(mean double precision, stddev double precision)
AS $$
    import numpy as np

    np_arr = np.array(arr)

    mean = np.mean(np_arr)
    stddev = np.std(np_arr)

    return [{'mean': mean, 'stddev': stddev}]
$$ LANGUAGE plpythonu;

We transfer the data array into PostgreSQL, convert it to numpy-array, and then perform the necessary calculations.

Exceptions and Error Handling

Handling exceptions in PL/Python is as easy as in regular Python. Here's an example of a function that divides two numbers while handling possible errors:

CREATE FUNCTION safe_divide(a float, b float)
RETURNS float
AS $$
    try:
        return a / b
    except ZeroDivisionError:
        plpy.error("Деление на ноль невозможно!")
    except Exception as e:
        plpy.error(f"Произошла ошибка: {e}")
$$ LANGUAGE plpythonu;

Here we process ZeroDivisionError and any other exceptions, sending errors to the PostgreSQL log via plpy.error().

In production, it is always better to provide for handling all possible exceptions to avoid system crashes.

Triggers and transactions

Triggers in PL/Python work the same way as in regular SQL. They are triggered when data is inserted, updated, or deleted, and can perform certain actions.

Let's create a trigger function that checks the number of products in the order before inserting:

CREATE FUNCTION validate_order_quantity()
RETURNS trigger
AS $$
    if NEW.quantity <= 0:
        raise plpy.Error('Количество товаров должно быть больше нуля!')
    return NEW
$$ LANGUAGE plpythonu;

This trigger checks every insert into the orders table and ensures that the item quantity is greater than zero.

Now let's create the trigger itself:

CREATE TRIGGER check_quantity
BEFORE INSERT OR UPDATE ON orders
FOR EACH ROW
EXEUTE FUNCTION validate_order_quantity();

The trigger fires before the insert or update and calls our function validate_order_quantity().

Transaction management

PL/Python allows you to manage transactions manually. An example where we explicitly control the transaction:

CREATE FUNCTION transaction_test()
RETURNS void
AS $$
    try:
        plpy.execute("BEGIN;")
        plpy.execute("INSERT INTO test_table VALUES (1);")
        plpy.execute("INSERT INTO test_table VALUES (2);")
        plpy.execute("COMMIT;")
    except:
        plpy.execute("ROLLBACK;")
        raise
$$ LANGUAGE plpythonu;

Here we start the transaction with BEGINperform several operations and complete it with the command COMMIT. If something goes wrong, we roll back the transaction using ROLLBACK.


In conclusion, let me remind you about the upcoming open lessons that will be held as part of the “PostgreSQL Advanced” course:

  • October 10: Rules for working with analytical queries. Optimization in OLTP and OLAP. Record

  • October 23: PostgreSQL and DevOps – managing the database via CI/CD and Kubernetes. Record

Similar Posts

Leave a Reply

Your email address will not be published. Required fields are marked *