Execute Custom Function For Each Record

floki

Nov 6, 2023

Start with a query that returns a series of numbers: select q.i from generate_series(1, 10) q(i)
The goal is to execute a custom function for each record in this select (this is just an example; query could be anything).
That something will always return a previous value for the field i in the q(i) test table.
That is essentially a custom implementation of the LAG window function, but this serves as a proof of concept to demonstrate that we can do a custom function call for each value in any SELECT.
We will do this in a function that test_rec() that returns table (i, j), where i is a value from generate_series and j is the result from a custom function executed for each result that represents a previous (lag) value.
First, we need a wrap-up that generate-series in a subquery because generate_series returns a number, not a table record, and we need a table record:

select *
from (
    select q.i
    from generate_series(1, 10) q(i)
) as sub;

select *
from (
    select q.i
    from generate_series(1, 10) q(i)
) as sub;

Next, inside our test_rec(), we will create a new function that will do this custom processing. This function will receive entire record and return the result of the previous (lag) calculation.

create function pg_temp._parse(_rec record)
  returns int 
  language plpgsql
  as
  $$
  declare _ret int;
  begin
      -- do the calculation here
      return _ret;
  end;
  $$;

create function pg_temp._parse(_rec record)
  returns int 
  language plpgsql
  as
  $$
  declare _ret int;
  begin
      -- do the calculation here
      return _ret;
  end;
  $$;

This function is created on the pg_temp schema. That means it is visible only to the current connection (session), and it will be dropped when the connection (session) ends. Alternatively, this function could be global at any schema.
Additionally, this function must be plpgsql function because plain sql function can't work with the record type, and we have a parameter which is generic (any) table record _rec record.
Note: record types are great for implementing generic logic that can work with many different records and tables.
To be able to perform this type of calculation with the previous value (lag), we need to remember the previous value.
Functions, unfortunately, can't capture variables, and only way to implement this is to use custom settings for PostgreSQL.
This allows us to have variables and values that are visible only to the current connection (session).
In the first step, we need to reset the initial value:

reset test_rec.prev;

reset test_rec.prev;

Note that the name is scoped to this function test_rec.prev. Unscoped names are usually reserved for the system.
Reset will set this value to empty value.
In the first step of the record process function, we capture the return value from this variable:

_ret = nullif(current_setting('test_rec.prev', true), '')::int;

_ret = nullif(current_setting('test_rec.prev', true), '')::int;

Two things here: 1) settings are always text type and we need to cast them to int with ::int 2) we to check nullif to set to null empty strings.
Next, we can set the previous value and return the result:

perform set_config('test_rec.prev', _rec.i::text, true);
return _ret;

perform set_config('test_rec.prev', _rec.i::text, true);
return _ret;

And now, we can modify the result query to pass the record to this function:

return query
select sub.i, pg_temp._parse(sub) as j 
from (
    select q.i
    from generate_series(1, 10) q(i)
) as sub;

return query
select sub.i, pg_temp._parse(sub) as j 
from (
    select q.i
    from generate_series(1, 10) q(i)
) as sub;

To wrap it all up, this is how it looks like in the end:

create or replace function test_rec()
returns table (
    i int,
    j int
)
language plpgsql
as
$func$
begin

    reset test_rec.prev;

    create or replace function pg_temp._parse(_rec record)
    returns int 
    language plpgsql
    as
    $$
    declare _ret int;
    begin
        _ret = nullif(current_setting('test_rec.prev', true), '')::int;
        perform set_config('test_rec.prev', _rec.i::text, true);
        return _ret;
    end;
    $$;

    return query
    select sub.i, pg_temp._parse(sub) as j 
    from (
        select q.i
        from generate_series(1, 10) q(i)
    ) as sub;
end;
$func$;

create or replace function test_rec()
returns table (
    i int,
    j int
)
language plpgsql
as
$func$
begin

    reset test_rec.prev;

    create or replace function pg_temp._parse(_rec record)
    returns int 
    language plpgsql
    as
    $$
    declare _ret int;
    begin
        _ret = nullif(current_setting('test_rec.prev', true), '')::int;
        perform set_config('test_rec.prev', _rec.i::text, true);
        return _ret;
    end;
    $$;

    return query
    select sub.i, pg_temp._parse(sub) as j 
    from (
        select q.i
        from generate_series(1, 10) q(i)
    ) as sub;
end;
$func$;

This demonstrates couple of interesting concepts:

Usage of record type in plpgsql and how it can be used to process records with custom logic.
Usage of temporary functions scoped to the session (connection).
Usage of configuration values scoped to the session (connection).

The last two concepts assume that there will be no parallel calls on the same connection which is the bad practice anyway (if it is even possible).
And now, we can call this:

select * from test_rec()

select * from test_rec()

`i`	`j`
1	`NULL`
2	1
3	2
4	3
5	4
6	5
7	6
8	7
9	8
10	9

To receive notifications about new posts and updates, consider subscribing to my LinkdIn page:
vb-software linkedin

You will receive notifications about new posts on your LinkedIn feed.

Comments

Comments are not available.