PostgreSQL Paging Benchmarks

This is a performance benchmark project that tests different data paging approaches in PostgreSQL. You can find the source code in the PgPagingBenchmarks repository.

Setup

PostgreSQL 16 instance on my laptop:

select version();
PostgreSQL 16.0 (Ubuntu 16.0-1.pgdg20.04+1) on x86_64-pc-linux-gnu, compiled by gcc (Ubuntu 9.4.0-1ubuntu1~20.04.2) 9.4.0, 64-bit
select version();
PostgreSQL 16.0 (Ubuntu 16.0-1.pgdg20.04+1) on x86_64-pc-linux-gnu, compiled by gcc (Ubuntu 9.4.0-1ubuntu1~20.04.2) 9.4.0, 64-bit

Initial schema (source):

begin;

drop schema if exists example cascade;
create schema example;

create table example.cities (
    city_id int generated always as identity primary key,
    name text not null unique
);

create table example.addresses (
    address_id int generated always as identity primary key,
    street text not null,
    city_id int not null references example.cities deferrable,
    unique (street, city_id)
);

create table example.customers (
    customer_id int generated always as identity primary key,
    name text not null,
    address_id int not null references example.addresses deferrable,
    unique (name, address_id)
);

create table example.customer_addresses (
    customer_id int not null references example.customers deferrable,
    address_id int not null references example.addresses deferrable,
    primary key (customer_id, address_id)
);

alter table example.customers 
add constraint fk_customer_addresses
foreign key (customer_id, address_id) 
references example.customer_addresses deferrable;

end;
begin;

drop schema if exists example cascade;
create schema example;

create table example.cities (
    city_id int generated always as identity primary key,
    name text not null unique
);

create table example.addresses (
    address_id int generated always as identity primary key,
    street text not null,
    city_id int not null references example.cities deferrable,
    unique (street, city_id)
);

create table example.customers (
    customer_id int generated always as identity primary key,
    name text not null,
    address_id int not null references example.addresses deferrable,
    unique (name, address_id)
);

create table example.customer_addresses (
    customer_id int not null references example.customers deferrable,
    address_id int not null references example.addresses deferrable,
    primary key (customer_id, address_id)
);

alter table example.customers 
add constraint fk_customer_addresses
foreign key (customer_id, address_id) 
references example.customer_addresses deferrable;

end;

Tables were seeded with initial data using the Faker Bogus library (source script).

Requirements

The query that was tested is this:

select 
    customers.customer_id,
    customers.name,
    customers.address_id,
    street,
    cities.city_id,
    cities.name
from 
    example.customers
    join example.addresses using (address_id)
    join example.cities using (city_id)
    join example.customer_addresses using (customer_id)
where customers.name ilike $1
group by
    customers.customer_id,
    customers.name,
    customers.address_id,
    street,
    cities.city_id,
    cities.name
order by customers.name
select 
    customers.customer_id,
    customers.name,
    customers.address_id,
    street,
    cities.city_id,
    cities.name
from 
    example.customers
    join example.addresses using (address_id)
    join example.cities using (city_id)
    join example.customer_addresses using (customer_id)
where customers.name ilike $1
group by
    customers.customer_id,
    customers.name,
    customers.address_id,
    street,
    cities.city_id,
    cities.name
order by customers.name
  • The query needs to return customer data along with the default address.
  • The query also needs to return a total count of customer addresses.
  • The query needs to return the last page with a length of 10 rows. In this case, this is the page 625.
  • Query needs to be filtered by customer name pattern. In this case this %john% pattern.
  • Total rows unpaged count is also required.
  • Pages data needs to be sorted by the customer name.

And finally, results need to be serialized into the following structure:

public class City
{
    public int Id { get; set; }
    public string Name { get; set; } = default!;
}

public class Address
{
    public int Id { get; set; }
    public string Street { get; set; } = default!;
    public City City { get; set; } = default!;
}

public class Customer
{
    public int Id { get; set; }
    public string Name { get; set; } = default!;
    public Address Address { get; set; } = default!;
    public long AddressCount { get; set; }
}

public class DataPage
{
    public long Count { get; set; }
    public List<Customer> Customers { get; set; } = default!;
}
public class City
{
    public int Id { get; set; }
    public string Name { get; set; } = default!;
}

public class Address
{
    public int Id { get; set; }
    public string Street { get; set; } = default!;
    public City City { get; set; } = default!;
}

public class Customer
{
    public int Id { get; set; }
    public string Name { get; set; } = default!;
    public Address Address { get; set; } = default!;
    public long AddressCount { get; set; }
}

public class DataPage
{
    public long Count { get; set; }
    public List<Customer> Customers { get; set; } = default!;
}

As we can see, keyset paging is not possible. However, there are a few other possibilities we can test and explore.

Test methods

Method 1

SQL scripts in two steps:

  • The first step: get the total count from the filtered table.
  • The second step: use limit and offset on a query to get the page.
  • Map results by position from the reader fields.
  • Source Code
select count(*)  from example.customers where name ilike $1
select count(*)  from example.customers where name ilike $1
select 
    customers.customer_id,
    customers.name,
    customers.address_id,
    street,
    cities.city_id,
    cities.name,
    count(*) as address_count
from 
    example.customers
    join example.addresses using (address_id)
    join example.cities using (city_id)
    join example.customer_addresses using (customer_id)
where customers.name ilike $1
group by
    customers.customer_id,
    customers.name,
    customers.address_id,
    street,
    cities.city_id,
    cities.name
order by customers.name 
offset $2 limit $3
select 
    customers.customer_id,
    customers.name,
    customers.address_id,
    street,
    cities.city_id,
    cities.name,
    count(*) as address_count
from 
    example.customers
    join example.addresses using (address_id)
    join example.cities using (city_id)
    join example.customer_addresses using (customer_id)
where customers.name ilike $1
group by
    customers.customer_id,
    customers.name,
    customers.address_id,
    street,
    cities.city_id,
    cities.name
order by customers.name 
offset $2 limit $3

Method 2

Single SQL query:

  • The table is filtered in materialized CTE.
  • Joins CTE with the main query.
  • Use limit and offset on a query to get the page.
  • The count is in additional subquery: (select count(*) from cte) as count
  • Map results by position from the reader fields.
  • Source Code
with cte as materialized (
    select customer_id 
    from example.customers 
    where name ilike $1
)
select 
    customers.customer_id,
    customers.name,
    customers.address_id,
    street,
    cities.city_id,
    cities.name,
    count(*) as address_count,
    (select count(*) from cte) as count
from 
    cte
    join example.customers using (customer_id)
    join example.addresses using (address_id)
    join example.cities using (city_id) 
    join example.customer_addresses using (customer_id)
group by
    customers.customer_id,
    customers.name,
    customers.address_id,
    street,
    cities.city_id,
    cities.name
order by customers.name 
offset $2 limit $3
with cte as materialized (
    select customer_id 
    from example.customers 
    where name ilike $1
)
select 
    customers.customer_id,
    customers.name,
    customers.address_id,
    street,
    cities.city_id,
    cities.name,
    count(*) as address_count,
    (select count(*) from cte) as count
from 
    cte
    join example.customers using (customer_id)
    join example.addresses using (address_id)
    join example.cities using (city_id) 
    join example.customer_addresses using (customer_id)
group by
    customers.customer_id,
    customers.name,
    customers.address_id,
    street,
    cities.city_id,
    cities.name
order by customers.name 
offset $2 limit $3

Method 3

Single SQL query:

  • The table is filtered in materialized CTE that includes the row number.
  • Joins CTE with the main query.
  • Filter by the row number to get the page.
  • Order by the customer is in the materialized CTE.
  • The count is in the additional subquery that finds the max row number: (select max(row) from cte) as count
  • Source Code
with cte as materialized (
    select row_number() over() as row, customer_id 
    from example.customers 
    where name ilike $1
    order by customers.name
)
select 
    customers.customer_id,
    customers.name,
    customers.address_id,
    street,
    cities.city_id,
    cities.name,
    count(*) as address_count,
    (select max(row) from cte) as count
from 
    cte
    join example.customers using (customer_id)
    join example.addresses using (address_id)
    join example.cities using (city_id) 
    join example.customer_addresses using (customer_id)
where row > $2 and row <= $2 + $3
group by
    row,
    customers.customer_id,
    customers.name,
    customers.address_id,
    street,
    cities.city_id,
    cities.name
order by row
with cte as materialized (
    select row_number() over() as row, customer_id 
    from example.customers 
    where name ilike $1
    order by customers.name
)
select 
    customers.customer_id,
    customers.name,
    customers.address_id,
    street,
    cities.city_id,
    cities.name,
    count(*) as address_count,
    (select max(row) from cte) as count
from 
    cte
    join example.customers using (customer_id)
    join example.addresses using (address_id)
    join example.cities using (city_id) 
    join example.customer_addresses using (customer_id)
where row > $2 and row <= $2 + $3
group by
    row,
    customers.customer_id,
    customers.name,
    customers.address_id,
    street,
    cities.city_id,
    cities.name
order by row

Method 4

SQL script that creates the TEMP table within a transaction:

  • Create transaction.
  • Create the TEMP table from filtered data.
  • Return count from the TEMP table.
  • Join the TEMP table with the main query and use the offset and limit to page the data.
  • Source Code
begin
begin
create temp table _temp_customers on commit drop as
select customer_id from example.customers where name ilike $1
create temp table _temp_customers on commit drop as
select customer_id from example.customers where name ilike $1
select count(*) from _temp_customers
select count(*) from _temp_customers
select 
    customers.customer_id,
    customers.name,
    customers.address_id,
    street,
    cities.city_id,
    cities.name,
    count(*) as address_count
from 
    _temp_customers
    join example.customers using (customer_id)
    join example.addresses using (address_id)
    join example.cities using (city_id)
    join example.customer_addresses using (customer_id)
group by
    customers.customer_id,
    customers.name,
    customers.address_id,
    street,
    cities.city_id,
    cities.name
order by customers.name 
offset $2 limit $3
select 
    customers.customer_id,
    customers.name,
    customers.address_id,
    street,
    cities.city_id,
    cities.name,
    count(*) as address_count
from 
    _temp_customers
    join example.customers using (customer_id)
    join example.addresses using (address_id)
    join example.cities using (city_id)
    join example.customer_addresses using (customer_id)
group by
    customers.customer_id,
    customers.name,
    customers.address_id,
    street,
    cities.city_id,
    cities.name
order by customers.name 
offset $2 limit $3
end
end

Method 5

SQL script that creates the TEMP table with row number within a transaction:

  • Create transaction.
  • Create the TEMP table with the row number from filtered data and order by customer.
  • Return the max row from the TEMP table.
  • Join the TEMP table with the main query and use the row number to page the data.
  • Order by row number.
  • Source Code
begin
begin
create temp table _temp_customers on commit drop as
select row_number() over() as row, customer_id 
from example.customers 
where name ilike $1
order by customers.name
create temp table _temp_customers on commit drop as
select row_number() over() as row, customer_id 
from example.customers 
where name ilike $1
order by customers.name
select max(row) from _temp_customers
select max(row) from _temp_customers
select 
    customers.customer_id,
    customers.name,
    customers.address_id,
    street,
    cities.city_id,
    cities.name,
    count(*) as address_count
from 
    _temp_customers
    join example.customers using (customer_id)
    join example.addresses using (address_id)
    join example.cities using (city_id)
where row > $2 and row <= $2 + $3
group by
    row,
    customers.customer_id,
    customers.name,
    customers.address_id,
    street,
    cities.city_id,
    cities.name
order by row
select 
    customers.customer_id,
    customers.name,
    customers.address_id,
    street,
    cities.city_id,
    cities.name,
    count(*) as address_count
from 
    _temp_customers
    join example.customers using (customer_id)
    join example.addresses using (address_id)
    join example.cities using (city_id)
where row > $2 and row <= $2 + $3
group by
    row,
    customers.customer_id,
    customers.name,
    customers.address_id,
    street,
    cities.city_id,
    cities.name
order by row
end
end

Method 6

SQL script that creates the indexed TEMP table with row number within a transaction:

  • Create transaction.
  • Create the TEMP table with the row number from filtered data and order by customer.
  • Create the BTREE index on the TEMP table.
  • Return the max row from the TEMP table.
  • Join the TEMP table with the main query and use the row number to page the data.
  • Order by row number.
  • Source Code
begin
begin
create temp table _temp_customers on commit drop as
select row_number() over() as row, customer_id 
from example.customers 
where name ilike $1
order by customers.name
create temp table _temp_customers on commit drop as
select row_number() over() as row, customer_id 
from example.customers 
where name ilike $1
order by customers.name
create index on _temp_customers using btree (row)
create index on _temp_customers using btree (row)
select max(row) from _temp_customers
select max(row) from _temp_customers
select 
    customers.customer_id,
    customers.name,
    customers.address_id,
    street,
    cities.city_id,
    cities.name,
    count(*) as address_count
from 
    _temp_customers
    join example.customers using (customer_id)
    join example.addresses using (address_id)
    join example.cities using (city_id)
where row > $2 and row <= $2 + $3
group by
    row,
    customers.customer_id,
    customers.name,
    customers.address_id,
    street,
    cities.city_id,
    cities.name
order by row
select 
    customers.customer_id,
    customers.name,
    customers.address_id,
    street,
    cities.city_id,
    cities.name,
    count(*) as address_count
from 
    _temp_customers
    join example.customers using (customer_id)
    join example.addresses using (address_id)
    join example.cities using (city_id)
where row > $2 and row <= $2 + $3
group by
    row,
    customers.customer_id,
    customers.name,
    customers.address_id,
    street,
    cities.city_id,
    cities.name
order by row
end
end

Method 7

PLPGSQL function that returns JSON:

  • Create the TEMP table from filtered data.
  • Get the count from the inserted diagnostics.
  • Join the TEMP table with the main query and use thelimt and offset to page the data.
  • Build a JSON response and deserialize it on the client.
  • Source Code
create or replace function example.method7(
    _search varchar, 
    _skip integer, 
    _take integer
) 
returns json
language plpgsql 
as $$
declare
    _count bigint;
begin
    create temp table _temp_customers on commit drop as
    select customer_id 
    from example.customers 
    where name ilike _search;
    
    get diagnostics _count = row_count;
    
    return json_build_object(
        'count', _count,
        'customers', (
            select json_agg(sub)
            from ( 
            
                select 
                    customers.customer_id as id,
                    customers.name,
                    json_build_object(
                        'id', customers.address_id,
                        'street', street,
                        'city', json_build_object(
                            'id', cities.city_id,
                            'name', cities.name
                        )
                    ) as address,
                    count(*) as AddressCount
                from 
                    _temp_customers
                    join example.customers using (customer_id)
                    join example.addresses using (address_id)
                    join example.cities using (city_id)
                    join example.customer_addresses using (customer_id)
                where customers.name ilike '%john%'
                group by
                    customers.customer_id,
                    customers.name,
                    customers.address_id,
                    street,
                    cities.city_id,
                    cities.name
                order by customers.name 
                offset _skip limit _take

            ) sub
        )

    );
end
$$;
create or replace function example.method7(
    _search varchar, 
    _skip integer, 
    _take integer
) 
returns json
language plpgsql 
as $$
declare
    _count bigint;
begin
    create temp table _temp_customers on commit drop as
    select customer_id 
    from example.customers 
    where name ilike _search;
    
    get diagnostics _count = row_count;
    
    return json_build_object(
        'count', _count,
        'customers', (
            select json_agg(sub)
            from ( 
            
                select 
                    customers.customer_id as id,
                    customers.name,
                    json_build_object(
                        'id', customers.address_id,
                        'street', street,
                        'city', json_build_object(
                            'id', cities.city_id,
                            'name', cities.name
                        )
                    ) as address,
                    count(*) as AddressCount
                from 
                    _temp_customers
                    join example.customers using (customer_id)
                    join example.addresses using (address_id)
                    join example.cities using (city_id)
                    join example.customer_addresses using (customer_id)
                where customers.name ilike '%john%'
                group by
                    customers.customer_id,
                    customers.name,
                    customers.address_id,
                    street,
                    cities.city_id,
                    cities.name
                order by customers.name 
                offset _skip limit _take

            ) sub
        )

    );
end
$$;

source

Method 8

PLPGSQL function that returns JSON:

  • Create the TEMP table with the row number from filtered data and order by customer.
  • Get the count from the inserted diagnostics.
  • Join the TEMP table with the main query and use the row number to page the data.
  • Build a JSON response and deserialize it on the client.
  • Source Code
create or replace function example.method8(
    _search varchar, 
    _skip integer, 
    _take integer
) 
returns json
language plpgsql 
as $$
declare
    _count bigint;
begin
    create temp table _temp_customers on commit drop as
    select row_number() over() as row, customer_id 
    from example.customers 
    where name ilike _search
    order by customers.name;
    
    get diagnostics _count = row_count;
    
    return json_build_object(
        'count', _count,
        'customers', (
            select json_agg(sub)
            from ( 
            
                select 
                    customers.customer_id as id,
                    customers.name,
                    json_build_object(
                        'id', customers.address_id,
                        'street', street,
                        'city', json_build_object(
                            'id', cities.city_id,
                            'name', cities.name
                        )
                    ) as address,
                    count(*) as AddressCount
                from 
                    _temp_customers
                    join example.customers using (customer_id)
                    join example.addresses using (address_id)
                    join example.cities using (city_id)
                    join example.customer_addresses using (customer_id)
                where row > $2 and row <= $2 + $3
                group by
                    row,
                    customers.customer_id,
                    customers.name,
                    customers.address_id,
                    street,
                    cities.city_id,
                    cities.name
                order by row

            ) sub
        )

    );
end
$$;
create or replace function example.method8(
    _search varchar, 
    _skip integer, 
    _take integer
) 
returns json
language plpgsql 
as $$
declare
    _count bigint;
begin
    create temp table _temp_customers on commit drop as
    select row_number() over() as row, customer_id 
    from example.customers 
    where name ilike _search
    order by customers.name;
    
    get diagnostics _count = row_count;
    
    return json_build_object(
        'count', _count,
        'customers', (
            select json_agg(sub)
            from ( 
            
                select 
                    customers.customer_id as id,
                    customers.name,
                    json_build_object(
                        'id', customers.address_id,
                        'street', street,
                        'city', json_build_object(
                            'id', cities.city_id,
                            'name', cities.name
                        )
                    ) as address,
                    count(*) as AddressCount
                from 
                    _temp_customers
                    join example.customers using (customer_id)
                    join example.addresses using (address_id)
                    join example.cities using (city_id)
                    join example.customer_addresses using (customer_id)
                where row > $2 and row <= $2 + $3
                group by
                    row,
                    customers.customer_id,
                    customers.name,
                    customers.address_id,
                    street,
                    cities.city_id,
                    cities.name
                order by row

            ) sub
        )

    );
end
$$;

Method 9

PLPGSQL function that returns JSON:

  • Create the indexed TEMP table with the row number from filtered data and order by customer.
  • Get the count from the inserted diagnostics.
  • Create the BTREE index on the TEMP table.
  • Join the TEMP table with the main query and use the row number to page the data.
  • Build a JSON response and deserialize it on the client.
  • Source Code
create or replace function example.method9(
    _search varchar, 
    _skip integer, 
    _take integer
) 
returns json
language plpgsql 
as $$
declare
    _count bigint;
begin
    create temp table _temp_customers on commit drop as
    select row_number() over() as row, customer_id 
    from example.customers 
    where name ilike _search
    order by customers.name;
    
    get diagnostics _count = row_count;

    create index on _temp_customers using btree (row);
    
    return json_build_object(
        'count', _count,
        'customers', (
            select json_agg(sub)
            from ( 
            
                select 
                    customers.customer_id as id,
                    customers.name,
                    json_build_object(
                        'id', customers.address_id,
                        'street', street,
                        'city', json_build_object(
                            'id', cities.city_id,
                            'name', cities.name
                        )
                    ) as address,
                    count(*) as AddressCount
                from 
                    _temp_customers
                    join example.customers using (customer_id)
                    join example.addresses using (address_id)
                    join example.cities using (city_id)
                    join example.customer_addresses using (customer_id)
                where row > $2 and row <= $2 + $3
                group by
                    row,
                    customers.customer_id,
                    customers.name,
                    customers.address_id,
                    street,
                    cities.city_id,
                    cities.name
                order by row

            ) sub
        )

    );
end
$$;
create or replace function example.method9(
    _search varchar, 
    _skip integer, 
    _take integer
) 
returns json
language plpgsql 
as $$
declare
    _count bigint;
begin
    create temp table _temp_customers on commit drop as
    select row_number() over() as row, customer_id 
    from example.customers 
    where name ilike _search
    order by customers.name;
    
    get diagnostics _count = row_count;

    create index on _temp_customers using btree (row);
    
    return json_build_object(
        'count', _count,
        'customers', (
            select json_agg(sub)
            from ( 
            
                select 
                    customers.customer_id as id,
                    customers.name,
                    json_build_object(
                        'id', customers.address_id,
                        'street', street,
                        'city', json_build_object(
                            'id', cities.city_id,
                            'name', cities.name
                        )
                    ) as address,
                    count(*) as AddressCount
                from 
                    _temp_customers
                    join example.customers using (customer_id)
                    join example.addresses using (address_id)
                    join example.cities using (city_id)
                    join example.customer_addresses using (customer_id)
                where row > $2 and row <= $2 + $3
                group by
                    row,
                    customers.customer_id,
                    customers.name,
                    customers.address_id,
                    street,
                    cities.city_id,
                    cities.name
                order by row

            ) sub
        )

    );
end
$$;

Method 10

PLPGSQL function that returns TABLE:

  • Create the indexed TEMP table with the row number from filtered data and order by customer.
  • Get the count from the inserted diagnostics.
  • Create the BTREE index on the TEMP table.
  • Join the TEMP table with the main query and use the row number to page the data.
  • Return the table and map by position on the client.
  • Source Code
create or replace function example.method10(
    _search varchar, 
    _skip integer, 
    _take integer
) 
returns table (
    _customer_id int,
    _name text,
    _address_id int,
    _street text,
    _city_id int,
    _city_name text,
    _address_count bigint,
    _count bigint
)
language plpgsql 
as $$
declare
    _count bigint;
begin
    create temp table _temp_customers on commit drop as
    select row_number() over() as row, customer_id 
    from example.customers 
    where name ilike _search
    order by customers.name;
    
    get diagnostics _count = row_count;

    create index on _temp_customers using btree (row);
    
    return query
    select 
        customers.customer_id,
        customers.name,
        customers.address_id,
        street,
        cities.city_id,
        cities.name,
        count(*) as address_count,
        _count
    from 
        _temp_customers
        join example.customers using (customer_id)
        join example.addresses using (address_id)
        join example.cities using (city_id)
        join example.customer_addresses using (customer_id)
    where row > $2 and row <= $2 + $3
    group by
        row,
        customers.customer_id,
        customers.name,
        customers.address_id,
        street,
        cities.city_id,
        cities.name
    order by row;
end
$$;
create or replace function example.method10(
    _search varchar, 
    _skip integer, 
    _take integer
) 
returns table (
    _customer_id int,
    _name text,
    _address_id int,
    _street text,
    _city_id int,
    _city_name text,
    _address_count bigint,
    _count bigint
)
language plpgsql 
as $$
declare
    _count bigint;
begin
    create temp table _temp_customers on commit drop as
    select row_number() over() as row, customer_id 
    from example.customers 
    where name ilike _search
    order by customers.name;
    
    get diagnostics _count = row_count;

    create index on _temp_customers using btree (row);
    
    return query
    select 
        customers.customer_id,
        customers.name,
        customers.address_id,
        street,
        cities.city_id,
        cities.name,
        count(*) as address_count,
        _count
    from 
        _temp_customers
        join example.customers using (customer_id)
        join example.addresses using (address_id)
        join example.cities using (city_id)
        join example.customer_addresses using (customer_id)
    where row > $2 and row <= $2 + $3
    group by
        row,
        customers.customer_id,
        customers.name,
        customers.address_id,
        street,
        cities.city_id,
        cities.name
    order by row;
end
$$;

Method 11

SQL function that returns TABLE:

  • Use un-materialized CTE without ROW numbers.
  • Same as method 2, only in SQL function.
  • Notes: labeling function as STABLE or CTE as MATERIALIZED degrades performances.
  • Source Code
create or replace function example.method11(
    _search varchar, 
    _skip integer, 
    _take integer
) 
returns table (
    _customer_id int,
    _name text,
    _address_id int,
    _street text,
    _city_id int,
    _city_name text,
    _address_count bigint,
    _count bigint
)
language sql 
/*stable*/
as $$
with cte as /*materialized*/ (
    select customer_id 
    from example.customers 
    where name ilike _search
)
select 
    customers.customer_id,
    customers.name,
    customers.address_id,
    street,
    cities.city_id,
    cities.name,
    count(*) as address_count,
    (select count(*) from cte) as count
from 
    cte
    join example.customers using (customer_id)
    join example.addresses using (address_id)
    join example.cities using (city_id) 
    join example.customer_addresses using (customer_id)
group by
    customers.customer_id,
    customers.name,
    customers.address_id,
    street,
    cities.city_id,
    cities.name
order by customers.name 
offset _skip limit _take
$$;
create or replace function example.method11(
    _search varchar, 
    _skip integer, 
    _take integer
) 
returns table (
    _customer_id int,
    _name text,
    _address_id int,
    _street text,
    _city_id int,
    _city_name text,
    _address_count bigint,
    _count bigint
)
language sql 
/*stable*/
as $$
with cte as /*materialized*/ (
    select customer_id 
    from example.customers 
    where name ilike _search
)
select 
    customers.customer_id,
    customers.name,
    customers.address_id,
    street,
    cities.city_id,
    cities.name,
    count(*) as address_count,
    (select count(*) from cte) as count
from 
    cte
    join example.customers using (customer_id)
    join example.addresses using (address_id)
    join example.cities using (city_id) 
    join example.customer_addresses using (customer_id)
group by
    customers.customer_id,
    customers.name,
    customers.address_id,
    street,
    cities.city_id,
    cities.name
order by customers.name 
offset _skip limit _take
$$;

Method 12

SQL function that returns TABLE:

  • Use un-materialized CTE with ROW numbers.
  • Same as method 3, only in SQL function.
  • Notes: labeling function as STABLE or CTE as MATERIALIZED degrades performances.
  • Source Code
create or replace function example.method12(
    _search varchar, 
    _skip integer, 
    _take integer
) 
returns table (
    _customer_id int,
    _name text,
    _address_id int,
    _street text,
    _city_id int,
    _city_name text,
    _address_count bigint,
    _count bigint
)
language sql 
/*stable*/
as $$
with cte as /*materialized*/ (
    select row_number() over() as row, customer_id 
    from example.customers 
    where name ilike _search
    order by customers.name
)
select 
    customers.customer_id,
    customers.name,
    customers.address_id,
    street,
    cities.city_id,
    cities.name,
    count(*) as address_count,
    (select max(row) from cte) as count
from 
    cte
    join example.customers using (customer_id)
    join example.addresses using (address_id)
    join example.cities using (city_id) 
    join example.customer_addresses using (customer_id)
where row > _skip and row <= _skip + _take
group by
    row,
    customers.customer_id,
    customers.name,
    customers.address_id,
    street,
    cities.city_id,
    cities.name
order by row
$$;
create or replace function example.method12(
    _search varchar, 
    _skip integer, 
    _take integer
) 
returns table (
    _customer_id int,
    _name text,
    _address_id int,
    _street text,
    _city_id int,
    _city_name text,
    _address_count bigint,
    _count bigint
)
language sql 
/*stable*/
as $$
with cte as /*materialized*/ (
    select row_number() over() as row, customer_id 
    from example.customers 
    where name ilike _search
    order by customers.name
)
select 
    customers.customer_id,
    customers.name,
    customers.address_id,
    street,
    cities.city_id,
    cities.name,
    count(*) as address_count,
    (select max(row) from cte) as count
from 
    cte
    join example.customers using (customer_id)
    join example.addresses using (address_id)
    join example.cities using (city_id) 
    join example.customer_addresses using (customer_id)
where row > _skip and row <= _skip + _take
group by
    row,
    customers.customer_id,
    customers.name,
    customers.address_id,
    street,
    cities.city_id,
    cities.name
order by row
$$;

Method 13

SQL query:

  • Use un-materialized CTE without ROW numbers.
  • Same as method 2, only without materialization.
  • Source Code
with cte as (
    select customer_id 
    from example.customers 
    where name ilike $1
)
select 
    customers.customer_id,
    customers.name,
    customers.address_id,
    street,
    cities.city_id,
    cities.name,
    count(*) as address_count,
    (select count(*) from cte) as count
from 
    cte
    join example.customers using (customer_id)
    join example.addresses using (address_id)
    join example.cities using (city_id) 
    join example.customer_addresses using (customer_id)
group by
    customers.customer_id,
    customers.name,
    customers.address_id,
    street,
    cities.city_id,
    cities.name
order by customers.name 
offset $2 limit $3
with cte as (
    select customer_id 
    from example.customers 
    where name ilike $1
)
select 
    customers.customer_id,
    customers.name,
    customers.address_id,
    street,
    cities.city_id,
    cities.name,
    count(*) as address_count,
    (select count(*) from cte) as count
from 
    cte
    join example.customers using (customer_id)
    join example.addresses using (address_id)
    join example.cities using (city_id) 
    join example.customer_addresses using (customer_id)
group by
    customers.customer_id,
    customers.name,
    customers.address_id,
    street,
    cities.city_id,
    cities.name
order by customers.name 
offset $2 limit $3

Method 14

SQL query:

  • Use un-materialized CTE with ROW numbers.
  • Same as method 3, only without materialization.
  • Source Code
with cte as (
    select row_number() over() as row, customer_id 
    from example.customers 
    where name ilike $1
    order by customers.name
)
select 
    customers.customer_id,
    customers.name,
    customers.address_id,
    street,
    cities.city_id,
    cities.name,
    count(*) as address_count,
    (select max(row) from cte) as count
from 
    cte
    join example.customers using (customer_id)
    join example.addresses using (address_id)
    join example.cities using (city_id) 
    join example.customer_addresses using (customer_id)
where row > $2 and row <= $2 + $3
group by
    row,
    customers.customer_id,
    customers.name,
    customers.address_id,
    street,
    cities.city_id,
    cities.name
order by row
with cte as (
    select row_number() over() as row, customer_id 
    from example.customers 
    where name ilike $1
    order by customers.name
)
select 
    customers.customer_id,
    customers.name,
    customers.address_id,
    street,
    cities.city_id,
    cities.name,
    count(*) as address_count,
    (select max(row) from cte) as count
from 
    cte
    join example.customers using (customer_id)
    join example.addresses using (address_id)
    join example.cities using (city_id) 
    join example.customer_addresses using (customer_id)
where row > $2 and row <= $2 + $3
group by
    row,
    customers.customer_id,
    customers.name,
    customers.address_id,
    street,
    cities.city_id,
    cities.name
order by row

Results

Round1

TableCount
example.cities432,166
example.addresses3,996,491
example.customers500,000
example.customer_addresses3,996,491
MethodMeanErrorStdDevRatio
Method12.239 s0.0204 s0.0190 s1.00
Method22.267 s0.0168 s0.0157 s1.01
Method32.133 s0.0139 s0.0130 s0.95
Method42.273 s0.0120 s0.0113 s1.02
Method52.154 s0.0235 s0.0220 s0.96
Method62.137 s0.0157 s0.0147 s0.95
Method72.316 s0.0144 s0.0135 s1.03
Method82.136 s0.0178 s0.0167 s0.95
Method92.147 s0.0153 s0.0143 s0.96
  • Method 9 mistakenly was executing method 8, that result is not valid.

Round2

TableCount
example.cities450,762
example.addresses5,596,217
example.customers700,000
example.customer_addresses5,596,218
MethodMeanErrorStdDevRatioRatioSD
Method12.317 s0.0157 s0.0147 s1.000.00
Method22.339 s0.0128 s0.0113 s1.010.01
Method32.153 s0.0159 s0.0149 s0.930.01
Method42.387 s0.0456 s0.0593 s1.030.03
Method52.163 s0.0156 s0.0146 s0.930.01
Method62.167 s0.0105 s0.0093 s0.940.01
Method72.402 s0.0177 s0.0165 s1.040.01
Method82.159 s0.0147 s0.0137 s0.930.01
Method92.167 s0.0168 s0.0157 s0.940.01
Method102.160 s0.0146 s0.0130 s0.930.01
Method112.231 s0.0178 s0.0167 s0.960.01
Method122.176 s0.0422 s0.0414 s0.940.02
Method132.338 s0.0195 s0.0173 s1.010.01
Method142.156 s0.0121 s0.0113 s0.930.01

Conclusion

PostgreSQL 16 is incredibly optimized. I'm Gonna Need a Bigger Boat, I mean the dataset.

Edit, round2:

  • With a slightly bigger set difference is slightly bigger, but still in the milliseconds range and not worth optimizing further.
  • Surprisingly CTE methods are a bit slower when not materialized. Probably because they fit into memory, on bigger result sets this should not be the case.
  • Even more surprisingly, functions labeled as STABLE will experience a serious degradation in performance, which was unexpected.
  • SQL function in method 12 was even returning a wrong page when labeled as STABLE and was using MATERIALIZED CTE.

If anyone has a different method, let me know, I'd like to include that in tests too.

To receive notifications about new posts and updates, consider subscribing to my LinkdIn page:
vb-software linkedin

You will receive notifications about new posts on your LinkedIn feed.
Comments