Skip to content

Cross-shard SELECT

A cross-shard SELECT query doesn't have one or has several sharding keys, which requires it to be executed by all shards. PgDog can perform it in parallel, assembling the results from each shard automatically. This makes it a powerful scatter/gather engine, with data nodes powered by PostgreSQL.

How it works

When PgDog receives a SELECT query with no (or multiple) sharding keys, it connects to all databases and sends the query to all of them in parallel.

If the result needs post-processing, e.g., to support sorting or aggregation, it will buffer the rows in memory and perform the necessary operations. Otherwise, PgDog will stream the rows directly to the client.

Predicate push-down

PgDog pushes all filtering, sorting and aggregation statements to the database. If the query is correctly constructed, the shards will return very few rows, allowing searches of vast quantities of data without causing out-of-memory errors or latency issues in the proxy.

Supported features

The SQL language allows for powerful data filtration and manipulation. While we aim to support most operations, currently, support for some cross-shard operations is limited as documented below:

Operation Supported Limitations
SELECT columns None.
ORDER BY Columns must be part of the returned tuples. See sorting.
DISTINCT / DISTINCT BY Columns must be part of the returned tuples.
GROUP BY Columns must be part of the returned tuples. See aggregates.
CTEs CTE must refer to data located on the same shard.
Window functions Not currently supported.
Subqueries Subqueries must refer to data located on the same shard.

Sorting

If the query contains an ORDER BY clause, PgDog can sort the rows returned from all shards automatically. This works by buffering data returned from all servers and sorting it in memory.

Currently, two forms of the ORDER BY SQL syntax are supported:

Syntax Example Notes
Order by column name ORDER BY id, email The column must be present in the returned tuples.
Order by column position ORDER BY 1, 2 The column is referenced by its position in the returned tuples.

Sorting by multiple columns is supported, including opposing sorting directions, for example:

SELECT id, email, created_at FROM users
ORDER BY
    1 ASC,
    created_at DESC

Note that columns in the ORDER BY clause are retrieved from the table. PgDog cannot sort by columns it doesn't receive from the databases.

Functions

PgDog currently doesn't support sorting results using a function, for example:

SELECT email FROM users ORDER BY coalesce(email, '')

To make this work, we need to implement all SQL functions inside PgDog. This is on the roadmap, but not currently a priority since the query can be easily rewritten to execute the function inside the database:

SELECT
    coalesce(email, '') -- executed by Postgres
FROM users
ORDER BY 1 -- sorted by both Postgres and PgDog

As a general rule, it's better to perform all data manipulations inside the target clause (SELECT [...]) and use the numbering notation in the ORDER BY clause to refer to the desired sorting order, e.g.:

SELECT
    id,
    email,
    first_name || last_name,
    substring(first_name FROM 1 FOR 5) AS short
FROM users ORDER BY
    3 ASC, -- first_name || last_name
    1 DESC -- id

ORMs

ORMs like SQLAlchemy or ActiveRecord, more often than not, will write queries that work with PgDog out of the box. This is because they tend to fetch entire rows and use fully-qualified names in all parts of the statement, including the ORDER BY clause.

For example, this is what a first Rails/ActiveRecord query looks like:

SELECT * FROM users ORDER BY users.id LIMIT 1

The users.id column is present in the returned row, so PgDog can read it and sort the rows in the desired order.

Aggregates

Aggregates are transformative functions: instead of returning rows as-is, they return calculated summaries, like a sum or a count. Many aggregate functions are cumulative: the final value can be calculated from partial results returned by each shard.

If an aggregate function is supported (see list of supported functions below), this is handled by PgDog automatically:

Aggregate functions Supported Notes
count, count(*) Works for most data types.
max, min, avg, sum Works for most data types.
stddev, variance Works for most data types.
percentile_disc, percentile_cont Not currently supported and very expensive to calculate on large datasets.
*_agg Not currently supported.
json_* Not currently supported.

Aggregate functions have to appear in the target clause of the statement (SELECT [...]), and can also be combined with sorting, for example:

SELECT COUNT(*), is_admin
FROM users
GROUP BY 2
ORDER BY 1 DESC

HAVING clause

The HAVING clause requires additional filtering of the results and is not currently supported. See #695 for more details.

Rewriting queries

For some aggregate functions to work as expected, each shard may need to return columns and intermediate calculations not present in the original query.

For example, to get an average of a column, we need to fetch the row count, multiply it by the avg of the column on each shard, and divide it by the total count of rows on all shards.

If the count function isn't present in the query, PgDog will automatically rewrite the query to add it. This allows queries, like the following example, to just work without modifications:

SELECT avg(price) FROM orders;
SELECT avg(price), count(price) FROM orders;

PgDog automatically removes the count column from the returned rows, so applications don't have to handle this complexity.

The following aggregate functions take advantage of this feature:

  • avg
  • stddev
  • variance

Configuration

This feature is enabled by default and requires no additional configuration.

Supported data types

The following table lists the data types supported by PgDog for ordering and aggregation. Since Postgres clients can request results in either text or binary format, each one must be handled separately:

Data type Sorting Aggregation Text format Binary format
BIGINT, INTEGER, SMALLINT
VARCHAR, TEXT
NUMERIC
REAL, DOUBLE PRECISION
INTERVAL No
TIMESTAMP, TIMESTAMPTZ No
UUID
VECTOR Only by L2

pgvector data types

VECTOR type doesn't have a fixed OID in Postgres because it comes from an extension (pgvector). We infer it from the <-> operator used in the ORDER BY clause.