Skip to content

Backport 14 7 - 14 8 changes range in REL_2_STABLE#1701

Draft
reshke wants to merge 17 commits intoapache:REL_2_STABLEfrom
reshke:14_7_14_8
Draft

Backport 14 7 - 14 8 changes range in REL_2_STABLE#1701
reshke wants to merge 17 commits intoapache:REL_2_STABLEfrom
reshke:14_7_14_8

Conversation

@reshke
Copy link
Copy Markdown
Contributor

@reshke reshke commented Apr 27, 2026

Fixes #ISSUE_Number

What does this PR do?

Type of Change

  • Bug fix (non-breaking change)
  • New feature (non-breaking change)
  • Breaking change (fix or feature with breaking changes)
  • Documentation update

Breaking Changes

Test Plan

  • Unit tests added/updated
  • Integration tests added/updated
  • Passed make installcheck
  • Passed make -C src/test installcheck-cbdb-parallel

Impact

Performance:

User-facing changes:

Dependencies:

Checklist

Additional Context

CI Skip Instructions


michaelpq and others added 16 commits April 23, 2026 09:51
Try to disable ASLR when building in EXEC_BACKEND mode, to avoid random
memory mapping failures while testing.  For developer use only, no
effect on regular builds.

This has been originally applied as of f3e7806 for v15~, but
recently-added buildfarm member gokiburi tests this configuration on
older branches as well, causing it to fail randomly as ASLR would be
enabled.

Suggested-by: Andres Freund <andres@anarazel.de>
Tested-by: Bossart, Nathan <bossartn@amazon.com>
Discussion: https://postgr.es/m/20210806032944.m4tz7j2w47mant26%40alap3.anarazel.de
Backpatch-through: 12
It appears no longer possible to build the SGML docs without a local
installation of the DocBook DTD, because sourceforge.net now only
permits HTTPS access, and no common version of xsltproc supports that.
Hence, remove the bits of our documentation suggesting that that's
possible or useful.

In fact, we might as well add the --nonet option to the build recipes
automatically, for a bit of extra security.

Also fix our documentation-tool-installation recipes for macOS to
ensure that xmllint and xsltproc are pulled in from MacPorts or
Homebrew.  The previous recipes assumed you could use the
Apple-supplied versions of these tools; which still works, except that
you'd need to set an environment variable to ensure that they would
find DTD files provided by those package managers.  Simpler and easier
to just recommend pulling in the additional packages.

In HEAD, also document how to build docs using Meson, and adjust
"ninja docs" to just build the HTML docs, for consistency with the
default behavior of doc/src/sgml/Makefile.

In a fit of neatnik-ism, I also made the ordering of the package
lists match the order in which the tools are described at the head
of the appendix.

Aleksander Alekseev, Peter Eisentraut, Tom Lane

Discussion: https://postgr.es/m/CAJ7c6TO8Aro2nxg=EQsVGiSDe-TstP4EsSvDHd7DSRsP40PgGA@mail.gmail.com
When an aggregate function is used as a WindowFunc and a tuple transitions
out of the window frame, we ordinarily try to make use of the aggregate
function's inverse transition function to "unaggregate" the exiting tuple.

This optimization is disabled for various cases, including when the
aggregate contains a volatile function.  In such a case we'd be unable to
ensure that the transition value was calculated to the same value during
transitions and inverse transitions.  Unfortunately, we did this check by
calling contain_volatile_functions() which does not recursively search
SubPlans for volatile functions.  If the aggregate function's arguments or
its FILTER clause contained a subplan with volatile functions then we'd
fail to notice this.

Here we fix this by just disabling the optimization when the WindowFunc
contains any subplans.  Volatile functions are not the only reason that a
subplan may have nonrepeatable results.

Bug: #17777
Reported-by: Anban Company
Discussion: https://postgr.es/m/17777-860b739b6efde977%40postgresql.org
Reviewed-by: Tom Lane
Backpatch-through: 11
…ates

OpenSSL 1.1.1 and newer versions have added support for RSA-PSS
certificates, which requires the use of a specific routine in OpenSSL to
determine which hash function to use when compiling it when using
channel binding in SCRAM-SHA-256.  X509_get_signature_nid(), that is the
original routine the channel binding code has relied on, is not able to
determine which hash algorithm to use for such certificates.  However,
X509_get_signature_info(), new to OpenSSL 1.1.1, is able to do it.  This
commit switches the channel binding logic to rely on
X509_get_signature_info() over X509_get_signature_nid(), which would be
the choice when building with 1.1.1 or newer.

The error could have been triggered on the client or the server, hence
libpq and the backend need to have their related code paths patched.
Note that attempting to load an RSA-PSS certificate with OpenSSL 1.1.0
or older leads to a failure due to an unsupported algorithm.

The discovery of relying on X509_get_signature_info() comes from Jacob,
the tests have been written by Heikki (with few tweaks from me), while I
have bundled the whole together while adding the bits needed for MSVC
and meson.

This issue exists since channel binding exists, so backpatch all the way
down.  Some tests are added in 15~, triggered if compiling with OpenSSL
1.1.1 or newer, where the certificate and key files can easily be
generated for RSA-PSS.

Reported-by: Gunnar "Nick" Bluth
Author: Jacob Champion, Heikki Linnakangas
Discussion: https://postgr.es/m/17760-b6c61e752ec07060@postgresql.org
Backpatch-through: 11
ruleutils.c blindly printed the user-given alias (or nothing if there
hadn't been one) for the target table of INSERT/UPDATE/DELETE queries.
That works a large percentage of the time, but not always: for queries
appearing in WITH, it's possible that we chose a different alias to
avoid conflict with outer-scope names.  Since the chosen alias would
be used in any Var references to the target table, this'd lead to an
inconsistent printout with consequences such as dump/restore failures.

The correct logic for printing (or not) a relation alias was embedded
in get_from_clause_item.  Factor it out to a separate function so that
we don't need a jointree node to use it.  (Only a limited part of that
function can be reached from these new call sites, but this seems like
the cleanest non-duplicative factorization.)

In passing, I got rid of a redundant "\d+ rules_src" step in rules.sql.

Initial report from Jonathan Katz; thanks to Vignesh C for analysis.
This has been broken for a long time, so back-patch to all supported
branches.

Discussion: https://postgr.es/m/e947fa21-24b2-f922-375a-d4f763ef3e4b@postgresql.org
Discussion: https://postgr.es/m/CALDaNm1MMntjmT_NJGp-Z=xbF02qHGAyuSHfYHias3TqQbPF2w@mail.gmail.com
When evaluating clauses on multiple scan keys of a multi-column BRIN
index, we can stop processing as soon as we find a scan key eliminating
the range, and the range should not be added to tbe bitmap.

That's how it worked before 14, but since a681e3c the code treated
the range as matching if it matched at least the last scan key.

Backpatch to 14, where this code was introduced.

Backpatch-through: 14
Discussion: https://postgr.es/m/ebc18613-125e-60df-7520-fcbe0f9274fc%40enterprisedb.com
Failing to do so results in an error when a pgbench script tries to
start a serializable transaction inside a pipeline, because by the time
BEGIN ISOLATION LEVEL SERIALIZABLE is executed, we're already in a
transaction that has acquired a snapshot, so the server rightfully
complains.

We can work around that by preparing all commands in the pipeline before
actually starting the pipeline.  This changes the existing code in two
aspects: first, we now prepare each command individually at the point
where that command is about to be executed; previously, we would prepare
all commands in a script as soon as the first command of that script
would be executed.  It's hard to see that this would make much of a
difference (particularly since it only affects the first time to execute
each script in a client), but I didn't actually try to measure it.

Secondly, we no longer use PQsendPrepare() in pipeline mode, but only
PQprepare.  There's no specific reason for this change other than no
longer needing to do differently in pipeline mode.  (Previously we had
no choice, because in pipeline mode PQprepare could not be used.)

Backpatch to 14, where pgbench got support for pipeline mode.

Reported-by: Yugo NAGATA <nagata@sraoss.co.jp>
Discussion: https://postgr.es/m/20210716153013.fc53b1c780b06fccc07a7f0d@sraoss.co.jp
SPI_result_code_string() was missing support for SPI_OK_TD_REGISTER,
and in v15 and later, it was missing support for SPI_OK_MERGE, as was
pltcl_process_SPI_result().

The last of those would trigger an error if a MERGE was executed from
PL/Tcl. The others seem fairly innocuous, but worth fixing.

Back-patch to all supported branches. Before v15, this is just adding
SPI_OK_TD_REGISTER to SPI_result_code_string(), which is unlikely to
be seen by anyone, but seems worth doing for completeness.

Reviewed by Tom Lane.

Discussion:
  https://postgr.es/m/CAEZATCUg8V%2BK%2BGcafOPqymxk84Y_prXgfe64PDoopjLFH6Z0Aw%40mail.gmail.com
  https://postgr.es/m/CAEZATCUMe%2B_KedPMM9AxKqm%3DSZogSxjUcrMe%2BsakusZh3BFcQw%40mail.gmail.com
Whe decoding a transactional logical message, logicalmsg_decode called
SnapBuildGetOrBuildSnapshot. But we may not have a consistent snapshot
yet at that point. We don't actually need the snapshot in this case
(during replay we'll have the snapshot from the transaction), so in
practice this is harmless. But in assert-enabled build this crashes.

Fixed by requesting the snapshot only in non-transactional case, where
we are guaranteed to have SNAPBUILD_CONSISTENT.

Backpatch to 11. The issue exists since 9.6.

Backpatch-through: 11
Reviewed-by: Andres Freund
Discussion: https://postgr.es/m/84d60912-6eab-9b84-5de3-41765a5449e8@enterprisedb.com
Multiple cycles of starting up and shutting down the plugin within a
single session would eventually lead to "out of relcache_callback_list
slots", because pgoutput_startup blindly re-registered its cache
callbacks each time.  Fix it to register them only once, as all other
users of cache callbacks already take care to do.

This has been broken all along, so back-patch to all supported branches.

Shi Yu

Discussion: https://postgr.es/m/OSZPR01MB631004A78D743D68921FFAD3FDA79@OSZPR01MB6310.jpnprd01.prod.outlook.com
If a rule action contains a subquery that refers to columns from OLD
or NEW, then those are really lateral references, and the planner will
complain if it sees such things in a subquery that isn't marked as
lateral. However, at rule-definition time, the user isn't required to
mark the subquery with LATERAL, and so it can fail when the rule is
used.

Fix this by marking such subqueries as lateral in the rewriter, at the
point where they're used.

Dean Rasheed and Tom Lane, per report from Alexander Lakhin.
Back-patch to all supported branches.

Discussion: https://postgr.es/m/5e09da43-aaba-7ea7-0a51-a2eb981b058b%40gmail.com
It's been this way for a very long time, but it appears to have been
masking an issue that only manifests with different settings. Therefore,
run the tests in the installation's default encoding/locale.

Backpatch to all live branches.
postgres_fdw will close its remote session if an sinval cache reset
occurs, since it's possible that that means some FDW parameters
changed.  We had two tests that were trying to ensure that the
session remains alive by setting debug_discard_caches = 0; but
that's not sufficient.  Even though the tests seem stable enough
in the buildfarm, they flap a lot under CI.

In the first test, which is checking the ability to recover from
a lost connection, we can stabilize the results by just not
caring whether pg_terminate_backend() finds a victim backend.
If a reset did happen, there won't be a session to terminate
anymore, but the test can proceed anyway.  (Arguably, we are
then not testing the unintentional-disconnect case, but as long
as that scenario is exercised in most runs I think it's fine;
testing the reset-driven case is of value too.)

In the second test, which is trying to verify the application_name
displayed in pg_stat_activity by a remote session, we had a race
condition in that the remote session might go away before we can
fetch its pg_stat_activity entry.  We can close that race and make
the test more certainly test what it intends to by arranging things
so that the remote session itself fetches its pg_stat_activity entry
(based on PID rather than a somewhat-circular assumption about the
application name).

Both tests now demonstrably pass under debug_discard_caches = 1,
so we can remove that hack.

Back-patch into relevant back branches.

Discussion: https://postgr.es/m/20230226194340.u44bkfgyz64c67i6@awork3.anarazel.de
This is usually harmless, but if you were very unlucky it could
provoke a segfault due to the "to" string being right up against
the end of memory.  Found via valgrind testing (so we might've
found it earlier, except that our regression tests lacked any
exercise of translate()'s deletion feature).

Fix by switching the order of the test-for-end-of-string and
advance-pointer steps.  While here, compute "to_ptr + tolen"
just once.  (Smarter compilers might figure that out for
themselves, but let's just make sure.)

Report and fix by Daniil Anisimov, in bug #17816.

Discussion: https://postgr.es/m/17816-70f3d2764e88a108@postgresql.org
Attempting to use this function with a raw page not coming from a GiST
index would cause a crash, as it was missing the same sanity checks as
gist_page_items_bytea().  This slightly refactors the code so as all the
basic validation checks for GiST pages are done in a single routine,
in the same fashion as the pageinspect functions for hash and BRIN.

This fixes an issue similar to 076f4d9.  A test is added to stress for
this case.  While on it, I have added a similar test for
brin_page_items() with a combination make of a valid GiST index and a
raw btree page.  This one was already protected, but it was not tested.

Reported-by: Egor Chindyaskin
Author: Dmitry Koval
Discussion: https://postgr.es/m/17815-fc4a2d3b74705703@postgresql.org
Backpatch-through: 14
1.  Make sure that we don't decrement SxactGlobalXminCount twice when
the SXACT_FLAG_RO_SAFE optimization is reached in a parallel query.
This could trigger a sanity check failure in assert builds.  Non-assert
builds recompute the count in SetNewSxactGlobalXmin(), so the problem
was hidden, explaining the lack of field reports.  Add a new isolation
test to exercise that case.

2.  Remove an assertion that the DOOMED flag can't be set on a partially
released SERIALIZABLEXACT.  Instead, ignore the flag (our transaction
was already determined to be read-only safe, and DOOMED is in fact set
during partial release, and there was already an assertion that it
wasn't set sooner).  Improve an existing isolation test so that it
reaches that case (previously it wasn't quite testing what it was
supposed to be testing; see discussion).

Back-patch to 12.  Bug #17116.  Defects in commit 47a338c.

Reported-by: Alexander Lakhin <exclusion@gmail.com>
Discussion: https://postgr.es/m/17116-d6ca217acc180e30%40postgresql.org
@reshke reshke changed the title 14 7 - 14 8 Backport 14 7 - 14 8 changes range in REL_2_STABLE Apr 27, 2026
If a view is defined atop another view, and then CREATE OR REPLACE
VIEW is used to add columns to the lower view, then when the upper
view's referencing RTE is expanded by ApplyRetrieveRule we will have
a subquery RTE with fewer eref->colnames than output columns.  This
confuses various code that assumes those lists are always in sync,
as they are in plain parser output.

We have seen such problems before (cf commit d5b760e), and now
I think the time has come to do what was speculated about in that
commit: let's make ApplyRetrieveRule synthesize some column names to
preserve the invariant that holds in parser output.  Otherwise we'll
be chasing this class of bugs indefinitely.  Moreover, it appears from
testing that this actually gives us better results in the test case
d5b760e added, and likely in other corner cases that we lack
coverage for.

In HEAD, I replaced d5b760e's hack to make expandRTE exit early with
an elog(ERROR) call, since the case is now presumably unreachable.
But it seems like changing that in back branches would bring more risk
than benefit, so there I just updated the comment.

Per bug #17811 from Alexander Lakhin.  Back-patch to all supported
branches.

Discussion: https://postgr.es/m/17811-d31686b78f0dffc9@postgresql.org
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

9 participants