[python] Make HTTP timeout/retry/keep-alive configurable via CatalogOptions by TheR1sing3un · Pull Request #7732 · apache/paimon

TheR1sing3un · 2026-04-29T10:30:43Z

Purpose

The REST HttpClient currently hardcodes its retry count and ignores its own session timeout. Two practical issues fall out of this:

Timeout is silently dropped. requests.Session.timeout is not consulted by the library — only session.request(timeout=...) is. The previous self.session.timeout = (180, 180) had no effect, so requests could hang indefinitely on a slow server.
Retry counts are not tunable. Hardcoded max_retries=3 mixed connect / read retries; users could not disable connect retries (which often shouldn't retry) or boost read retries against flaky upstreams.

This PR introduces five CatalogOptions so REST behaviour can be tuned via standard catalog options:

Key	Type	Default	Description
`http.connect-timeout`	int (sec)	180	TCP connect timeout
`http.read-timeout`	int (sec)	180	Response read timeout
`http.max-connect-retries`	int	3	Retries for connect errors
`http.max-read-retries`	int	3	Retries for read / status errors (429/502/503/504)
`http.keep-alive`	bool	true	When false, sends `Connection: close`

HttpClient now accepts an optional Options argument, applies the timeout via session.request(timeout=...), separates connect / read retry counters in ExponentialRetry (with total=None so each type governs independently), and sets Connection: close when keep-alive is disabled. RESTApi forwards its options through. token_loader.py is updated to the new ExponentialRetry(connect_retries, read_retries) signature.

Linked issue

N/A — discovered while tuning REST clients against high-latency catalog servers and verifying that session.timeout is dead code.

Tests

pypaimon/tests/rest/test_exponential_retry_strategy.py — refreshed for the new signature; covers total=None, separated connect/read counters, zero-retries, and an end-to-end retry-on-connect-error case.
pypaimon/tests/rest/client_test.py — new HttpClientHttpOptionsTest:
- Defaults applied when options=None
- http.connect-timeout / http.read-timeout reach client._timeout
- http.keep-alive=false sets Connection: close
- http.max-connect-retries / http.max-read-retries reach the mounted adapter's Retry

Local: pytest pypaimon/tests/rest/client_test.py pypaimon/tests/rest/test_exponential_retry_strategy.py → 10 passed; flake8 --config=dev/cfg.ini clean.

API and format

HttpClient.__init__ adds an optional options kwarg (backward compatible — existing HttpClient(uri) callers still work and use the same default behaviour as before, except the timeout is now actually honoured).

ExponentialRetry.__init__ now takes connect_retries / read_retries instead of a single max_retries. The only internal caller (token_loader.py) is updated; no public API for ExponentialRetry.

No file format change.

Documentation

Option keys are self-described via with_description(...) in CatalogOptions. No additional doc change required.

Generative AI disclosure

Drafted with assistance from an AI coding tool; all logic reviewed by the author and validated by the tests above.

…ptions The REST HttpClient hardcoded its retry count and ignored its own session timeout (the requests library does not honour Session.timeout; the timeout must be passed to session.request()). Introduce five CatalogOptions so users can tune REST behaviour without monkey-patching: * http.connect-timeout (int, default 180) * http.read-timeout (int, default 180) * http.max-connect-retries (int, default 3) * http.max-read-retries (int, default 3) * http.keep-alive (bool, default true) HttpClient now accepts an optional Options argument, reads these keys, applies the timeout via session.request(timeout=...), separates connect and read retry counters in ExponentialRetry (total=None lets each type of retry govern independently), and emits Connection: close when keep-alive is disabled. RESTApi forwards its options through. Update token_loader.py to use the new ExponentialRetry signature, and add HttpClient option-coverage tests alongside the retry tests.

JingsongLi · 2026-04-29T14:53:39Z

Why can't it be consistent with Java implementation?

TheR1sing3un · 2026-04-29T15:14:08Z

Why can't it be consistent with Java implementation?

Thanks for the review @JingsongLi.

I do want to keep both the bug fix and the new options in this PR. The reason isn't just convenience — without these as catalog options the client's actual behaviour is not under our control. requests falls back to the OS / kernel for
things we can't choose at the application level: TCP connect retries (net.ipv4.tcp_syn_retries), keep-alive probe cadence, socket-level timeouts, etc. Today on the Python side self.session.timeout = (180, 180) is dead code
(Session.timeout isn't honoured), so:

connect/read timeouts are effectively unbounded — a slow or hung server can wedge the client indefinitely;
retry behaviour is whatever urllib3/the kernel decide, not what we promise.

Exposing the five keys (http.connect-timeout / http.read-timeout / http.max-connect-retries / http.max-read-retries / http.keep-alive) is what lets us pin those down deterministically per-catalog. The defaults already match Java's
hardcoded values (180s / 180s / 5 retries / keep-alive on), so out-of-the-box behaviour stays equivalent — only operators who today have no knob gain one.

On the Java consistency point: I take that seriously. My plan is, once this PR lands, to follow up with a dedicated PR that introduces the same option keys on the Java side (extending RESTCatalogOptions and wiring them through
HttpClientUtils), so the two SDKs converge on the same configuration surface. That follow-up needs its own Java-side validation (HttpClient5 test coverage etc.) which I'd rather not bundle into a Python PR.

If that two-step plan works for you, I'll keep this PR as-is. Otherwise happy to discuss alternatives.

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

[python] Make HTTP timeout/retry/keep-alive configurable via CatalogOptions#7732

[python] Make HTTP timeout/retry/keep-alive configurable via CatalogOptions#7732
TheR1sing3un wants to merge 1 commit intoapache:masterfrom
TheR1sing3un:py-rest-http-config

TheR1sing3un commented Apr 29, 2026

Uh oh!

JingsongLi commented Apr 29, 2026

Uh oh!

TheR1sing3un commented Apr 29, 2026

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

2 participants

Conversation

TheR1sing3un commented Apr 29, 2026

Purpose

Linked issue

Tests

API and format

Documentation

Generative AI disclosure

Uh oh!

JingsongLi commented Apr 29, 2026

Uh oh!

TheR1sing3un commented Apr 29, 2026

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

2 participants