Skip to content

[python] Make HTTP timeout/retry/keep-alive configurable via CatalogOptions#7732

Open
TheR1sing3un wants to merge 1 commit intoapache:masterfrom
TheR1sing3un:py-rest-http-config
Open

[python] Make HTTP timeout/retry/keep-alive configurable via CatalogOptions#7732
TheR1sing3un wants to merge 1 commit intoapache:masterfrom
TheR1sing3un:py-rest-http-config

Conversation

@TheR1sing3un
Copy link
Copy Markdown
Member

Purpose

The REST HttpClient currently hardcodes its retry count and ignores its own session timeout. Two practical issues fall out of this:

  1. Timeout is silently dropped. requests.Session.timeout is not consulted by the library — only session.request(timeout=...) is. The previous self.session.timeout = (180, 180) had no effect, so requests could hang indefinitely on a slow server.
  2. Retry counts are not tunable. Hardcoded max_retries=3 mixed connect / read retries; users could not disable connect retries (which often shouldn't retry) or boost read retries against flaky upstreams.

This PR introduces five CatalogOptions so REST behaviour can be tuned via standard catalog options:

Key Type Default Description
http.connect-timeout int (sec) 180 TCP connect timeout
http.read-timeout int (sec) 180 Response read timeout
http.max-connect-retries int 3 Retries for connect errors
http.max-read-retries int 3 Retries for read / status errors (429/502/503/504)
http.keep-alive bool true When false, sends Connection: close

HttpClient now accepts an optional Options argument, applies the timeout via session.request(timeout=...), separates connect / read retry counters in ExponentialRetry (with total=None so each type governs independently), and sets Connection: close when keep-alive is disabled. RESTApi forwards its options through. token_loader.py is updated to the new ExponentialRetry(connect_retries, read_retries) signature.

Linked issue

N/A — discovered while tuning REST clients against high-latency catalog servers and verifying that session.timeout is dead code.

Tests

  • pypaimon/tests/rest/test_exponential_retry_strategy.py — refreshed for the new signature; covers total=None, separated connect/read counters, zero-retries, and an end-to-end retry-on-connect-error case.
  • pypaimon/tests/rest/client_test.py — new HttpClientHttpOptionsTest:
    • Defaults applied when options=None
    • http.connect-timeout / http.read-timeout reach client._timeout
    • http.keep-alive=false sets Connection: close
    • http.max-connect-retries / http.max-read-retries reach the mounted adapter's Retry

Local: pytest pypaimon/tests/rest/client_test.py pypaimon/tests/rest/test_exponential_retry_strategy.py → 10 passed; flake8 --config=dev/cfg.ini clean.

API and format

HttpClient.__init__ adds an optional options kwarg (backward compatible — existing HttpClient(uri) callers still work and use the same default behaviour as before, except the timeout is now actually honoured).

ExponentialRetry.__init__ now takes connect_retries / read_retries instead of a single max_retries. The only internal caller (token_loader.py) is updated; no public API for ExponentialRetry.

No file format change.

Documentation

Option keys are self-described via with_description(...) in CatalogOptions. No additional doc change required.

Generative AI disclosure

Drafted with assistance from an AI coding tool; all logic reviewed by the author and validated by the tests above.

…ptions

The REST HttpClient hardcoded its retry count and ignored its own
session timeout (the requests library does not honour Session.timeout;
the timeout must be passed to session.request()). Introduce five
CatalogOptions so users can tune REST behaviour without monkey-patching:

  * http.connect-timeout       (int, default 180)
  * http.read-timeout          (int, default 180)
  * http.max-connect-retries   (int, default 3)
  * http.max-read-retries      (int, default 3)
  * http.keep-alive            (bool, default true)

HttpClient now accepts an optional Options argument, reads these keys,
applies the timeout via session.request(timeout=...), separates connect
and read retry counters in ExponentialRetry (total=None lets each type
of retry govern independently), and emits Connection: close when
keep-alive is disabled. RESTApi forwards its options through.

Update token_loader.py to use the new ExponentialRetry signature, and
add HttpClient option-coverage tests alongside the retry tests.
@JingsongLi
Copy link
Copy Markdown
Contributor

Why can't it be consistent with Java implementation?

@TheR1sing3un
Copy link
Copy Markdown
Member Author

Why can't it be consistent with Java implementation?

Thanks for the review @JingsongLi.

I do want to keep both the bug fix and the new options in this PR. The reason isn't just convenience — without these as catalog options the client's actual behaviour is not under our control. requests falls back to the OS / kernel for
things we can't choose at the application level: TCP connect retries (net.ipv4.tcp_syn_retries), keep-alive probe cadence, socket-level timeouts, etc. Today on the Python side self.session.timeout = (180, 180) is dead code
(Session.timeout isn't honoured), so:

  • connect/read timeouts are effectively unbounded — a slow or hung server can wedge the client indefinitely;
  • retry behaviour is whatever urllib3/the kernel decide, not what we promise.

Exposing the five keys (http.connect-timeout / http.read-timeout / http.max-connect-retries / http.max-read-retries / http.keep-alive) is what lets us pin those down deterministically per-catalog. The defaults already match Java's
hardcoded values (180s / 180s / 5 retries / keep-alive on), so out-of-the-box behaviour stays equivalent — only operators who today have no knob gain one.

On the Java consistency point: I take that seriously. My plan is, once this PR lands, to follow up with a dedicated PR that introduces the same option keys on the Java side (extending RESTCatalogOptions and wiring them through
HttpClientUtils), so the two SDKs converge on the same configuration surface. That follow-up needs its own Java-side validation (HttpClient5 test coverage etc.) which I'd rather not bundle into a Python PR.

If that two-step plan works for you, I'll keep this PR as-is. Otherwise happy to discuss alternatives.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants