fix(sync): Prevent indefinite hang on credential prompts and slow fetches#544
Draft
fix(sync): Prevent indefinite hang on credential prompts and slow fetches#544
Conversation
Codecov Report❌ Patch coverage is
Additional details and impacted files@@ Coverage Diff @@
## master #544 +/- ##
==========================================
- Coverage 84.27% 83.61% -0.66%
==========================================
Files 29 29
Lines 3809 4114 +305
Branches 758 813 +55
==========================================
+ Hits 3210 3440 +230
- Misses 377 440 +63
- Partials 222 234 +12 ☔ View full report in Codecov by Sentry. 🚀 New features to boost your workflow:
|
… slow fetches why: vcspull sync hangs indefinitely when a repo needs credentials (git waits on stdin) or when fetching very large repos (e.g. 15GB rust-lang/rust). Sequential processing means one stuck repo blocks the entire batch. what: - Set GIT_TERMINAL_PROMPT=0 at sync() entry to make git fail fast on auth - Add 120s timeout to _maybe_fetch subprocess.run() call - Handle subprocess.TimeoutExpired with descriptive error message - Pass no-prompt env to _maybe_fetch subprocess calls - Add tests for timeout handling, env propagation, and prompt suppression Refs: #543
9fe9c4f to
3d5bafa
Compare
why: This branch needs both of the libvcs 0.40.0a0 sync-hang fixes to do its job: the `timeout=` kwarg on `libvcs._internal.run.run` (for per-subprocess deadlines) and the faster `GitSync.remote()` path that no longer enumerates every remote ref via `git remote show`. The pin is widened to allow any 0.40.x pre-release so test environments can pick up further patches without a follow-up bump. what: - Widen libvcs pin from `~=0.39.0` to `>=0.40.0a0,<0.41.0`. - Rename the libvcs pytest fixture type from `CreateRepoPytestFixtureFn` to the new `CreateRepoFn` alias across tests/test_sync.py and tests/test_cli.py. - Add a CHANGES entry describing the dependency bump and why it matters.
why: Before this change, a single stuck repository -- e.g. openai/codex under ~/study/rust/, which pipes thousands of remote-show lines through libvcs's busy-poll loop -- would block the entire batch of `vcspull sync` commands. The prior branch commit only armed git's credential-prompt env vars and timed out the *planning* fetch; the main `update_repo()` call still ran without a deadline. This wires a wall-clock watchdog around every repository so the batch keeps moving when one repo is slow or wedged, and surfaces an actionable rerun recipe so the user can diagnose without hunting for CLI flags. what: - Add `_SyncOutcome` + `_TimedOutRepo` dataclasses and a `_sync_repo_with_watchdog` primitive that runs `update_repo` on a single worker and enforces the timeout via `future.result(timeout=...)`. The executor is shut down with `wait=False` on every return path so a wedged libvcs subprocess is not allowed to block the main thread. - Add `--timeout SECONDS` CLI flag and `VCSPULL_SYNC_TIMEOUT_SECONDS` env override. Default is 10s, intentionally aggressive so hangs are surfaced fast. - Track timed-out repositories separately in the summary and emit a copy-pasteable rerun recipe grouped by workspace root. The suggested replacement timeout is `max(120, current * 10)` so a user who keeps the default gets a meaningful headroom on retry. - Plumb the new kwarg through `vcspull.cli.__init__:sync(...)`. - New `tests/cli/test_sync_watchdog.py` covering timeout resolution precedence, the synced/failed/timed-out outcome matrix, the rerun-recipe contents (workspace grouping, suggested timeout, verbose variant, manual git probe, log-file path), and the no-op branch when nothing timed out. - CHANGES entry under 1.59.x.
…facing why: When vcspull sync stalled on a repo and the watchdog abandoned it, the user had no post-mortem trail -- the only signal was the terminal line that said "timed out". npm/pnpm/yarn solve this by always writing a debug log to the temp dir and only surfacing the path on failure, so clean runs stay quiet but failed ones give the user something to grep. This wires the same pattern into vcspull sync. what: - Add `default_debug_log_path`, `setup_file_logger`, and `teardown_file_logger` in vcspull.log. The file handler attaches to both the vcspull and libvcs loggers at DEBUG level, so libvcs's per-repo progress output is captured alongside vcspull's own structured events. Duplicate registration is deduped by baseFilename so multiple setups for the same path do not write every record twice. - Default path is `$TMPDIR/vcspull-debug-<timestamp>-<pid>.log`, matching the npm/pnpm/yarn convention. - Add `--log-file PATH` and `--no-log-file` CLI flags. Thread both through from `vcspull.cli.__init__:sync(...)`. - Extract the existing body of `sync()` into `_sync_impl` so the file handler setup/teardown can wrap the whole sync in a `try/finally` and still cover every exit path (including argparse's SystemExit). - Surface the log path to the terminal on any failure (timeout block via `_emit_rerun_recipe`, and a new "Full debug log: ..." line for plain failures). Clean runs leave the file on disk but never mention it. - Tests in tests/cli/test_sync_log_file.py covering default path shape, attachment at DEBUG, end-to-end record capture, idempotent attachment, and clean teardown. - CHANGES entry under 1.59.x.
why: Before this change, users running `vcspull sync` had no live signal of which repository was currently processing -- the only output was a post-factum success/failure line per repo. When a slow or wedged repo was running, stdout just appeared frozen. That pattern is what led to the "codex is hanging, is it hung?" confusion that motivated this whole branch. Providing a lightweight indicator resolves that without reaching for a heavyweight dep like `rich`. what: - Add src/vcspull/cli/_progress.py with a threaded SyncStatusIndicator that uses only stdlib + ANSI escape codes (pattern borrowed from tmuxp's cli/_progress.py): a 100ms braille-frame spinner in a TTY, a single "· syncing <name>" line on start plus periodic "still syncing" lines in pipes. - atexit hook restores the cursor when the interpreter shuts down mid-sync so kitty/tmux don't leak a hidden cursor after a crash. - `build_indicator` disables the spinner for --json / --ndjson and for --color=never so machine-readable output stays clean. - Extract the inner repo loop of _sync_impl into _run_sync_loop so the indicator can be cleanly started, torn down via finally, and not leak into the summary emission. - Register the new module in tests/test_log.py expected CLI-logger list so get_cli_logger_names still matches. - Add tests/cli/test_sync_progress.py covering: disabled/no-op, start+ heartbeat, throttled heartbeat under interval, TTY spinner frames + cursor-hide/show, idempotent close, context-manager form, and build_indicator's mode-aware gating. - CHANGES entry under 1.59.x.
why: Two regressions in the post-summary output reported by the user: 1. The "Full debug log: ..." line printed twice whenever a batch had both timed-out repos AND plain failures -- _emit_rerun_recipe emitted it, and the separate "surface-on-failure" block after the summary emitted it again. Single source of truth beats two coordinated emitters. 2. The new timeout / rerun-recipe / spinner output used emoji-like glyphs (⏱ stopwatch, · middle dot, … ellipsis, Braille spinner frames). The user asked to avoid emojis while keeping the semantic colours, which already carry the warning signal on their own. what: - Remove the log-file emit from _emit_rerun_recipe. The sole emit site is now after _emit_summary, gated on summary['failed'] > 0 -- timeouts are already counted there so the hint fires exactly once on any failure. - Drop the now-unused log_file_path kwarg from _emit_rerun_recipe. Mirror the callers (exit-on-error branch in the sync loop + post-summary in _sync_impl). - Replace emoji/glyph decorations in the rerun recipe with ASCII: ⏱ -> plain "Timed out:" label (warning colour), -, ->. - Replace the in-loop timeout row prefix ⏱ -> - and → -> ->. - Progress indicator: swap Braille spinner frames (⠋⠙⠹⠸⠼⠴⠦⠧⠇⠏) for ASCII |/-\; drop the · start-line prefix; replace … with ... in the spinner line and the non-TTY heartbeat. - Tests: drop the log-file-path assertion from the rerun-recipe test (the helper no longer takes the kwarg); add a regression guarding against the ⏱ glyph leaking back into the recipe; update the spinner test to assert ASCII frames and to forbid the previous Unicode decorations (Braille frames, ellipsis, middle dot).
why: Three regressions reported against the prior polish commit during a real `vcspull sync --workspace ~/study/rust --all` run: 1. Braille spinner was the right glyph set; `|/-\` was an over-correction of the "avoid emojis" feedback. The user pointed to tmuxp -- which also uses Braille -- as the model. 2. The spinner's \r redraws collided with libvcs's concurrent stdout writes (log lines via the `libvcs` StreamHandler, subprocess output via `progress_cb`). The cursor sat mid-spinner when those writes landed and produced artefacts like `Already on 'main'ec ... 1.1s`. 3. Ctrl-C during the batch was silently downgraded to a per-repo "failed" outcome (the watchdog caught `BaseException`), so the loop moved on to the next repo. Worse, `ThreadPoolExecutor` registers its workers in the atexit join list, so Python's interpreter shutdown blocked on the wedged libvcs subprocess thread -- a second Ctrl-C was needed and a `concurrent/futures/thread.py` traceback printed. what: - Restore `_SPINNER_FRAMES = "⠋⠙⠹⠸⠼⠴⠦⠧⠇⠏"`. - Add `SyncStatusIndicator.write(text)` which, under the same lock the redraw loop holds, emits `\r\033[2K` before the caller's text when a spinner frame is in flight, then flushes. Zero `_last_line_len` so the next tick redraws cleanly. - Wrap each TTY frame in ANSI synchronized-output markers (`\x1b[?2026h` / `\x1b[?2026l`) so terminals flip the spinner line atomically; the bracket is ignored by terminals that don't implement the sequence. - Install a narrow stdout diverter while the spinner is enabled: `_IndicatorStreamProxy` adapts the `libvcs` and `vcspull` StreamHandlers to call `indicator.write(...)`. FileHandlers (the debug log) are left alone so `/tmp/vcspull-debug-*.log` keeps its full trace regardless. libvcs's `progress_callback` is rewired to `indicator.write(output)` for the duration of the loop. - Replace `concurrent.futures.ThreadPoolExecutor` with a raw daemon `threading.Thread` + `threading.Event` in `_sync_repo_with_watchdog`. Daemon threads are forcibly terminated at interpreter shutdown; no `_python_exit` join, no second-Ctrl-C traceback. Narrow the catch to `Exception` so `KeyboardInterrupt` and `SystemExit` propagate; the worker stashes a `BaseException` and the main thread re-raises it. - Top-level `KeyboardInterrupt` handler in `_sync_impl` prints "Interrupted by user.", emits the partial summary, surfaces the debug log path, and raises `SystemExit(130)` (conventional SIGINT exit code). - Tests: re-assert Braille frames and the synchronized-output escape sequences; add `test_write_clears_spinner_line_before_emitting_in_tty` as a regression for artefact 2; add `test_watchdog_propagates_keyboard_interrupt_from_worker` for the narrowed catch.
why: tmuxp's spinner renders each Braille frame through ``self.colors.info(frame)``, which reads brightly against the surrounding dim log stream without tinting the repo name / elapsed suffix. The user called that out as the model and the plain-ANSI version we'd shipped looked visually flat next to libvcs's coloured ``|git| (repo)`` lines. what: - Accept a ``Colors`` instance on ``SyncStatusIndicator`` (defaulting to a NEVER-colour palette so ad-hoc callers and tests don't leak ANSI codes). - Wrap only the Braille cell in ``self._colors.info(frame)`` inside ``_render_tty``; keep the ``syncing <name> ... 1.1s`` suffix in the terminal default so it doesn't compete with the permanent-line colours (✓ Synced green, - Timed out yellow). - Track visible-cell length (not escape-inclusive string length) in ``_last_line_len`` so the pad-clear arithmetic still aligns the next redraw even though the line has grown with ANSI bytes. - Thread the shared ``Colors`` from ``_sync_impl`` into ``build_indicator`` so the spinner honours ``--color=always/auto/never`` exactly like the rest of the CLI. - Test: ``test_tty_spinner_colours_the_frame_cell`` regex-matches an ANSI colour sequence immediately followed by a Braille frame, locking in the tmuxp-parity and guarding against future "colour the whole line" drift.
why: The prior interrupt handler lived inside ``_sync_impl`` wrapped around ``_run_sync_loop``. Ctrl-C landing BEFORE the loop -- notably during ``load_configs`` (libyaml parse on a large ~/.vcspull.yaml), or inside ``find_config_files``, or during the dry-run branch -- propagated all the way out as an unhandled ``KeyboardInterrupt`` and the user saw the full yaml parser stack dumped to the terminal. Observed in the reporter's chained ``vcspull sync; vcspull sync; ...`` invocation. what: - Add an outer ``except KeyboardInterrupt`` in ``sync()`` wrapping the existing ``try: _sync_impl(...) finally: teardown_file_logger(...)``. This handler fires only for interrupts outside the repo loop (the in-loop handler inside ``_sync_impl`` still runs first when applicable and emits the partial summary via its ``SystemExit(130)``, which is not caught here). - Handler writes "Interrupted by user." and the debug log path straight to ``sys.stderr`` -- the formatter may not yet exist, or may have been partially emitted, so a plain message is the safe contract. - Exits with ``SystemExit(130)`` (SIGINT convention) so shells / CI distinguish "user stopped this" from "crashed". - Regression test ``test_sync_handles_keyboard_interrupt_during_config_load`` monkeypatches ``load_configs`` to raise ``KeyboardInterrupt`` and asserts exit code 130 + stderr message.
why: bash/zsh abort ``cmd1; cmd2; cmd3`` sequential lists only when the child was killed by a signal (WIFSIGNALED(SIGINT)). A clean SystemExit(130) leaves the shell no reason to stop, so chained ``vcspull sync ...; vcspull sync ...`` required per-command Ctrl-C -- the reporter hit Ctrl-C four times in a four-invocation chain. Git solves this with ``sigchain_pop(sig); raise(sig)`` (sigchain.h:20-34, builtin/clone.c:416). Translate the pattern to Python so vcspull composes with ``;``-chains like git clone / git fetch. what: - Add ``_exit_on_sigint()``: POSIX does ``SIG_IGN`` (guard the gap) → ``SIG_DFL`` → ``signal.raise_signal(SIGINT)`` so the kernel exits with WIFSIGNALED(SIGINT); Windows falls back to ``SystemExit(130)``. Precondition: main thread (documented; the ValueError fails loud rather than silently defeating the change off-thread). - Add ``_SyncInterruptedAfterSummary(KeyboardInterrupt)`` marker so the outer ``sync()`` catch skips the plain stderr fallback when the inner ``_sync_impl`` already printed a coloured partial summary. - Inline log-file teardown in the outer ``except`` (atexit and ``finally`` do not run after ``SIG_DFL`` terminates at C level) and null the handle so the surrounding ``finally`` is a single-close no-op on the Windows fallback path. - Shield the inner ``if interrupted:`` summary emission against ``OSError`` / ``ValueError`` (e.g. ``BrokenPipeError`` when stdout is piped to ``head``) so the marker exception *always* propagates -- otherwise bash sees exit 1 and the ``;`` chain continues. - Tests: opt-in ``_fake_sigint_escalation`` fixture for in-process ``SystemExit(130)`` assertions (applied to the existing ``test_sync_handles_keyboard_interrupt_during_config_load``); mock-based unit tests for ``_exit_on_sigint`` POSIX ordering and Windows fallback; new subprocess-based ``test_exit_on_sigint_produces_wifsignaled_sigint`` asserting ``proc.returncode == -signal.SIGINT`` so the real WIFSIGNALED path is pinned down (POSIX only). - CHANGES entry under 1.59.x. Matches the design pass Claude/Gemini/GPT converged on via two rounds of weave:ask (see ``$XDG_STATE_HOME/ai-aip/repos/vcspull--*/sessions/ask/latest/``).
…ndler why: After shipping the npm-style debug log file, users saw libvcs's chatty ``|git| (repo) head_sha: ...`` / ``show_ref_output: ...`` / ``git_remote_name: ...`` lines on the terminal even at the default ``--log-level=INFO``. Root cause: ``setup_file_logger`` raises the ``vcspull`` and ``libvcs`` loggers to DEBUG so its FileHandler captures the full trace, but ``setup_logger`` left the terminal StreamHandlers with no per-handler level filter -- so a logger-level bump opened the floodgate to the terminal too. Reporter confirmed the same noise across 4 chained ``vcspull sync`` invocations today. what: - Add ``verbosity: int = 0`` kwarg to ``setup_logger`` (sync subcommand passes ``args.verbosity``; everything else gets 0). Drives the libvcs StreamHandler level via ``_libvcs_stream_level``: 0 → WARNING (chatty INFO + DEBUG suppressed); 1 (``-v``) → INFO; 2+ (``-vv``) → DEBUG. - Pin per-handler levels on every vcspull and libvcs StreamHandler. The per-handler filter is what protects the terminal from ``setup_file_logger`` raising the logger level to DEBUG; the file handler still receives DEBUG records because libvcs's *logger* level is DEBUG, but the terminal handler drops them via its own filter. - ``cli/__init__.py``: thread ``args.verbosity`` (defaults to 0 when the subcommand doesn't define it) into ``setup_logger``. - Tests: 5 new in ``tests/test_log.py``: default WARNING; ``-v`` INFO; ``-vv`` DEBUG; vcspull StreamHandler pinned to resolved level; ``setup_file_logger`` does NOT change StreamHandler levels (the floodgate regression guard).
why: Reporter wanted the *interesting* git stderr (``From <url>``,
``* [new branch]``, ``Already on 'main'``, ``Successfully rebased``)
visible while a repo is syncing, but NOT lingering in scrollback once
the repo completes. Today those lines stack as plain rows that the
``✓ Synced`` summary then has to compete with. tmuxp's
``before_script`` panel gives exactly the desired shape: a fixed-height
region above the spinner that rolls under as fresh lines arrive and
collapses on completion. Translate that pattern into vcspull's
``SyncStatusIndicator``.
what:
- Extend ``SyncStatusIndicator`` (``cli/_progress.py``) with an
``output_lines: int = 3`` constructor arg, a
``collections.deque(maxlen=output_lines)`` panel buffer, a
``_panel_visible_lines`` counter for redraw geometry, and a public
``add_output_line(text)`` API. ``0`` hides the panel and falls back
to plain ``write()``; ``-1`` is unbounded.
- Extend ``_render_tty`` to draw panel rows above the spinner inside
one synchronized-output bracket: walk the cursor up by
``_panel_visible_lines`` rows from the previous frame, erase + emit
each panel line, then write the spinner cell -- the whole frame is
one atomic write so concurrent ``add_output_line`` calls can't tear it.
- Add ``_clear_block`` for the spinner+panel teardown path used by
``stop_repo`` and ``close``: walks back up the rendered rows and
erases each, leaving the cursor on a clean row for the next
``formatter.emit_text("✓ Synced ...")``. ``write()`` (the log
diverter path) gains the same up-walk so a logger record clears the
panel along with the spinner before printing.
- ``start_repo`` clears the panel buffer (fresh trail per repo);
``stop_repo`` clears + collapses; ``close`` falls back to
``_clear_block`` so a Ctrl-C mid-repo doesn't leave the panel
sticky.
- ``add_output_line`` gates the panel path on ``self._tty``: in
non-TTY mode the spinner thread never starts to render the deque,
so the method falls back to plain ``write()``. Without this gate
the bytes would silently disappear under capsys / pipes / CI.
- ``cli/sync.py``: new ``--panel-lines N`` flag with metavar/help;
new ``_resolve_panel_lines`` helper (CLI > env > default), mirroring
``_resolve_repo_timeout``; thread the resolved value through
``sync()`` -> ``_sync_impl()`` -> ``build_indicator()``. Switch
``_indicator_progress`` from ``indicator.write(output)`` to
``indicator.add_output_line(output)`` so libvcs's chunked progress
callback feeds the deque instead of the raw stream.
- ``cli/__init__.py``: forward ``args.panel_lines`` (defaulting to
``None``) into ``sync(...)``.
- 11 new tests across ``tests/cli/test_sync_progress.py`` (deque
bounds, ``output_lines=0`` fallback, non-TTY fallback, atomic
panel render with synchronized-output bracket, ``stop_repo``
collapse, fresh-trail-per-repo) and ``tests/cli/test_sync_watchdog.py``
(5 ``_resolve_panel_lines`` precedence tests).
- ``CHANGES`` entry under 1.59.x.
…nner
why: Reporter showed two adjacent lines
✓ Synced fish-shell → ~/study/rust/fish-shell
⠸ syncing flume ... 0.9s
and called out two inconsistencies: the in-flight repo name (``flume``)
is plain text while the permanent line's name (``fish-shell``) is
``colors.info`` cyan, and the verb is lowercase ``syncing`` while the
permanent line uses uppercase ``Synced`` / ``Timed out``. Both lines are
status badges and should read as a family.
what:
- ``_render_tty``: wrap ``name`` in ``self._colors.info(name)`` so the
in-flight repo name matches the cyan ``colors.info(repo_name)`` used
on the permanent ``✓ Synced <name>`` line in ``cli/sync.py``.
- Capitalise the verb: ``Syncing`` (not ``syncing``) in the spinner
line, mirroring ``Synced``/``Timed out``.
- Non-TTY ``start_repo`` line gets the same capitalisation
(``Syncing <name>``). The ``... still syncing <name>`` heartbeat
stays lowercase since it's a sentence continuation after the
ellipsis, not a status badge.
- Tests updated accordingly: assert ``Syncing codex`` (capital) on
the start line; case-insensitive count for the heartbeat-throttle
test (``Syncing`` start + ``... still syncing`` heartbeat).
``test_tty_spinner_colours_the_frame_cell`` is renamed to
``test_tty_spinner_colours_the_frame_cell_and_name`` and now also
asserts a colour escape immediately precedes the repo name.
…r race why: Reporter saw a flicker (and occasionally a doubled ``✓ Synced clap`` line) when a repo finished. Two interacting causes: 1. ``_render_tty`` snapshotted state under ``_lock`` then released the lock before the actual ``stream.write``. A concurrent ``stop_repo`` could clear the screen between the snapshot and the write, letting the spinner thread paint a stale frame on top of the new state. Rich avoids this by holding ``self._lock`` across every refresh tick *and* re-checking ``done.is_set()`` inside the lock (``rich/live.py:_RefreshThread.run``); the same pattern applies here. 2. ``stop_repo`` cleared the spinner, then the caller printed the permanent line in a *separate* stream call. Even on a fast terminal the spinner row blinks to blank between the two writes, which reads as flicker. The atomic transition pattern (single ``\x1b[?2026h ... \x1b[?2026l`` block that erases the panel and replaces the spinner row with the permanent line) eliminates the flash. what: - ``_render_tty``: hold ``_lock`` from snapshot through the stream write. Re-check ``self._active_repo`` under the lock at the top -- if it changed (``stop_repo`` ran), skip the tick entirely so a stale frame never lands on screen. - ``stop_repo`` gains an optional ``final_line: str | None`` argument and now returns ``bool``. When the indicator is actively rendering on a TTY *and* a ``final_line`` is given, build one atomic ANSI block under the lock that walks up the panel rows, erases each going down, and replaces the spinner row with ``final_line + \n``. Returns ``True`` to tell the caller "I owned the print; skip your ``formatter.emit_text``" -- which is what kills the doubled ``✓ Synced clap`` artefact. - Headless / disabled-indicator paths still return ``False`` so the caller's ``formatter.emit_text`` keeps running unchanged. - ``cli/sync.py``: replace the ``with indicator.repo(...)`` context manager (which fired ``stop_repo()`` with no args before the outcome was known) with manual ``start_repo`` + post-outcome ``stop_repo(final_line=permanent if is_human else None)``. Skip the separate ``formatter.emit_text(permanent)`` when ``stop_repo`` reported it owned the print. Applies uniformly to the ``synced``, ``failed``, and ``timed_out`` branches. - ``cli/sync.py`` exception path: any ``BaseException`` (incl. the KeyboardInterrupt the SIGINT handler relies on) tears the indicator down with ``stop_repo()`` (no replacement line) and re-raises so the surrounding ``except KeyboardInterrupt`` in ``_sync_impl`` still owns the partial-summary print. - 4 new tests in ``tests/cli/test_sync_progress.py``: ``stop_repo(final_line=...)`` writes inside one synchronized-output bracket and returns ``True``; the no-arg path returns ``False``; non-TTY stays caller-owned even with a ``final_line``; ``_render_tty`` skips writes when the active repo was cleared (the lock-protected stale-tick guard).
why: Reporter saw blank-line gaps between successive ``✓ Synced ...`` lines when chaining ``vcspull sync --workspace ~/study/otel/ --all; vcspull sync --workspace ~/study/rust/ --all``. The panel feature added in 79fd4c3 promised "collapses when each repo finishes -- leaving only the permanent ``✓ Synced <name>`` line", but the implementation in d650c97 only erased panel-row content with ``\x1b[2K\n``. ``\x1b[2K`` clears the row's visible characters; the trailing ``\n`` advances the cursor down. After ``panel_visible`` iterations the cursor is back at the spinner row, but the panel rows above are blank-but-present in scrollback. Replacing the spinner with the permanent line then leaves orphan blank rows above it; the count varies per repo (whatever the previous repo's panel height was at completion time), exactly matching the reporter's "variable gaps" symptom. Fix: use DL (``\x1b[Pn M``, Delete Line) instead. After walking up to the first panel row, ``\x1b[{panel_visible}M`` deletes those rows from the buffer; subsequent rows scroll up to fill, so the spinner row's content lands at the cursor row. The follow-up ``erase + final_line + \n`` morphs that row into the permanent line in place. No orphan blanks. DL is a VT100 sequence supported by every terminal that implements the synchronized-output bracket already in use here. what: - ``stop_repo`` atomic transition (``_progress.py``): replace the ``\x1b[2K\n`` x panel_visible loop with a single ``\x1b[{panel_visible}M`` after the upward walk. Restructure the conditional so DL only fires when ``panel_visible > 0``. - ``_clear_block`` (``_progress.py``): same defect in the ``close()`` teardown path -- walked up + erased per row, leaving blanks. Replace with a single DL of ``panel + 1`` rows after walking up to the first panel row. The close path is preventive (normal operation hits ``stop_repo`` first and zeroes ``_panel_visible_lines``) but the parallel pattern was wrong for the same reason and would surface under a KeyboardInterrupt close-before-stop_repo. - 4 new regression tests in ``tests/cli/test_sync_progress.py``: DL escape lands when ``_panel_visible_lines > 0`` (with negative assertion that the regressed ``\x1b[2K\n`` per-row pattern is NOT present); DL omitted when zero panel rows; ``stop_repo()`` without ``final_line`` still DL's the panel; ``_clear_block`` deletes panel + spinner via one DL.
why: libvcs 0.40.0 was tagged and published to PyPI (2026-04-25T18:59Z, sha256 5f3340b6...). The earlier ``>=0.40.0a0`` pin was a placeholder while the prerelease was being prepared; with the stable release available, drop the ``a0`` suffix so the lower bound matches the actual minimum-supported version. what: - ``pyproject.toml``: ``libvcs>=0.40.0a0,<0.41.0`` -> ``libvcs>=0.40.0,<0.41.0``. - ``uv.lock``: relock to libvcs 0.40.0 from PyPI (sdist + wheel hashes match the published release). The lock carries ``[options.exclude-newer-package] libvcs = "2026-04-26..."`` to override the global 3-day cooldown configured in ``~/.config/uv/uv.toml`` -- libvcs 0.40.0 was published hours before this lock and would otherwise have been filtered out.
why: The ``### Dependencies`` entry in the unreleased section was written when the pin was ``>=0.40.0a0,<0.41.0``. With libvcs 0.40.0 released and the pin tightened to ``>=0.40.0,<0.41.0``, the changelog text now describes a stale bound; align it with reality so users reading the next release notes see the actual lower bound. what: - ``CHANGES``: ``Bump libvcs to >=0.40.0a0,<0.41.0...`` -> ``Bump libvcs to >=0.40.0,<0.41.0...``.
why: ``log.debug(NO_REPOS_FOR_TERM_MSG.format(name=search_term))``
runs ``str.format`` even when DEBUG is filtered out -- wasted work on
the hot ``vcspull sync`` path that resolves search terms for every
repo pattern. CLAUDE.md mandates lazy ``%``-style log args precisely
because eager formatting defeats the level filter.
what:
- ``cli/sync.py``: rewrite the debug call as
``log.debug('No repo found in config(s) for "%s"', search_term)``.
- The user-facing ``formatter.emit_text`` f-string a few lines below
is *not* a log call; leave it alone.
why: ``_install_indicator_log_diverter`` rebound ``handler.stream`` with a plain attribute assignment while the spinner thread and the worker thread were both emitting libvcs records through the same handler. ``Handler.handle`` (stdlib ``logging/__init__.py:1011``) takes ``self.lock`` around every ``emit()`` precisely so that modifications to the handler's mutable state can serialise with in-flight writes; an unlocked swap technically violates that contract. On CPython the GIL hides the race in practice (single STORE_ATTR is atomic, no torn pointer), but the contract is still wrong and would matter on a non-GIL'd build. what: - Acquire ``handler.lock`` (via ``handler.acquire()/release()``) around the install swap and the restore swap. Same lock the emit path holds. - Don't go through ``StreamHandler.setStream``: its first action is to ``flush()`` the existing stream, which raises ``ValueError`` when the current stream is a finalized pytest ``capsys``/``CaptureIO`` (a prior test ran while ``setup_logger`` captured ``sys.stdout``, the test ended, the CaptureIO closed, the handler still holds the dead reference, then this test calls ``setStream``). Skipping the flush is safe -- we aren't closing the stream, only redirecting future writes -- and matches the handler-internal locking the contract actually requires. - Inline the previous-stream capture (no longer needs the ``setStream`` return-value plumbing).
why: ``_NO_PROMPT_ENV`` was a module-global cache that snapshotted ``os.environ`` on first call. Tests using ``monkeypatch.setenv`` after the cache populated never saw the change in ``_get_no_prompt_env`` -- the subprocess inherited a stale env. The cache also has no performance justification: ``_maybe_fetch`` already runs ``subprocess.run`` (orders of magnitude more expensive than a dict copy). what: - ``cli/sync.py``: remove the ``_NO_PROMPT_ENV`` global and the ``global``-guard branch; build the env dict fresh on each ``_get_no_prompt_env`` call. - Updated docstring to call out the rationale (test isolation > 1us saved per fetch).
why: The help text said "increase plan verbosity (-vv for maximum detail)", which was accurate for the original dry-run plan flag but became misleading when ``setup_logger`` started using ``args.verbosity`` to drive the libvcs ``StreamHandler`` level (WARNING -> INFO -> DEBUG). A user reading ``--help`` to figure out why their terminal got chatty on ``-v`` had no breadcrumb pointing at the verbosity ladder. what: - ``cli/sync.py``: rewrite the ``--verbose`` help string to spell out the actual ladder (default WARNING; -v INFO; -vv DEBUG) and to keep the dry-run plan-detail behaviour mentioned for callers who relied on the old wording.
why: The rerun-recipe banners (``-> Rerun...``, ``-> Or with verbose...``,
``-> Probe each...``), the per-repo timed-out path arrow, and the
"Full debug log" hint all used ASCII ``->``, while the rest of the
file (``✓ Synced clap → ~/study/...``, ``Timed out fish → ~/...``)
uses the Unicode ``→``. Two ornaments coexisted in the same emit
stream, which read as visual drift.
what:
- ``cli/sync.py``: replace ``colors.info('->')`` and
``colors.muted('->')`` with the ``→`` form across all 7 surfaces
(4 in ``_emit_rerun_recipe``, 2 in the ``_sync_impl`` "Full debug
log" hints, 1 in the ``timed_out`` permanent line of
``_run_sync_loop``).
- The non-arrow ``colors.warning('-')`` bullet stays as-is -- it's a
list bullet, not a path-or-flow ornament.
why: ``add_output_line`` took ``self._lock`` and called ``text.splitlines()`` for every chunk libvcs delivered through the progress callback, including whitespace-only fragments (a stray ``\n`` between progress frames is common). The inner ``if stripped`` loop body would skip every fragment and the deque wouldn't change, so the lock acquire was pure overhead on a hot path that fires hundreds of times per repo. what: - ``cli/_progress.py``: after the panel-disabled fallback (which *should* still surface whitespace via ``self.write``), early-return when ``text.strip()`` is empty. The fallback path comes first so panel=0 / non-TTY / JSON callers still see the bytes.
why: CLAUDE.md requires working doctests on every function and method, both as documentation and as runnable examples. The sync UX overhaul introduced public-ish helpers without doctest coverage, leaving the contract for ``--panel-lines`` / ``--timeout`` resolution and the ``SyncStatusIndicator`` lifecycle described only in prose. what: - ``cli/sync.py``: add doctests on ``_resolve_repo_timeout`` (default fallback when CLI value is ``None``/``0``/negative; CLI override passes through) and ``_resolve_panel_lines`` (default 3; ``0`` hides; ``-1`` unbounded; explicit values pass through). Each doctest pops the relevant ``VCSPULL_*`` env var first so a stray export in the caller's shell doesn't flake the example. - ``log.py``: add a doctest on ``default_debug_log_path`` asserting the ``vcspull-debug-`` prefix, ``.log`` suffix, and that the parent is the system tempdir. - ``cli/_progress.py``: doctests on ``SyncStatusIndicator.start_repo``, ``stop_repo``, and ``add_output_line``. Each builds a non-TTY (or TTY-with-StringIO) indicator so the doctest prints nothing visible yet still exercises the state machine. ``add_output_line`` got an ``r"""`` prefix because the example payloads contain ``\n`` literals (D301). - Test count goes from 1018 to 1024 (+6 doctests).
why: ``--panel-lines`` uses ``-1`` as the unbounded sentinel and ``0`` as the hide sentinel; any other negative was meaningless but ``_resolve_panel_lines`` silently coerced -2 / -7 / -42 to "unbounded" via the permissive ``elif output_lines < 0:`` branch in ``SyncStatusIndicator.__init__``. A user typing ``--panel-lines -3`` expecting "3 lines" would get an unbounded panel with no warning. typer/click validate at parse time precisely so user typos surface where the user can fix them. what: - ``cli/sync.py``: add ``_panel_lines_arg(value: str) -> int``, an argparse ``type=`` callable that accepts -1, 0, or any positive integer; raises ``argparse.ArgumentTypeError`` otherwise. Includes a doctest covering happy-path and both rejection cases (non-int string and out-of-range). - ``--panel-lines`` argparse entry uses ``type=_panel_lines_arg`` in place of plain ``type=int``. - The ``VCSPULL_PROGRESS_LINES`` env-var path keeps its existing log-warning + fallback behaviour: env vars are trusted less strictly than CLI args (typos there are usually copy/paste, not intent), and a hard error on a stray export would lock users out of running ``vcspull sync`` until they unset it.
why: Zero or negative ``--timeout`` is meaningless: the watchdog would either return immediately (deadline already passed) or never (logically invalid). The earlier permissive ``_resolve_repo_timeout`` silently fell back to the default on these inputs, masking typos like ``--timeout -10`` (intended ``10``) where the user assumed the flag took effect. typer/click validate at parse time precisely so user typos surface where the user can fix them. what: - ``cli/sync.py``: add ``_positive_int_arg(value: str) -> int``, an argparse ``type=`` callable that requires a positive integer; raises ``argparse.ArgumentTypeError`` on zero, negative, or non-numeric. Includes a doctest covering the happy path and three rejection branches. - ``--timeout`` argparse entry uses ``type=_positive_int_arg`` in place of plain ``type=int``. - ``_resolve_repo_timeout`` keeps its programmatic-caller fallback for ``cli_timeout <= 0``; doctest now notes the CLI surface is guarded by ``_positive_int_arg`` at parse time. - ``VCSPULL_SYNC_TIMEOUT_SECONDS`` env path keeps log-warning + fallback (env vars are trusted less strictly than CLI args; a hard error on a stale export would lock users out).
why: ``setup_file_logger`` raises both the ``vcspull`` and ``libvcs`` *logger* levels to DEBUG so its FileHandler captures the full trace for post-mortem. That bump would ordinarily open the same DEBUG records to every other handler attached to those loggers (notably the terminal ``StreamHandler``), but ``setup_logger`` pins per-handler levels on every StreamHandler, which absorbs the bump without leaking. The whole "libvcs DEBUG floodgate" regression that ``89649a86`` fixed was this invariant breaking. The code that relies on it had no inline note, so a future "simplification" that drops the per-handler pinning would silently re-open the floodgate and the only signal would be the regression test going red. what: - ``log.py``: add an inline comment at the ``logger.setLevel(level)`` bump in ``setup_file_logger`` explaining the precondition (the per-handler levels in ``setup_logger``) and naming the regression test (``test_setup_file_logger_does_not_open_stream_floodgate``) that catches it.
why: The marker subclass exists so the outer ``sync()`` ``except KeyboardInterrupt`` catch can distinguish "inner layer already printed a coloured partial summary, don't repeat it" from "Ctrl-C hit before the formatter was even built; print the plain-stderr fallback". The contract works because the only ``except KeyboardInterrupt`` between the inner ``_sync_impl`` raise and process exit is :func:`sync`'s, which subclass-checks. A future broader catch wrapping ``cli()`` would need to either re-raise the subclass unchanged or perform the same isinstance check -- otherwise the marker is swallowed, the inner partial-summary prints, AND the outer fallback prints, restoring the double-print artefact the marker was introduced to suppress. The raise site and the catch site each had no inline note. what: - ``cli/sync.py``: extend ``_SyncInterruptedAfterSummary``'s docstring with an explicit "Invariant" section spelling out the "only catcher is sync()" precondition. - Add a comment near the inner-layer ``raise _SyncInterruptedAfterSummary from None`` in ``_sync_impl`` pointing back at the docstring so the raise site is also self-documenting.
why: Reporter saw libvcs's ``|git| (rye) Failed to determine current branch`` warning leak in alongside vcspull's own ``✗ Failed syncing rye: Command failed with code 128: git symbolic-ref HEAD --short`` line -- two messages saying the same thing, breaking the ``✓ Synced X / ✗ Failed X / - Timed out X`` rhythm of the rest of the output. libvcs's WARNING/ERROR records emit through the StreamHandler at default verbosity (WARNING), and the indicator's stream-diverter only redirected the bytes -- it didn't filter them out -- so the libvcs prologue still printed before the vcspull summary. what: - ``cli/sync.py``: ``_install_indicator_log_diverter`` now also raises the libvcs StreamHandler level to ``logging.CRITICAL + 1`` for the duration of the sync, but only when the level is at WARNING (default verbosity). 51 sits one above ``CRITICAL`` so no standard log record reaches the handler. The ``FileHandler`` is left untouched and still captures everything at DEBUG, so a post-mortem in the debug log still shows the libvcs entry. Users at ``-v`` (libvcs INFO) or ``-vv`` (libvcs DEBUG) opted into seeing libvcs activity; the diverter leaves their level alone. - The restore closure now also walks the saved levels and resets them, mirroring the stream restore. - ``tests/test_log.py``: two regression tests next to ``test_setup_file_logger_does_not_open_stream_floodgate`` (same shape: handler-level invariant under condition X). Asserts the bump from WARNING to ``CRITICAL + 1`` and the restore at default verbosity, plus the no-op when the user is at ``-v``. - vcspull's StreamHandler is intentionally untouched: its WARNING records (e.g. ``Could not open debug log at ...``) all fire *before* the indicator is installed, but if any future call site emits a vcspull warning during the loop the user should still see it.
why: ``default_debug_log_path`` was dropping debug logs as bare files in the system tempdir (``/tmp/vcspull-debug-<ts>-<pid>.log``). The ``vcspull-`` filename prefix gets lost in a busy ``/tmp/`` next to ``vscode-IPC-...``, ``systemd-private-...``, ``kotlin-idea-...``, etc., and there is no single dir to ``rm -rf`` to clean up old vcspull logs. Reporter wanted the path grouped under a self-describing directory: ``/tmp/vcspull/`` for production runs, ``/tmp/vcspull-test/`` for tests, so a glance at ``/tmp/`` says what's whose. what: - ``log.py``: ``default_debug_log_path`` now returns ``<tempdir>/vcspull/debug-<ts>-<pid>.log`` (the ``vcspull-`` prefix on the file is redundant once the directory carries the label). - Auto-detect pytest via ``PYTEST_CURRENT_TEST`` and route to ``<tempdir>/vcspull-test/`` instead. This is a safety net for tests that incidentally call ``default_debug_log_path`` without first redirecting ``TMPDIR`` -- they'll never pollute the production ``vcspull/`` dir even when the test author forgets to monkeypatch. ``PYTEST_CURRENT_TEST`` is set by pytest for the entire test session (collection + execution + doctest), so the doctest assertion also picks the test dir. - ``cli/sync.py``: ``--log-file`` help text mentions both production and test default paths. - ``tests/cli/test_sync_log_file.py``: update the two tests asserting on the file name prefix to match the new ``debug-<ts>-<pid>.log`` pattern. ``test_default_debug_log_path_lives_in_tempdir`` also asserts the parent dir name is ``vcspull-test`` (since that's what pytest sees), pinning the auto-detect. - The doctest on ``default_debug_log_path`` accepts either ``vcspull`` or ``vcspull-test`` so the example reads correctly whether it runs under pytest or in a Python repl. - Behaviour is wholly additive: no callers change. Existing debug logs from previous vcspull runs at ``/tmp/vcspull-debug-...log`` remain on disk; users can clean them by hand once.
why: Two user-visible behaviours landed without CHANGES updates: the libvcs WARNING/ERROR silencing during human-mode sync (so the ``|git| (rye) Failed to determine ...`` line no longer competes with ``✗ Failed syncing rye: ...``), and the debug-log directory regrouping under ``$TMPDIR/vcspull/`` (and ``$TMPDIR/vcspull-test/`` under pytest). Releasing without these would leave the changelog describing behaviour that no longer matches reality. what: - ``CHANGES``: extend the existing "libvcs is quiet by default" unreleased section with a paragraph spelling out the sync-time StreamHandler silencing -- name the rye repro because that is the symptom users will recognise -- and stress that the debug log file still captures the entry for post-mortem. - ``CHANGES``: rewrite the debug-log section to advertise the new ``$TMPDIR/vcspull/debug-<ts>-<pid>.log`` path (production) and the ``$TMPDIR/vcspull-test/...`` test-mode safety net, replacing the stale ``$TMPDIR/vcspull-debug-...`` description.
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
Add this suggestion to a batch that can be applied as a single commit.This suggestion is invalid because no changes were made to the code.Suggestions cannot be applied while the pull request is closed.Suggestions cannot be applied while viewing a subset of changes.Only one suggestion per line can be applied in a batch.Add this suggestion to a batch that can be applied as a single commit.Applying suggestions on deleted lines is not supported.You must change the existing code in this line in order to create a valid suggestion.Outdated suggestions cannot be applied.This suggestion has been applied or marked resolved.Suggestions cannot be applied from pending reviews.Suggestions cannot be applied on multi-line comments.Suggestions cannot be applied while the pull request is queued to merge.Suggestion cannot be applied right now. Please check back later.
Summary
GIT_TERMINAL_PROMPT=0atsync()entry so git fails fast instead of blocking on stdin when credentials are needed_maybe_fetch()subprocess call to prevent indefinite blocking on slow/large repo fetchessubprocess.TimeoutExpiredwith a descriptive error messageFixes #543
Context
vcspull sync --workspace <path> --allhangs indefinitely when any repo in the batch triggers a git credential prompt (SSH passphrase, HTTPS password) or when fetching a very large repo (e.g. 15GBrust-lang/rust). The sync loop is sequential, so one stuckgit fetchblocks all remaining repos with no timeout or recovery.Test plan
TimeoutExpiredhandling in_maybe_fetchGIT_TERMINAL_PROMPT=0is passed in subprocess envtimeoutkwarg is passed to subprocesssync()setsGIT_TERMINAL_PROMPT=0viaos.environ.setdefault