Skip to content

fix: re-fetch PR details in enricher to avoid stale requested_reviewers#74

Merged
dkargatzis merged 2 commits intowarestack:mainfrom
codesensei-tushar:fix/code-owner-reviewers-stale-check
Apr 16, 2026
Merged

fix: re-fetch PR details in enricher to avoid stale requested_reviewers#74
dkargatzis merged 2 commits intowarestack:mainfrom
codesensei-tushar:fix/code-owner-reviewers-stale-check

Conversation

@codesensei-tushar
Copy link
Copy Markdown
Contributor

@codesensei-tushar codesensei-tushar commented Apr 14, 2026

Summary

Fixes a race where PathHasCodeOwnerCondition / RequireCodeOwnerReviewersCondition false-positive when a synchronize webhook is processed just before a review_requested webhook. The first webhook’s pull_request.requested_reviewers is a point-in-time snapshot and doesn’t yet reflect the new request, so the engine flags “code owners missing” even though the owner has already been requested/approved.

Closes #72

Approach

Option 1 from the issue — refresh at the source, not at each condition.

PullRequestEnricher.enrich_event_data now calls github_client.get_pull_request(repo, pr_number, installation_id) before building event_data and uses the fresh response as pull_request_details. Every downstream condition automatically sees current requested_reviewers, requested_teams, assignees, labels, etc. — not just CODEOWNERS.

If the refresh API call fails or returns None, the enricher falls back to the webhook payload so we never degrade below the existing behavior.

Why Option 1 over Option 2

  • Single chokepoint — every condition reading PR fields benefits, not just CODEOWNERS
  • Fixes the root cause (stale payload) instead of patching one downstream symptom
  • No special-case logic in condition files

Changes

  • src/event_processors/pull_request/enricher.py — refresh pr_data via get_pull_request with graceful fallback
  • CHANGELOG.md — entry under [Unreleased] ### Fixed

Summary by CodeRabbit

Bug Fixes

  • Fixed a race condition where pull request conditions could be evaluated using stale webhook data. Pull request details are now automatically refreshed from GitHub before condition evaluation, ensuring the latest state is used during policy assessment, with safe fallback if the refresh fails.

@trunk-io
Copy link
Copy Markdown

trunk-io bot commented Apr 14, 2026

Merging to main in this repository is managed by Trunk.

  • To merge this pull request, check the box to the left or comment /trunk merge below.

After your PR is submitted to the merge queue, this comment will be automatically updated with its status. If the PR fails, failure details will also be posted here

@watchflow
Copy link
Copy Markdown

watchflow bot commented Apr 14, 2026

🛡️ Watchflow Governance Checks

Status: ❌ 1 Violations Found

🟠 High Severity (1)

When a PR modifies paths that have owners defined in CODEOWNERS, the corresponding code owners must be added as reviewers

Code owners for modified paths must be added as reviewers: dkargatzis
How to fix: Add the listed code owners as requested reviewers on the PR.


💡 Reply with @watchflow ack [reason] to override these rules, or @watchflow help for commands.

Thanks for using Watchflow! It's completely free for OSS and private repositories. You can also self-host it easily.

@coderabbitai
Copy link
Copy Markdown

coderabbitai bot commented Apr 14, 2026

Warning

Rate limit exceeded

@codesensei-tushar has exceeded the limit for the number of commits that can be reviewed per hour. Please wait 56 minutes and 57 seconds before requesting another review.

Your organization is not enrolled in usage-based pricing. Contact your admin to enable usage-based pricing to continue reviews beyond the rate limit, or try again in 56 minutes and 57 seconds.

⌛ How to resolve this issue?

After the wait time has elapsed, a review can be triggered using the @coderabbitai review command as a PR comment. Alternatively, push new commits to this PR.

We recommend that you space out your commits to avoid hitting the rate limit.

🚦 How do rate limits work?

CodeRabbit enforces hourly rate limits for each developer per organization.

Our paid plans have higher rate limits than the trial, open-source and free plans. In all cases, we re-allow further reviews after a brief timeout.

Please see our FAQ for further information.

ℹ️ Review info
⚙️ Run configuration

Configuration used: Organization UI

Review profile: CHILL

Plan: Pro

Run ID: da73d1f0-4018-44a7-94d3-1ba81bf04ebc

📥 Commits

Reviewing files that changed from the base of the PR and between 663c203 and 3522656.

📒 Files selected for processing (1)
  • tests/unit/event_processors/pull_request/test_enricher.py
📝 Walkthrough

Walkthrough

The PullRequestEnricher now re-fetches the current pull request state from the GitHub API before constructing event data, overwriting webhook payload fields with fresh information. If the API call fails, it falls back to the original webhook data with a warning logged, addressing race conditions between webhook events.

Changes

Cohort / File(s) Summary
Documentation
CHANGELOG.md
Documents the new re-fetch behavior of PullRequestEnricher, noting that it overwrites point-in-time webhook fields (particularly requested_reviewers) and explains the fallback logic on API failure.
PR Enrichment Logic
src/event_processors/pull_request/enricher.py
Adds conditional API call to refresh pr_data via github_client.get_pull_request() when pr_number and repo_full_name are available, with exception handling that logs a warning and continues using existing data on failure.

Estimated code review effort

🎯 2 (Simple) | ⏱️ ~10 minutes

Possibly related PRs

Suggested reviewers

  • dkargatzis

Poem

🐰 The webhooks once arrived with tales out of date,
But now we hop swift to the API gate,
Fresh PR details erase the delay,
No stale requests blocking our way!

🚥 Pre-merge checks | ✅ 5
✅ Passed checks (5 passed)
Check name Status Explanation
Description Check ✅ Passed Check skipped - CodeRabbit’s high-level summary is enabled.
Title check ✅ Passed The title clearly summarizes the main change: re-fetching PR details in the enricher to fix stale requested_reviewers.
Linked Issues check ✅ Passed The PR fully implements Option 1 from issue #72: refreshing PR details via API call in the enricher with fallback to webhook payload, directly addressing race condition between synchronize and review_requested webhooks.
Out of Scope Changes check ✅ Passed All changes are in scope: enricher logic refresh and CHANGELOG documentation directly address the stale PR data race condition in issue #72.
Docstring Coverage ✅ Passed Docstring coverage is 100.00% which is sufficient. The required threshold is 80.00%.

✏️ Tip: You can configure your own custom pre-merge checks in the settings.

✨ Finishing Touches
🧪 Generate unit tests (beta)
  • Create PR with unit tests

Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out.

❤️ Share

Comment @coderabbitai help to get the list of available commands and usage tips.

Copy link
Copy Markdown

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🧹 Nitpick comments (1)
src/event_processors/pull_request/enricher.py (1)

65-66: Use structured warning logs on refresh fallback path.

The fallback is correct, but this boundary log should include structured fields (operation, subject_ids, decision, latency_ms) for observability and policy consistency.

♻️ Proposed refactor
+from time import perf_counter
...
-        if pr_number and repo_full_name:
+        if pr_number and repo_full_name:
+            started_at = perf_counter()
             try:
                 fresh_pr = await self.github_client.get_pull_request(repo_full_name, pr_number, installation_id)
                 if fresh_pr:
                     pr_data = fresh_pr
             except Exception as e:
-                logger.warning(f"Could not refresh PR #{pr_number} details: {e}")
+                latency_ms = int((perf_counter() - started_at) * 1000)
+                logger.warning(
+                    "pull_request_refresh_failed",
+                    extra={
+                        "operation": "refresh_pull_request",
+                        "subject_ids": {
+                            "repo_full_name": repo_full_name,
+                            "pull_request_number": pr_number,
+                            "installation_id": installation_id,
+                        },
+                        "decision": "fallback_to_webhook_payload",
+                        "latency_ms": latency_ms,
+                        "error": str(e),
+                    },
+                )

As per coding guidelines "Use structured logging at boundaries with fields: operation, subject_ids, decision, latency_ms".

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@src/event_processors/pull_request/enricher.py` around lines 65 - 66, The log
in the exception handler should be converted to a structured warning containing
the fields required by policy; replace the plain f-string in the except block
with a structured logger.warning call that includes operation (e.g.
"refresh_pr"), subject_ids (e.g. [pr_number]), decision (e.g.
"refresh_fallback"), and latency_ms (compute from an existing start timestamp or
calculate now - start_time in milliseconds), and include the exception under a
field like "error" or "exception" instead of embedding it in the message; update
the except block in enricher.py (the handler around logger.warning for PR
refresh) to emit those fields.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Nitpick comments:
In `@src/event_processors/pull_request/enricher.py`:
- Around line 65-66: The log in the exception handler should be converted to a
structured warning containing the fields required by policy; replace the plain
f-string in the except block with a structured logger.warning call that includes
operation (e.g. "refresh_pr"), subject_ids (e.g. [pr_number]), decision (e.g.
"refresh_fallback"), and latency_ms (compute from an existing start timestamp or
calculate now - start_time in milliseconds), and include the exception under a
field like "error" or "exception" instead of embedding it in the message; update
the except block in enricher.py (the handler around logger.warning for PR
refresh) to emit those fields.

ℹ️ Review info
⚙️ Run configuration

Configuration used: Organization UI

Review profile: CHILL

Plan: Pro

Run ID: d984d866-52ef-4df0-95f2-eecfc21894ea

📥 Commits

Reviewing files that changed from the base of the PR and between 5658ed3 and 663c203.

📒 Files selected for processing (2)
  • CHANGELOG.md
  • src/event_processors/pull_request/enricher.py
📜 Review details
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (1)
  • GitHub Check: test (3.12)
🧰 Additional context used
📓 Path-based instructions (1)
**/*.py

📄 CodeRabbit inference engine (.cursor/rules/guidelines.mdc)

**/*.py: Use modern typing only: dict[str, Any], list[str], str | None (no Dict, List, Optional)
GitHub/HTTP/DB calls must be async def; avoid blocking calls (time.sleep, sync HTTP) in async paths
All agent outputs and external payloads must use validated BaseModel from Pydantic
Use dataclasses for internal immutable state where appropriate
Use structured logging at boundaries with fields: operation, subject_ids, decision, latency_ms
Implement Agent pattern: single-responsibility agents with typed inputs/outputs
Use Decorator pattern for retries, metrics, caching as cross-cutting concerns
Agent outputs must include: decision, confidence (0..1), short reasoning, recommendations, strategy_used
Implement confidence policy: reject or route to human-in-the-loop when confidence < 0.5
Use minimal, step-driven prompts; provide Chain-of-Thought only for complexity > 0.7 or ambiguity > 0.6
Strip secrets/PII from agent prompts; scope tools; keep raw reasoning out of logs (store summaries only)
Cache idempotent lookups; lazy-import heavy dependencies; bound fan-out with asyncio.Semaphore
Avoid redundant LLM calls; memoize per event when safe
Use domain errors (e.g., AgentError) with error_type, message, context, timestamp, retry_count
Use exponential backoff for transient failures; circuit-break noisy integrations when needed
Fail closed for risky decisions; provide actionable remediation in error paths
Validate all external inputs; verify webhook signatures
Implement prompt-injection hardening; sanitize repository content passed to LLMs
Performance targets: Static validation ~<100ms typical, hybrid decisions sub-second when cache warm, budget LLM paths thoughtfully
Reject old typing syntax (Dict, List, Optional) in code review
Reject blocking calls in async code; reject bare except: clauses; reject swallowed errors
Reject LLM calls for trivial/deterministic checks
Reject unvalidated agent outputs and missing confidenc...

Files:

  • src/event_processors/pull_request/enricher.py
🧠 Learnings (1)
📚 Learning: 2026-03-27T12:52:44.067Z
Learnt from: oleksii-quinta
Repo: warestack/watchflow PR: 67
File: src/webhooks/handlers/issue_comment.py:153-159
Timestamp: 2026-03-27T12:52:44.067Z
Learning: When enqueuing processor tasks using `task_queue`, follow the documented pre-built-task pattern: (1) build the task with `pre_built_task = task_queue.build_task(event_type, payload, processor.process, delivery_id=...)`; (2) call `task_queue.enqueue(processor.process, event_type, payload, pre_built_task, delivery_id=...)` by passing the pre-built task as a single positional `*args` element; (3) ensure the worker ultimately calls `await processor.process(pre_built_task)` (i.e., the processor `process(self, task: Task)` receives the `Task` instance). This matches the expectation that `enqueue` stores the pre-built task in the wrapper Task’s `args` as described by `build_task`’s docstring (“pass as single arg to enqueue”).

Applied to files:

  • src/event_processors/pull_request/enricher.py
🔇 Additional comments (2)
src/event_processors/pull_request/enricher.py (1)

60-65: Good root-cause fix for stale PR snapshots.

Refreshing pull_request_details at enrichment time (with fallback) correctly centralizes the race-condition mitigation and keeps downstream conditions consistent.

CHANGELOG.md (1)

11-19: Changelog entry is clear and accurate.

This documents the race condition, the refresh strategy, and fallback behavior in a way that aligns with the implementation and is useful for release tracking.

@codecov-commenter
Copy link
Copy Markdown

codecov-commenter commented Apr 14, 2026

⚠️ Please install the 'codecov app svg image' to ensure uploads and comments are reliably processed by Codecov.

Codecov Report

❌ Patch coverage is 92.00000% with 2 lines in your changes missing coverage. Please review.

❌ Your project status has failed because the head coverage (73.0%) is below the target coverage (80.0%). You can increase the head coverage or adjust the target coverage.
❗ Your organization needs to install the Codecov GitHub app to enable full functionality.

@@          Coverage Diff          @@
##            main     #74   +/-   ##
=====================================
  Coverage   73.0%   73.0%           
=====================================
  Files        181     181           
  Lines      13456   13481   +25     
=====================================
+ Hits        9828    9851   +23     
- Misses      3628    3630    +2     

Continue to review full report in Codecov by Sentry.

Legend - Click here to learn more
Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Powered by Codecov. Last update 5658ed3...3522656. Read the comment docs.

🚀 New features to boost your workflow:
  • ❄️ Test Analytics: Detect flaky tests, report on failures, and find test suite problems.

Copy link
Copy Markdown
Member

@dkargatzis dkargatzis left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Fix is correct and follows option 1 from issue #72.

@dkargatzis dkargatzis merged commit 60ae336 into warestack:main Apr 16, 2026
4 checks passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

Watchflow Rules check returns stale PR data from webhooks

3 participants