fix: re-fetch PR details in enricher to avoid stale requested_reviewers #74
Conversation
Merging to
After your PR is submitted to the merge queue, this comment will be automatically updated with its status. If the PR fails, failure details will also be posted here.
🛡️ Watchflow Governance Checks
Status: ❌ 1 Violation Found
🟠 High Severity (1): When a PR modifies paths that have owners defined in CODEOWNERS, the corresponding code owners must be added as reviewers. Code owners for modified paths must be added as reviewers: dkargatzis
⚠️ Warning: Rate limit exceeded
Your organization is not enrolled in usage-based pricing. Contact your admin to enable usage-based pricing to continue reviews beyond the rate limit, or try again in 56 minutes and 57 seconds.
⌛ How to resolve this issue? After the wait time has elapsed, a review can be triggered using the … We recommend that you space out your commits to avoid hitting the rate limit.
🚦 How do rate limits work? CodeRabbit enforces hourly rate limits for each developer per organization. Our paid plans have higher rate limits than the trial, open-source, and free plans. In all cases, we re-allow further reviews after a brief timeout. Please see our FAQ for further information.
ℹ️ Review info — Configuration used: Organization UI · Review profile: CHILL · Plan: Pro
📒 Files selected for processing (1)
📝 Walkthrough
Changes
Estimated code review effort: 🎯 2 (Simple) | ⏱️ ~10 minutes
Possibly related PRs
🚥 Pre-merge checks | ✅ Passed checks (5 passed)
✏️ Tip: You can configure your own custom pre-merge checks in the settings.
🧹 Nitpick comments (1)
src/event_processors/pull_request/enricher.py (1)
65-66: Use structured warning logs on refresh fallback path.

The fallback is correct, but this boundary log should include structured fields (`operation`, `subject_ids`, `decision`, `latency_ms`) for observability and policy consistency.

♻️ Proposed refactor

```diff
+from time import perf_counter
 ...
-        if pr_number and repo_full_name:
+        if pr_number and repo_full_name:
+            started_at = perf_counter()
             try:
                 fresh_pr = await self.github_client.get_pull_request(repo_full_name, pr_number, installation_id)
                 if fresh_pr:
                     pr_data = fresh_pr
             except Exception as e:
-                logger.warning(f"Could not refresh PR #{pr_number} details: {e}")
+                latency_ms = int((perf_counter() - started_at) * 1000)
+                logger.warning(
+                    "pull_request_refresh_failed",
+                    extra={
+                        "operation": "refresh_pull_request",
+                        "subject_ids": {
+                            "repo_full_name": repo_full_name,
+                            "pull_request_number": pr_number,
+                            "installation_id": installation_id,
+                        },
+                        "decision": "fallback_to_webhook_payload",
+                        "latency_ms": latency_ms,
+                        "error": str(e),
+                    },
+                )
```

As per coding guidelines: "Use structured logging at boundaries with fields: `operation`, `subject_ids`, `decision`, `latency_ms`".

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@src/event_processors/pull_request/enricher.py` around lines 65-66: the log in the exception handler should be converted to a structured warning containing the fields required by policy. Replace the plain f-string in the except block with a structured logger.warning call that includes operation (e.g. "refresh_pr"), subject_ids (e.g. [pr_number]), decision (e.g. "refresh_fallback"), and latency_ms (computed from an existing start timestamp, or as now - start_time in milliseconds), and include the exception under a field like "error" or "exception" instead of embedding it in the message. Update the except block in enricher.py (the handler around logger.warning for PR refresh) to emit those fields.
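As a self-contained sketch of the pattern the review suggests, assuming a standalone helper here called `refresh_pr` (the name and free-function shape are illustrative; in the codebase this logic lives inside `PullRequestEnricher`):

```python
import logging
from time import perf_counter
from typing import Any

logger = logging.getLogger("enricher")


async def refresh_pr(
    github_client: Any,
    repo_full_name: str,
    pr_number: int,
    installation_id: int,
    pr_data: dict[str, Any],
) -> dict[str, Any]:
    """Re-fetch PR details; on failure, emit a structured warning and fall back."""
    started_at = perf_counter()
    try:
        fresh_pr = await github_client.get_pull_request(repo_full_name, pr_number, installation_id)
        if fresh_pr:
            return fresh_pr
    except Exception as e:
        # Structured fields travel via `extra`, so log formatters/shippers can index them.
        logger.warning(
            "pull_request_refresh_failed",
            extra={
                "operation": "refresh_pull_request",
                "subject_ids": {"repo_full_name": repo_full_name, "pull_request_number": pr_number},
                "decision": "fallback_to_webhook_payload",
                "latency_ms": int((perf_counter() - started_at) * 1000),
                "error": str(e),
            },
        )
    return pr_data  # never degrade below the webhook payload
```

Note that keys passed through `extra` become attributes on the `LogRecord`, so a JSON formatter can emit them as first-class fields rather than interpolated message text.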
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Nitpick comments:
In `@src/event_processors/pull_request/enricher.py`:
- Around line 65-66: The log in the exception handler should be converted to a
structured warning containing the fields required by policy; replace the plain
f-string in the except block with a structured logger.warning call that includes
operation (e.g. "refresh_pr"), subject_ids (e.g. [pr_number]), decision (e.g.
"refresh_fallback"), and latency_ms (compute from an existing start timestamp or
calculate now - start_time in milliseconds), and include the exception under a
field like "error" or "exception" instead of embedding it in the message; update
the except block in enricher.py (the handler around logger.warning for PR
refresh) to emit those fields.
ℹ️ Review info
⚙️ Run configuration
Configuration used: Organization UI
Review profile: CHILL
Plan: Pro
Run ID: d984d866-52ef-4df0-95f2-eecfc21894ea
📒 Files selected for processing (2)
CHANGELOG.md
src/event_processors/pull_request/enricher.py
📜 Review details
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (1)
- GitHub Check: test (3.12)
🧰 Additional context used
📓 Path-based instructions (1)
**/*.py
📄 CodeRabbit inference engine (.cursor/rules/guidelines.mdc)
**/*.py: Use modern typing only: `dict[str, Any]`, `list[str]`, `str | None` (no `Dict`, `List`, `Optional`)
GitHub/HTTP/DB calls must be `async def`; avoid blocking calls (time.sleep, sync HTTP) in async paths
All agent outputs and external payloads must use validated `BaseModel` from Pydantic
Use dataclasses for internal immutable state where appropriate
Use structured logging at boundaries with fields: `operation`, `subject_ids`, `decision`, `latency_ms`
Implement Agent pattern: single-responsibility agents with typed inputs/outputs
Use Decorator pattern for retries, metrics, caching as cross-cutting concerns
Agent outputs must include: `decision`, `confidence` (0..1), short `reasoning`, `recommendations`, `strategy_used`
Implement confidence policy: reject or route to human-in-the-loop when confidence < 0.5
Use minimal, step-driven prompts; provide Chain-of-Thought only for complexity > 0.7 or ambiguity > 0.6
Strip secrets/PII from agent prompts; scope tools; keep raw reasoning out of logs (store summaries only)
Cache idempotent lookups; lazy-import heavy dependencies; bound fan-out with `asyncio.Semaphore`
Avoid redundant LLM calls; memoize per event when safe
Use domain errors (e.g., `AgentError`) with `error_type`, `message`, `context`, `timestamp`, `retry_count`
Use exponential backoff for transient failures; circuit-break noisy integrations when needed
Fail closed for risky decisions; provide actionable remediation in error paths
Validate all external inputs; verify webhook signatures
Implement prompt-injection hardening; sanitize repository content passed to LLMs
Performance targets: static validation ~<100ms typical, hybrid decisions sub-second when cache warm, budget LLM paths thoughtfully
Reject old typing syntax (`Dict`, `List`, `Optional`) in code review
Reject blocking calls in async code; reject bare `except:` clauses; reject swallowed errors
Reject LLM calls for trivial/deterministic checks
Reject unvalidated agent outputs and missing confidence…
Files:
src/event_processors/pull_request/enricher.py
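One of the guidelines above ("bound fan-out with `asyncio.Semaphore`") can be sketched as a minimal, self-contained example. The helper names (`fetch_detail`, `bounded_gather`) and the limit value are illustrative, not taken from the codebase:

```python
import asyncio


async def fetch_detail(item: int) -> int:
    # Stand-in for an async GitHub/HTTP call.
    await asyncio.sleep(0)
    return item * 2


async def bounded_gather(items: list[int], limit: int = 5) -> list[int]:
    # At most `limit` coroutines hold the semaphore at once, so a large
    # fan-out never opens more than `limit` concurrent upstream calls.
    sem = asyncio.Semaphore(limit)

    async def one(item: int) -> int:
        async with sem:
            return await fetch_detail(item)

    # gather preserves input order regardless of completion order.
    return await asyncio.gather(*(one(i) for i in items))
```

The semaphore bounds concurrency without batching: as each call releases the permit, the next waiting coroutine proceeds immediately.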
🧠 Learnings (1)
📚 Learning: 2026-03-27T12:52:44.067Z
Learnt from: oleksii-quinta
Repo: warestack/watchflow PR: 67
File: src/webhooks/handlers/issue_comment.py:153-159
Timestamp: 2026-03-27T12:52:44.067Z
Learning: When enqueuing processor tasks using `task_queue`, follow the documented pre-built-task pattern: (1) build the task with `pre_built_task = task_queue.build_task(event_type, payload, processor.process, delivery_id=...)`; (2) call `task_queue.enqueue(processor.process, event_type, payload, pre_built_task, delivery_id=...)` by passing the pre-built task as a single positional `*args` element; (3) ensure the worker ultimately calls `await processor.process(pre_built_task)` (i.e., the processor `process(self, task: Task)` receives the `Task` instance). This matches the expectation that `enqueue` stores the pre-built task in the wrapper Task’s `args` as described by `build_task`’s docstring (“pass as single arg to enqueue”).
Applied to files:
src/event_processors/pull_request/enricher.py
🔇 Additional comments (2)
src/event_processors/pull_request/enricher.py (1)
60-65: Good root-cause fix for stale PR snapshots.

Refreshing `pull_request_details` at enrichment time (with fallback) correctly centralizes the race-condition mitigation and keeps downstream conditions consistent.

CHANGELOG.md (1)

11-19: Changelog entry is clear and accurate.

This documents the race condition, the refresh strategy, and fallback behavior in a way that aligns with the implementation and is useful for release tracking.
Codecov Report
❌ Patch coverage is …
❌ Your project status has failed because the head coverage (73.0%) is below the target coverage (80.0%). You can increase the head coverage or adjust the target coverage.

@@ Coverage Diff @@
## main #74 +/- ##
=====================================
Coverage 73.0% 73.0%
=====================================
Files 181 181
Lines 13456 13481 +25
=====================================
+ Hits 9828 9851 +23
- Misses 3628 3630 +2

Continue to review full report in Codecov by Sentry.
dkargatzis
left a comment
Fix is correct and follows option 1 from issue #72.
Summary
Fixes a race where `PathHasCodeOwnerCondition`/`RequireCodeOwnerReviewersCondition` produce a false positive when a `synchronize` webhook is processed just before a `review_requested` webhook. The first webhook's `pull_request.requested_reviewers` is a point-in-time snapshot and doesn't yet reflect the new request, so the engine flags "code owners missing" even though the owner has already been requested/approved.

Closes #72
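The race described above can be reduced to a tiny example. The condition check below is a hypothetical simplification of what conditions like `RequireCodeOwnerReviewersCondition` evaluate, using the `requested_reviewers` shape from GitHub's pull-request payloads:

```python
def code_owners_requested(pr: dict, owners: set[str]) -> bool:
    # Are all code owners among the currently requested reviewers?
    requested = {r["login"] for r in pr.get("requested_reviewers", [])}
    return owners <= requested


# Snapshot carried by the earlier `synchronize` webhook,
# taken before the review request landed.
stale = {"requested_reviewers": []}

# State a fresh `get_pull_request` call would return.
fresh = {"requested_reviewers": [{"login": "dkargatzis"}]}

stale_ok = code_owners_requested(stale, {"dkargatzis"})  # False: false positive
fresh_ok = code_owners_requested(fresh, {"dkargatzis"})  # True: violation clears
```

Evaluating against the stale snapshot flags a violation that no longer exists; re-fetching at enrichment time makes every downstream condition see the `fresh` state.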
Approach
Option 1 from the issue — refresh at the source, not at each condition.

`PullRequestEnricher.enrich_event_data` now calls `github_client.get_pull_request(repo, pr_number, installation_id)` before building `event_data` and uses the fresh response as `pull_request_details`. Every downstream condition automatically sees current `requested_reviewers`, `requested_teams`, `assignees`, labels, etc. — not just CODEOWNERS.
None, the enricher falls back to the webhook payload so we never degrade below the existing behavior.Why Option 1 over Option 2
Changes

src/event_processors/pull_request/enricher.py — refresh `pr_data` via `get_pull_request` with graceful fallback
CHANGELOG.md — entry under [Unreleased] ### Fixed

Summary by CodeRabbit
Bug Fixes