
feat: add triage-issues + triage-board skills (+ visual-test hardening) #36012

Open
tudorpopams wants to merge 8 commits into microsoft:master from tudorpopams:feat/triage-issues-skill

tudorpopams commented Apr 20, 2026

Summary

Adds two new agent skills for triage workflows and hardens an existing one along the way.

triage-issues (repo-level Shield queue)

Walks the Needs: Triage :mag: queue on this repo, classifies each issue against the Shield triage decision tree, and recommends labels, assignees, and comments — applying via gh only after the human approves.

  • Proactive repro validation with playwright-cli: during classification the skill flags bug reports as validation candidates (has a sandbox + headless-observable + not perf/env/AT-dependent) and proposes them to the user. Resolution: Can't Repro is surfaced as a candidate only — never auto-applied.
  • v9 feature requests that cite v8 behavior trigger a documented-composition-pattern investigation (Field, react-motion-components-preview, useAnnounce, etc.) and default to Resolution: By Design when v9 already addresses the ask — avoids backlog pollution.
  • Shield: P1 and Partner Ask are described as signal-based (critical-regression evidence, tracked-workstream context) rather than identity-based tiering. External community reports go through the same triage path as any other issue.
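
The validation-candidate rule in the first bullet reads as a three-part predicate. A minimal sketch, assuming a hypothetical `BugReport` summary whose field names are illustrative and not taken from the skill:

```python
from dataclasses import dataclass


@dataclass
class BugReport:
    """Hypothetical summary of a bug report; field names are illustrative."""
    has_sandbox: bool                       # reporter linked a StackBlitz/CodeSandbox
    headless_observable: bool               # symptom is visible in a headless browser
    depends_on_perf: bool = False           # perf regression
    depends_on_environment: bool = False    # e.g. OS-specific behavior
    depends_on_assistive_tech: bool = False # screen readers and similar


def is_validation_candidate(report: BugReport) -> bool:
    """True only when all three conditions from the bullet hold."""
    excluded = (
        report.depends_on_perf
        or report.depends_on_environment
        or report.depends_on_assistive_tech
    )
    return report.has_sandbox and report.headless_observable and not excluded
```

A report that passes this check is only proposed to the user for validation; nothing is applied at this point.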

triage-board (Fluent Unified project board)

Triages items on the org-level GitHub Project at microsoft/projects/395. Deliberately distinct from triage-issues — sets the board's Team single-select field (cxe-prg / cxe-red / teams-prg / cxe-coastal / v-build / xc-uxe / fluentui-motion / …) and adds a GitHub-issue assignee from CODEOWNERS. Does not touch labels or Status.

  • Cross-repo: handles microsoft/fluentui, microsoft/fluentui-system-icons, and microsoft/fluentui-contrib.
  • Filter mirrors the board's canonical view 6 (By team): excludes Resolution: Soft Close, Type: Epic, Help Wanted ✨, Needs: Triage :mag:, Status=Done, PRs, and closed state.
  • Routes via CODEOWNERS with a confident-vs-ambiguous mapping table. Ambiguous handles (charting-team, northstar, etc.) are flagged for human review rather than auto-routed.
  • Rules codified from the first live run: v9 issues never route to cxe-red (they are rerouted to cxe-prg or the next-best team); for CODEOWNERS dual-ownership lines, the skill picks the first team and flags the rest for human confirmation.
  • Two-part preflight catches both account/permission problems (EMU tokens can read but not mutate microsoft/*) and the separate project OAuth scope required for updateProjectV2ItemFieldValue.
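
The view-6 filter described above can be sketched as a client-side predicate. This is an illustration only: `BoardItem` is a hypothetical, simplified shape, not the structure the GitHub Projects API returns, and the label strings are copied from the bullet as written (the `:mag:` shortcode may appear as an emoji in API responses).

```python
from dataclasses import dataclass, field

# Labels excluded by the "By team" view, copied from the description above.
EXCLUDED_LABELS = {
    "Resolution: Soft Close",
    "Type: Epic",
    "Help Wanted ✨",
    "Needs: Triage :mag:",
}


@dataclass
class BoardItem:
    """Hypothetical, simplified board item used only for this sketch."""
    title: str
    labels: set[str] = field(default_factory=set)
    status: str = ""
    is_pull_request: bool = False
    is_closed: bool = False


def in_view_6(item: BoardItem) -> bool:
    """Keep only open, non-PR, not-Done items without an excluded label."""
    if item.is_pull_request or item.is_closed or item.status == "Done":
        return False
    return not (item.labels & EXCLUDED_LABELS)
```

Filtering before classification keeps the proposal list aligned with what the human sees on the board.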

visual-test hardening

Two fixes discovered while validating PR #36013 against a fresh-ish workspace:

  • Port detection. Replaced the old "lowest-numbered listening port on the first process matching 'storybook dev'" heuristic with a more robust pattern: pgrep -f "node.*\.bin/storybook dev" targets only the real storybook child (not the yarn wrapper), then probe each listening socket and pick the one whose Content-Type is text/html. Added an additional wait loop for index.json to populate so the caller doesn't race story compilation.
  • Unstable-deps gotcha. Per-component Storybooks fail to compile with Module not found: @fluentui/react-alert (and react-infobutton / react-virtualizer) on a fresh clone because those three packages are workspace-linked from local sources whose lib-commonjs/ is only produced on build. Documented the one-shot yarn nx run-many -t build -p react-alert,react-infobutton,react-virtualizer fix in both the visual-test troubleshooting section and docs/workflows/contributing.md so it doesn't have to be rediscovered.
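
The "probe each socket and keep the one serving `text/html`" step can be sketched as follows. This is not the snippet from `visual-test/SKILL.md`; the function names are hypothetical, and the list of candidate ports is assumed to come from an earlier step (for example, `lsof` output for the Storybook child process).

```python
import http.client
import urllib.request
from typing import Callable, Iterable, Optional


def fetch_content_type(port: int, timeout: float = 2.0) -> Optional[str]:
    """Return the Content-Type served at http://localhost:<port>/, or None on failure."""
    try:
        with urllib.request.urlopen(f"http://localhost:{port}/", timeout=timeout) as resp:
            return resp.headers.get("Content-Type")
    except (OSError, http.client.HTTPException):
        # Connection refused, timeout, or a malformed response: not the HTTP port.
        return None


def pick_html_port(
    ports: Iterable[int],
    probe: Callable[[int], Optional[str]] = fetch_content_type,
) -> Optional[int]:
    """Return the first port whose response Content-Type is text/html."""
    for port in ports:
        content_type = probe(port)
        if content_type and content_type.lower().startswith("text/html"):
            return port
    return None
```

Passing the probe as a parameter keeps the selection logic testable without a running Storybook.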

What's in the commits

  • triage-issues/ — workflow, decision rules, gh commands, proactive-validation proposal, recommend-then-apply gate, references for shield guidelines / labels / partner signals
  • triage-board/ — workflow, cross-repo CODEOWNERS-driven routing, view-6 filter, two-part auth preflight, GraphQL snippets for field mutations
  • visual-test/SKILL.md — hardened port detection + Module not found troubleshooting
  • docs/workflows/contributing.md — new "First-time Storybook setup" note
  • AGENTS.md — skill registry updated with both new skills

Operational specifics that don't belong in a public repo (the tracked partner-workstream list and known reporter handles used by triage-issues) are intentionally kept out of this PR; they live in private maintainer memory instead. Community contributors shouldn't have to read a tiering list to understand the triage process.

Test plan

  • triage-issues runs end-to-end against the live Needs: Triage :mag: queue (9 issues initially, triaged across several sessions; labels verified against live repo labels API)
  • triage-issues approval gate: skill does not mutate any issue without explicit user approval, including during validation
  • triage-issues v9-investigation step correctly identifies Resolution: By Design for v8→v9 feature asks where composition addresses the need (verified on 7 AMC-filed feature requests)
  • triage-board runs end-to-end on the live board (30+ items classified and applied across 3 repos, including the soft-close/view-6-filter correction cycle)
  • triage-board correctly reverts a wrong apply via clearProjectV2ItemFieldValue
  • visual-test hardened port detection validated against live per-component Storybook for both react-combobox and react-tooltip
  • Second triager reviews the reference docs for tone and accuracy before this PR leaves draft

🤖 Generated with Claude Code

Introduces a new agent skill that walks the `Needs: Triage 🔍` queue
on microsoft/fluentui, classifies each issue against the Shield triage
decision tree, and recommends labels, assignees, and comments before
applying any changes via the `gh` CLI.

The skill operates in recommend-then-apply mode: the LLM never mutates
issues until the human has approved the batch. For feature requests
that cite v8 behavior, the skill is instructed to investigate v9's
documented composition patterns (Field, react-motion-components-preview,
useAnnounce, etc.) and default to `Resolution: By Design` when a v9
pattern already addresses the ask — avoiding backlog pollution.

Reference docs intentionally describe `Shield: P1` and `Partner Ask`
as signal-based decisions (critical-regression evidence, tracked
workstream context) rather than identity-based tiering, so external
community reports go through the same triage path as any other issue.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

github-actions Bot commented Apr 20, 2026

📊 Bundle size report

✅ No changes found


Pull request demo site: URL

tudorpopams and others added 5 commits April 20, 2026 18:28
Adds a Step 3.5 to the triage-issues skill that lets the human ask
the skill to validate specific issues' reproductions with playwright-cli
before approving triage. Reuses the install pattern from the
visual-test skill.

The validation pass visits the reporter's StackBlitz/CodeSandbox (or
spins up local Storybook when no sandbox is provided), captures a
screenshot + DOM snapshot + console output, and classifies the result
as `repros`, `does_not_repro`, or `cannot_determine`. A
`does_not_repro` result is surfaced as a `Resolution: Can't Repro`
candidate only — never auto-applied — so the human still decides based
on the evidence.
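
A minimal sketch of how the three outcomes could map to a proposed label. The enum and function names are hypothetical; only the three result strings and the `Resolution: Can't Repro` label come from the commit message.

```python
from enum import Enum
from typing import Optional


class ValidationResult(Enum):
    REPROS = "repros"
    DOES_NOT_REPRO = "does_not_repro"
    CANNOT_DETERMINE = "cannot_determine"


def candidate_resolution(result: ValidationResult) -> Optional[str]:
    """Return a label to propose to the human, never one to apply directly."""
    if result is ValidationResult.DOES_NOT_REPRO:
        return "Resolution: Can't Repro"
    # A confirmed repro or an inconclusive run suggests no resolution label.
    return None
```

The return value is a suggestion for the approval table; applying it remains a separate, human-approved step.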

Explicitly documents what validation is not for: feature requests,
reports with a documented root cause + diff, perf regressions,
OS-specific behavior, and assistive-tech interactions — headless
doesn't give reliable signal there.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Flips the validation flow so the skill takes the initiative: during
Step 2 classification it now decides a `validation_candidate` boolean
per issue, and Step 3 presents the proposed validation set with
one-line reasons for each. The user confirms (yes / all / subset /
skip) rather than having to think to ask.

Moves the "when to validate vs not" heuristic into Step 3 where the
candidate decision is made, next to the examples the user will be
looking at. Step 3.5 is reframed as the execution of a
human-confirmed set, not an opt-in on user request.

Keeps the approval gate in Step 4 unchanged — validation produces
evidence only, never a mutation. Users can still manually request
additional validation after seeing the table.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Investigated a real Copilot session on microsoft#35874 that failed visual
validation for three reasons: (1) it tried the workspace-wide
Storybook (public-docsite-v9) and hit HMR restart loops + missing
unstable package errors, (2) it used a project name alias that may
not exist in older workspace snapshots, (3) it fell back to guessing
ports because the dev target is unexpectedly declared with
`cache: true`, which replays cached output and exits.

This commit:

- Forbids the workspace-wide Storybook for validation, explicitly.
  The per-component stories package is the only reliable path.
- Switches the primary command to `react-<component>-stories:storybook`
  (direct target on the stories project) with `--skip-nx-cache`, so
  the advice works even in workspace snapshots that predate the
  library-level `start` alias.
- Replaces the port-guessing loop with a proper detection pattern:
  find the storybook child PID (the nx wrapper often exits 0 after
  delegating) and read its listening socket via lsof.
- Adds a troubleshooting section mapping the three failure modes the
  Copilot session hit to their real causes.

The triage-issues validation step (which delegates to this skill)
is updated to reinforce the per-component-only rule inline, so an
agent that reads only the triage skill still gets the warning.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Two fixes found while validating PR microsoft#36013 against a fresh-ish workspace:

1. **Port detection.** The old heuristic picked the lowest-numbered
   listening port on the first process whose name contained
   "storybook dev", which was wrong in two ways:
   - `pgrep -f "storybook dev"` matched both `yarn storybook dev`
     (wrapper, no sockets) and the real `node .bin/storybook dev`
     child. `head -1` could pick the wrapper, leaving the port
     search empty forever.
   - Storybook opens two listening sockets (HTTP + HMR event-stream).
     They are not ordered — the event-stream is sometimes lower,
     sometimes higher. Grabbing the lower port gave back the
     event-stream in about half of attempts, producing a long hang
     when the caller then tried to navigate there.

   Replaced with: pgrep pattern scoped to `node.*\.bin/storybook dev`,
   then probe each listening socket and pick the one whose response
   Content-Type is text/html. Added an additional wait loop for
   `index.json` to populate so the caller doesn't race story compilation.

2. **Unstable-deps setup gotcha.** Per-component Storybooks fail to
   compile with `Module not found: @fluentui/react-alert` (and
   react-infobutton / react-virtualizer) on a fresh clone, because
   those three packages are workspace-linked from local sources whose
   `lib-commonjs/` is only produced on build. Documented the one-shot
   `yarn nx run-many -t build -p react-alert,react-infobutton,react-virtualizer`
   fix in both the visual-test troubleshooting section and the
   top-level contributing doc, so humans and agents don't rediscover
   it independently.
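
The wait loop from item 1 could look roughly like this. It is a sketch under two assumptions that the commit message does not state: that "populated" means `index.json` has a non-empty `entries` object, and that polling once per second for up to 30 attempts is acceptable. All function names are hypothetical.

```python
import json
import time
import urllib.request
from typing import Callable, Optional


def fetch_index(port: int) -> Optional[dict]:
    """Fetch and parse http://localhost:<port>/index.json, or None if unavailable."""
    try:
        with urllib.request.urlopen(f"http://localhost:{port}/index.json", timeout=2.0) as resp:
            data = json.load(resp)
    except (OSError, ValueError):
        return None
    return data if isinstance(data, dict) else None


def wait_for_stories(
    port: int,
    fetch: Callable[[int], Optional[dict]] = fetch_index,
    attempts: int = 30,
    delay: float = 1.0,
    sleep: Callable[[float], None] = time.sleep,
) -> bool:
    """Poll until index.json lists at least one entry (assumed meaning of 'populated')."""
    for _ in range(attempts):
        index = fetch(port)
        if index and index.get("entries"):
            return True
        sleep(delay)
    return False
```

Bounding the attempts avoids the open-ended hang that the old port heuristic produced.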

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Introduces a new agent skill that triages items on the org-level
GitHub Project at microsoft/projects/395 ("Fluent UI - Unified").

The skill is deliberately distinct from the existing `triage-issues`
skill — that one handles repo-level Shield triage (labels + area
owner on `Needs: Triage 🔍`), while this one operates at the
project-board layer: it sets the `Team` single-select field on
board items and, when CODEOWNERS names a specific user, adds that
user as a GitHub-issue assignee. Neither skill touches the other's
surface.

Works cross-repo against microsoft/fluentui,
microsoft/fluentui-system-icons, and microsoft/fluentui-contrib.
CODEOWNERS is the source of truth for both team and individual
routing; the skill caches the three CODEOWNERS files per session
and maps team handles (@microsoft/cxe-prg, @microsoft/teams-prg,
@microsoft/fui-wc, etc.) to the board's fixed Team options via a
confident-vs-ambiguous mapping table in references/team-mapping.md.
Ambiguous handles (charting-team, northstar, etc.) are flagged for
human review rather than auto-routed.

Codifies two rules learned from the first real run against the
live board:

1. **View 6 mirroring.** The board's canonical "By team" triage
   view (view 6) excludes items labeled `Resolution: Soft Close`,
   `Type: Epic`, `Help Wanted ✨`, and `Needs: Triage 🔍`, plus
   Status=Done, PRs, and closed state. The skill filters the same
   set client-side so it only proposes team assignments for items
   the human is actually looking at. Without this, soft-closed
   stale v8 bugs got into the triage pool on the first run — they
   shouldn't.

2. **v9 never routes to cxe-red.** cxe-red owns v8; v9 ownership
   is cxe-prg (or teams-prg for specific packages). When
   CODEOWNERS resolves to cxe-red but the issue carries the v9
   label, the skill reroutes to the next mapped team (or cxe-prg)
   and flags for human confirmation.
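
Rule 2, together with the confident-vs-ambiguous mapping, can be sketched as a small routing function. The tables below are an illustrative subset with assumed handle spellings; the real mapping lives in references/team-mapping.md, and `Route` and `route` are hypothetical names.

```python
from dataclasses import dataclass
from typing import Optional

# Illustrative subset only; handle spellings here are assumptions.
CONFIDENT = {
    "@microsoft/cxe-prg": "cxe-prg",
    "@microsoft/cxe-red": "cxe-red",
    "@microsoft/teams-prg": "teams-prg",
}
AMBIGUOUS = {"@microsoft/charting-team", "@microsoft/northstar"}


@dataclass
class Route:
    team: Optional[str]   # proposed Team option, or None when nothing maps
    needs_human: bool     # True when a person should confirm the proposal
    note: str = ""


def route(owners: list[str], is_v9: bool) -> Route:
    """Propose a team from CODEOWNERS handles, applying the v9/cxe-red rule."""
    teams = [CONFIDENT[o] for o in owners if o in CONFIDENT]
    has_ambiguous = any(o in AMBIGUOUS for o in owners)
    if not teams:
        return Route(None, True, "no confident mapping")
    if is_v9 and teams[0] == "cxe-red":
        # v9 never routes to cxe-red: take the next mapped team, else cxe-prg.
        fallback = next((t for t in teams if t != "cxe-red"), "cxe-prg")
        return Route(fallback, True, "v9 issue rerouted away from cxe-red")
    # Dual ownership: pick the first team and flag the rest for confirmation.
    needs_human = has_ambiguous or len(teams) > 1
    return Route(teams[0], needs_human, "multiple owners" if needs_human else "")
```

Every reroute sets `needs_human`, matching the "flags for human confirmation" wording above.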

Also adds a two-part preflight at the top of the workflow: checks
both account permission on the repos (EMU vs non-EMU) and the
`project` OAuth scope on the active token (read:project is not
enough for `updateProjectV2ItemFieldValue`). Each failure mode has
its own surfaced message so the user can fix it without trial and
error.
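
A sketch of the two checks as one function that returns a separate message per failure mode. It assumes the caller has already fetched the repository `viewerPermission` value and the token's comma-separated scope list (for classic tokens, GitHub reports these in the `X-OAuth-Scopes` response header). The messages and the set of permission levels treated as sufficient are assumptions, not the skill's actual wording.

```python
# Permission levels assumed sufficient to assign issues; adjust as needed.
WRITE_LEVELS = {"ADMIN", "MAINTAIN", "WRITE", "TRIAGE"}


def preflight(viewer_permission: str, token_scopes: str) -> list[str]:
    """Return one message per failed check; an empty list means both passed."""
    problems = []
    if viewer_permission not in WRITE_LEVELS:
        problems.append(
            f"account: permission is {viewer_permission}; this account can read "
            "but not mutate the repo (switch to a non-EMU account)"
        )
    scopes = {s.strip() for s in token_scopes.split(",") if s.strip()}
    if "project" not in scopes:
        # read:project alone is not enough for updateProjectV2ItemFieldValue.
        problems.append("scope: token lacks the `project` scope")
    return problems
```

Keeping the two checks independent means a user with both problems sees both messages in one pass.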

Field and option IDs for the Team single-select are documented in
references/team-mapping.md as of this commit; a refresh query is
included so the skill can re-discover them if they ever rotate.
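
For orientation, the two mutations named in this PR take the following general shape in the GitHub GraphQL API. This sketch only builds request payloads; it is not the snippet shipped in the skill, `build_team_request` is a hypothetical helper, and every ID passed in the test values is a placeholder.

```python
from typing import Optional

UPDATE_TEAM = """
mutation($project: ID!, $item: ID!, $field: ID!, $option: String!) {
  updateProjectV2ItemFieldValue(input: {
    projectId: $project, itemId: $item, fieldId: $field,
    value: { singleSelectOptionId: $option }
  }) { projectV2Item { id } }
}
"""

CLEAR_TEAM = """
mutation($project: ID!, $item: ID!, $field: ID!) {
  clearProjectV2ItemFieldValue(input: {
    projectId: $project, itemId: $item, fieldId: $field
  }) { projectV2Item { id } }
}
"""


def build_team_request(project: str, item: str, field: str, option: Optional[str]) -> dict:
    """Build a GraphQL payload: set Team when option is given, clear it when None."""
    variables = {"project": project, "item": item, "field": field}
    if option is None:
        return {"query": CLEAR_TEAM, "variables": variables}
    return {"query": UPDATE_TEAM, "variables": {**variables, "option": option}}
```

Using one builder for both paths makes reverting a wrong apply the same call with `option=None`.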

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
tudorpopams changed the title from "feat: add triage-issues agent skill for Shield workflow" to "feat: add triage-issues + triage-board skills (+ visual-test hardening)" Apr 24, 2026
tudorpopams marked this pull request as ready for review April 24, 2026 15:01
tudorpopams requested review from a team as code owners April 24, 2026 15:01
tudorpopams enabled auto-merge (squash) April 24, 2026 15:02
Adds a Phase 6 to the review-pr skill that posts the same markdown
produced in Phase 5 as a comment on the PR itself, so reviews are
visible to maintainers and — when the PR author is copilot-swe-agent
— become actionable feedback the agent can pick up.

The chat output and the PR comment are required to be identical; the
skill explicitly forbids paraphrasing or re-summarizing between the
two. A `Posted via the /review-pr skill.` trailer tags comments from
this skill so subsequent runs can detect and skip duplicates on the
same head SHA.
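
A sketch of the duplicate check. The commit message does not say how a comment is tied to a head SHA, so this assumes (hypothetically) that each posted comment body also contains the SHA it reviewed; only the trailer text comes from the source.

```python
from typing import Iterable

TRAILER = "Posted via the /review-pr skill."


def already_posted(comment_bodies: Iterable[str], head_sha: str) -> bool:
    """True if a prior skill comment exists for this head SHA.

    Assumption: the skill embeds the reviewed head SHA in the comment body.
    """
    return any(TRAILER in body and head_sha in body for body in comment_bodies)
```

When this returns True, the skill would skip posting rather than add a second copy of the same review.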

Bakes in two guardrails learned during April 2026 sessions:

- Pre-check the `gh` viewer identity + `viewerPermission` on the
  target repo. EMU (Enterprise Managed User) tokens on this org read
  fine but silently fail on writes; the pre-check catches this at
  review-post time instead of confusing the user with a late error.
- Handle REQUEST_CHANGES correctly: a comment alone is advisory, so
  the skill now notes that a formal `gh pr review --request-changes`
  is still needed to block merge.

Also documents the don't-post cases (explicit review-only request,
iterating draft PRs, previously-posted on same head SHA) so the
skill doesn't become a source of comment noise.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
