fix(handlers)!: emit lines_of_code.count as session delta, add .total gauge#44
Conversation
… gauge opencode publishes session.diff with the cumulative session diff (first snapshot -> latest), not a per-event delta. The previous handler added fileDiff.additions/deletions on every event, so the counter accumulated cumulative totals and over-reported lines on any multi-message session. The handler now tracks the last-seen cumulative per session and emits only the positive delta to opencode.lines_of_code.count. Summing the counter across a session yields the true session net (no double counting). Negative deltas (revert-to-baseline) are skipped. A new opencode.lines_of_code.total gauge records the current cumulative total for the session, refreshed on every session.diff and dropping back to 0 if opencode reports a revert. Use it for "what has this session produced so far" panels; use the counter for "rate of lines added over time". Cleanup of the new sessionDiffTotals map is wired into session.idle and session.error, mirroring the existing sessionTotals lifecycle. Closes DEVtheOPS#43 BREAKING CHANGE: opencode.lines_of_code.count semantics have changed. Dashboards that sum() the counter previously saw inflated numbers; they will now see the correct net session totals. Existing queries do not need to change, but the numeric results will be smaller (and correct).
📝 WalkthroughWalkthroughThe PR fixes metric overcounting by tracking per-session cumulative line diffs and emitting only the delta to the counter, while adding a new gauge for current cumulative totals; session diff state is wired into context and cleaned on idle/error. ChangesSession Diff Delta Tracking
Sequence DiagramsequenceDiagram
participant Bus
participant Handler
participant SessionState
participant Instruments
Bus->>Handler: session.diff (cumulative)
Handler->>SessionState: lookup prev totals
Handler->>Handler: compute delta = current - prev
Handler->>Instruments: linesCounter.add(delta > 0)
Handler->>Instruments: linesTotalGauge.record(current)
Handler->>SessionState: set updated totals
Estimated code review effort🎯 3 (Moderate) | ⏱️ ~23 minutes Possibly related PRs
🚥 Pre-merge checks | ✅ 4 | ❌ 1❌ Failed checks (1 warning)
✅ Passed checks (4 passed)
✏️ Tip: You can configure your own custom pre-merge checks in the settings. ✨ Finishing Touches🧪 Generate unit tests (beta)
Tip 💬 Introducing Slack Agent: The best way for teams to turn conversations into code.Slack Agent is built on CodeRabbit's deep understanding of your code, so your team can collaborate across the entire SDLC without losing context.
Built for teams:
One agent for your entire SDLC. Right inside Slack. Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out. Comment |
There was a problem hiding this comment.
Caution
Some comments are outside the diff and can’t be posted inline due to platform limitations.
⚠️ Outside diff range comments (1)
src/handlers/session.ts (1)
148-164:⚠️ Potential issue | 🟠 Major | ⚡ Quick winRegression:
sessionTotalsdeleted before reading for error span'sAGENT_NAMEattribute.Lines 149–150 delete
ctx.sessionTotalsandctx.sessionDiffTotalsforrawID, but line 157 then attemptsctx.sessionTotals.get(rawID)to populate theAGENT_NAMEattribute on the closing session span. After the delete,totalsis alwaysundefined, soAGENT_NAMEis not set on error-terminated session spans. This is an unintended side effect of mirroringsessionDiffTotalscleanup in the lines_of_code tracking PR.Capture
sessionTotalsbefore deletion:🔧 Proposed fix
const error = errorSummary(e.properties.error) + const totals = rawID ? ctx.sessionTotals.get(rawID) : undefined if (rawID) { ctx.sessionTotals.delete(rawID) ctx.sessionDiffTotals.delete(rawID) } sweepSession(sessionID, ctx) if (rawID) { const sessionSpan = ctx.sessionSpans.get(rawID) if (sessionSpan) { - const totals = ctx.sessionTotals.get(rawID) if (totals) sessionSpan.setAttribute(AGENT_NAME, totals.agent)🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@src/handlers/session.ts` around lines 148 - 164, The code deletes ctx.sessionTotals (and ctx.sessionDiffTotals) before reading totals for the error span, so totals is always undefined; to fix, read/capture const totals = ctx.sessionTotals.get(rawID) (and any needed diff totals) before calling ctx.sessionTotals.delete(rawID) / ctx.sessionDiffTotals.delete(rawID) or move those delete calls to after the sessionSpan handling; update the block around sweepSession, sessionSpan, and AGENT_NAME so sessionSpan.setAttribute(AGENT_NAME, totals.agent) uses the captured totals value rather than reading after deletion.
🧹 Nitpick comments (3)
src/otel.ts (1)
148-155: 💤 Low valueCounter description vs revert semantics — minor wording nit.
linesCounter's description says "sum over a session yields the net cumulative total". Strictly speaking, after a revert (e.g. cumulative goes 5 → 0 → 3), the handler skips the negative delta and emits another +3, so the counter sum (5 + 3 = 8) overstates the net (3). The gauge captures the true post-revert value. Consider tightening the wording to e.g. "sum yields the gross additions/removals across non-revert deltas; uselines_of_code.totalfor the live cumulative value", to set correct expectations for dashboards.🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@src/otel.ts` around lines 148 - 155, The counter description for linesCounter is inaccurate about semantics after reverts; update the description passed to meter.createCounter (symbol: linesCounter) to clarify that its sum reflects gross additions/removals across non-revert deltas and may overstate net after reverts, and mention that linesTotalGauge (symbol: linesTotalGauge) should be used for the live cumulative value; keep the change to the description strings only in the meter.createCounter and meter.createGauge calls.README.md (1)
34-35: ⚡ Quick winDocument the revert limitation alongside the new semantics.
The PR objective explicitly calls out documenting "that rewrites reverted within a single message are not captured by session.diff". Two callouts would help users avoid surprises:
- The counter description ("summing the counter yields the true cumulative total without double-counting") is slightly optimistic — when opencode reports a revert (cumulative drops), the negative delta is skipped, so the counter sum can overstate the net. Recommend pointing readers at
lines_of_code.totalfor the authoritative live total.- Note the within-message revert limitation explicitly so dashboard users understand the floor on accuracy.
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@README.md` around lines 34 - 35, Update the metric docs for opencode.lines_of_code.count and opencode.lines_of_code.total to call out the revert limitation: change the counter description to note that when opencode reports a cumulative drop the negative delta is skipped (so summing opencode.lines_of_code.count can overstate net) and direct readers to opencode.lines_of_code.total as the authoritative live total; add an explicit sentence stating that rewrites reverted within a single session.diff message are not captured and therefore per-message reverts can cause transient inaccuracy in the counter.src/handlers/activity.ts (1)
25-28: 💤 Low valueConsider
setBoundedMapforsessionDiffTotalsto match thesessionTotalspattern.
sessionTotalsis populated viasetBoundedMap(seehandleSessionCreated) to bound memory if cleanup ever lags.sessionDiffTotals.sethere is the only writer and shares the same lifecycle, so applyingsetBoundedMapkeeps the codebase consistent and provides the same defensive bound. The coding guideline explicitly mandatessetBoundedMapforpendingToolSpans/pendingPermissions, but the project pattern has been extended to per-session totals — worth aligning here too.♻️ Proposed change
-import { isMetricEnabled } from "../util.ts" +import { isMetricEnabled, setBoundedMap } from "../util.ts" @@ - ctx.sessionDiffTotals.set(sessionID, { additions: totalAdded, deletions: totalRemoved }) + setBoundedMap(ctx.sessionDiffTotals, sessionID, { additions: totalAdded, deletions: totalRemoved })🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@src/handlers/activity.ts` around lines 25 - 28, The sessionDiffTotals map should be managed with the same bounded helper used for sessionTotals to prevent unbounded growth; replace direct writes to sessionDiffTotals (the ctx.sessionDiffTotals.set call in the code handling sessionID and delta calculation) by initializing sessionDiffTotals via setBoundedMap where sessionTotals is initialized (see handleSessionCreated) and then use the bounded map instance for updates instead of raw set, ensuring the existing delta computation (prev = ctx.sessionDiffTotals.get(...), deltaAdded/deltaRemoved) remains unchanged but the underlying map is created through setBoundedMap so writes are subject to the same retention bounds.
🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
Outside diff comments:
In `@src/handlers/session.ts`:
- Around line 148-164: The code deletes ctx.sessionTotals (and
ctx.sessionDiffTotals) before reading totals for the error span, so totals is
always undefined; to fix, read/capture const totals =
ctx.sessionTotals.get(rawID) (and any needed diff totals) before calling
ctx.sessionTotals.delete(rawID) / ctx.sessionDiffTotals.delete(rawID) or move
those delete calls to after the sessionSpan handling; update the block around
sweepSession, sessionSpan, and AGENT_NAME so
sessionSpan.setAttribute(AGENT_NAME, totals.agent) uses the captured totals
value rather than reading after deletion.
---
Nitpick comments:
In `@README.md`:
- Around line 34-35: Update the metric docs for opencode.lines_of_code.count and
opencode.lines_of_code.total to call out the revert limitation: change the
counter description to note that when opencode reports a cumulative drop the
negative delta is skipped (so summing opencode.lines_of_code.count can overstate
net) and direct readers to opencode.lines_of_code.total as the authoritative
live total; add an explicit sentence stating that rewrites reverted within a
single session.diff message are not captured and therefore per-message reverts
can cause transient inaccuracy in the counter.
In `@src/handlers/activity.ts`:
- Around line 25-28: The sessionDiffTotals map should be managed with the same
bounded helper used for sessionTotals to prevent unbounded growth; replace
direct writes to sessionDiffTotals (the ctx.sessionDiffTotals.set call in the
code handling sessionID and delta calculation) by initializing sessionDiffTotals
via setBoundedMap where sessionTotals is initialized (see handleSessionCreated)
and then use the bounded map instance for updates instead of raw set, ensuring
the existing delta computation (prev = ctx.sessionDiffTotals.get(...),
deltaAdded/deltaRemoved) remains unchanged but the underlying map is created
through setBoundedMap so writes are subject to the same retention bounds.
In `@src/otel.ts`:
- Around line 148-155: The counter description for linesCounter is inaccurate
about semantics after reverts; update the description passed to
meter.createCounter (symbol: linesCounter) to clarify that its sum reflects
gross additions/removals across non-revert deltas and may overstate net after
reverts, and mention that linesTotalGauge (symbol: linesTotalGauge) should be
used for the live cumulative value; keep the change to the description strings
only in the meter.createCounter and meter.createGauge calls.
ℹ️ Review info
⚙️ Run configuration
Configuration used: Path: .coderabbit.yaml
Review profile: CHILL
Plan: Pro
Run ID: 1c00440b-7793-45d9-9362-e7920b81133c
📒 Files selected for processing (8)
README.mdsrc/handlers/activity.tssrc/handlers/session.tssrc/index.tssrc/otel.tssrc/types.tstests/handlers/activity.test.tstests/helpers.ts
- handleSessionError: read sessionTotals before deleting, so the AGENT_NAME attribute on the error span is actually populated (pre-existing bug the reviewer surfaced in this diff's adjacent block). - handleSessionDiff: write sessionDiffTotals through setBoundedMap, matching the repo convention for sessionTotals / sessionSpans / pendingToolSpans (MAX_PENDING = 500 cap against unbounded growth). - otel.ts / README.md: spell out the counter's "sums gross additions across non-revert deltas" semantics and direct readers to lines_of_code.total as the authoritative live cumulative; note that intra-message reverts are not captured.
|
I think there are two separate issues here. First,
This records So the repeated double-counting bug is fixed, but the metric is still not a "true net total emitted as deltas." A monotonic Second, Suggested code change for the bounded-map part: const nextTotals = { additions: totalAdded, deletions: totalRemoved }
if (ctx.sessionDiffTotals.has(sessionID)) {
ctx.sessionDiffTotals.set(sessionID, nextTotals)
} else {
setBoundedMap(ctx.sessionDiffTotals, sessionID, nextTotals)
}Before suggesting a concrete fix for the metric semantics, I want to check the intended behavior I’m assuming:
If it’s the first, I think the counter should emit rollback magnitudes as separate positive samples instead of dropping negative deltas. If it’s the second, the docs/comments/PR text should probably say that explicitly, because "true net total" is not what the current counter reports. I’d also suggest adding a regression test for a partial revert / mixed add-remove transition like |
- activity.ts: guard sessionDiffTotals writes so setBoundedMap only fires on insert; updates to an existing key now use plain .set(). Prevents evicting an unrelated active session when the map is at MAX_PENDING, which would cause that session's next session.diff to be treated as first-seen and re-emit its full cumulative diff as a delta — reintroducing the double-count bug for the evicted session. - docs: stop calling the counter "net". It is gross positive churn — negative deltas (additions/deletions shrinking, e.g. cumulative +10/-0 -> +5/-5) are dropped, so sums do not reconcile to net after any revert, full or partial. lines_of_code.total gauge is the authoritative live cumulative. Counter description, gauge description, handler docstring, and README all updated. - tests: add regression for the partial-revert case (+10/-0 -> +5/-5) that exercises the semantics gap the maintainer flagged — asserts counter emits gross (added=10, removed=5) and gauge tracks live cumulative (added=[10,5], removed=[0,5]).
|
Good catches on both, thanks for the careful read. Pushed 7f0802a addressing both points. On semantics — going with Option 2 (gross churn + gauge as authority). You're right that the PR description and docs were overclaiming. My original intent here was always "two separate signals: a delta counter and a live gauge" — the gauge was meant to be the authoritative answer to "what does this session currently look like," and the counter was a rate-of-churn feed that dashboards could sum or I considered your Option 1 (emit rollback magnitudes as separate positive samples with a distinct attribute so net is reconstructible from the counter alone) and decided against it for this PR — it's a materially different metric shape, adds a second attribute dimension that doubles cardinality on the counter, and the use case is already covered by the gauge. If there's demand for net-reconstructible-from-counter later I'd rather do that as a follow-up with its own design discussion. On Tests — added the partial-revert regression (
|
|
Hello! Thanks for having written and published this plugin, first I’ll try to make it short for all of our sanities: what’s the plan for the PR? I just encountered the same issue (#43), and really all I want is something that matches claude_code semantics so that the data is comparable across AI harnesses. So if re-focusing the PR (on just the cumulative vs. delta recording) will help merging faster and have it in 0.10.0, I’ll gladly make a branch that has only this fix! I’m not really blocked since I can just fork the plugin and use it locally by symlinking files, but it’d be cool to have the fix upstream too! Cheers, |
|
The latest commit does address the two implementation issues I called out:
One thing still looks out of sync, though: the PR description still says Related to I think it would help to clarify which of these is the intended merge target:
At minimum, I’d update the PR body before merge so it matches the current implementation and doesn’t keep claiming “net total” for |
|
Just updated the PR description to close out the open questions from review. @dialupdisaster: the description now leads with the merge target explicitly: option (2), gross-positive-churn counter + authoritative gauge. Option (1) is named and discarded up front, so it's no longer buried in the body. The "net total" framing is gone throughout. The two implementation fixes (read-before-delete on @gagbo: no need to spin a slimmer branch. The PR is in good shape and there's no intent to scope it down further. On the claude_code parity question: this PR intentionally does not match |
|
After doing some testing, this is looking good. Thanks for the contribution! |
Closes #43.
Merge target: of the two designs raised in review, this PR lands option (2) — opencode-specific semantics where `.count` is gross positive churn and the new `.total` gauge is the live cumulative. Option (1), reshaping `.count` itself to mean net, is intentionally not pursued. Rationale below.
Summary
Why two metrics with different semantics
opencode publishes `session.diff` events carrying the cumulative diff of the whole session (first snapshot → latest), not a per-event delta. The previous handler added `fileDiff.additions`/`deletions` to the counter on every event, so an N-message session would count the first N−1 cumulative totals redundantly on top of the real total.
Concrete repro (full detail in #43):
Naively switching the counter to "emit deltas" only fixes the additive-cumulative bug. It does not make the counter reconcile to net after a revert: a monotonic `Counter` cannot represent decreases, and emitting the rollback magnitude as a separate positive sample would double cardinality and require dashboard math to reconstruct net (h/t @dialupdisaster for surfacing this).
So this PR splits the two questions into two metrics:
This is an intentional divergence from a hypothetical "single counter that means net" — that single-metric model isn't expressible with monotonic counters, and we'd rather give dashboards two clear signals than one ambiguous one.
On `claude_code.lines_of_code.count` parity
@gagbo asked about cross-harness comparability with `claude_code.lines_of_code.count`. To be direct about the choice this PR makes: we are intentionally not matching `claude_code.lines_of_code.count` semantics. opencode's `.count` here means gross positive churn — a different contract. For cross-harness comparison, use `lines_of_code.total` (the gauge), which is a live cumulative and the more portable comparison point. If a future PR wants to add a separate metric explicitly modeled on Claude Code's contract, that's a clean follow-up; this PR does not try to overload `.count` to serve both.
Impact on existing dashboards
Silent behaviour change, not a silent API change. The metric's name, type, unit, and attributes are unchanged, so collectors keep scraping and dashboards keep rendering — the numbers just change.
Recommended action before upgrading:
Notes on the new gauge
Known limitation (carried over from #43)
Even with this fix, both metrics only see what opencode publishes via `session.diff` — opencode recomputes the summary against the session baseline, not step-by-step, so rewrites that happen and get reverted within a single message don't generate events and aren't visible to either metric. Capturing intra-message churn would require hooking `tool.execute.after` and is out of scope here.
Other changes worth noting
Tests
Added cases to `tests/handlers/activity.test.ts` covering:
Also fixed a `setBoundedMap` eviction bug in an earlier draft (calling it on every `.set` could evict an unrelated active session, and that session's next `session.diff` would be treated as first-seen — reintroducing the cumulative double-count). Updates to existing keys now go through plain `.set()`; `setBoundedMap` only fires on insert. Thanks @dialupdisaster.
Local verification: