Skip to content

fix(handlers)!: emit lines_of_code.count as session delta, add .total gauge#44

Merged
dialupdisaster merged 3 commits into
DEVtheOPS:mainfrom
Jeevan1351:fix-lines-of-code-net-count
May 18, 2026
Merged

fix(handlers)!: emit lines_of_code.count as session delta, add .total gauge#44
dialupdisaster merged 3 commits into
DEVtheOPS:mainfrom
Jeevan1351:fix-lines-of-code-net-count

Conversation

@Jeevan1351
Copy link
Copy Markdown
Contributor

@Jeevan1351 Jeevan1351 commented May 6, 2026

Closes #43.

Merge target: of the two designs raised in review, this PR lands option (2) — opencode-specific semantics where `.count` is gross positive churn and the new `.total` gauge is the live cumulative. Option (1), reshaping `.count` itself to mean net, is intentionally not pursued. Rationale below.

Summary

  • Breaking: `opencode.lines_of_code.count` is redefined from "(broken) attempt at cumulative net" to gross positive churn across forward transitions. Existing dashboards summing `.count` will return different numbers — see Impact on existing dashboards below.
  • Fixes the over-counting bug behind that redefinition: the counter now emits only the positive delta since the previous `session.diff` for a given session, instead of re-adding the cumulative total on every event. Negative deltas (opencode reporting a smaller cumulative than last seen — full or partial reverts) are dropped.
  • Adds `opencode.lines_of_code.total` (Gauge), which mirrors opencode's current cumulative `additions`/`deletions` on every `session.diff`. This is the authoritative live view of the session's current state and the metric to query for "what does this session amount to right now."

Why two metrics with different semantics

opencode publishes `session.diff` events carrying the cumulative diff of the whole session (first snapshot → latest), not a per-event delta. The previous handler added `fileDiff.additions`/`deletions` to the counter on every event, so an N-message session would count the first N−1 cumulative totals redundantly on top of the real total.

Concrete repro (full detail in #43):

Event `additions` in cum. diff old counter `.add()` old counter total Real session total
msg 1 1 +1 1 1
msg 2 1 +1 2 1
msg 3 1 +1 3 1
msg 4 0 (revert) +0 3 0
msg 5 1 +1 4 1
msg 6 1 +1 5 1

Naively switching the counter to "emit deltas" only fixes the additive-cumulative bug. It does not make the counter reconcile to net after a revert: a monotonic `Counter` cannot represent decreases, and emitting the rollback magnitude as a separate positive sample would double cardinality and require dashboard math to reconstruct net (h/t @dialupdisaster for surfacing this).

So this PR splits the two questions into two metrics:

  • `lines_of_code.count` (Counter) — gross positive churn across forward transitions. Sums and `rate()`s answer "how much line-changing work has this session produced over time." Negative deltas are dropped, so `sum()` does not reconcile to net after any revert (full or partial).
  • `lines_of_code.total` (Gauge) — authoritative live cumulative. Refreshed on every `session.diff` to opencode's current cumulative value. Tracks partial and full reverts faithfully. Drops to `0` if opencode reports a revert to baseline. Query this for "what does this session currently amount to."

This is an intentional divergence from a hypothetical "single counter that means net" — that single-metric model isn't expressible with monotonic counters, and we'd rather give dashboards two clear signals than one ambiguous one.

On `claude_code.lines_of_code.count` parity

@gagbo asked about cross-harness comparability with `claude_code.lines_of_code.count`. To be direct about the choice this PR makes: we are intentionally not matching `claude_code.lines_of_code.count` semantics. opencode's `.count` here means gross positive churn — a different contract. For cross-harness comparison, use `lines_of_code.total` (the gauge), which is a live cumulative and the more portable comparison point. If a future PR wants to add a separate metric explicitly modeled on Claude Code's contract, that's a clean follow-up; this PR does not try to overload `.count` to serve both.

Impact on existing dashboards

Silent behaviour change, not a silent API change. The metric's name, type, unit, and attributes are unchanged, so collectors keep scraping and dashboards keep rendering — the numbers just change.

Query Before After
`sum(opencode_lines_of_code_count_total)` inflated (scaled with msg count) gross positive churn (correct, but not session net after reverts)
`rate(opencode_lines_of_code_count_total[5m])` inflated rate of gross positive churn
`increase(opencode_lines_of_code_count_total[1h])` inflated gross positive churn over the window
"current session size" not directly available use `opencode_lines_of_code_total` (new gauge)
Alerts tuned on the old `.count` series may fire when they shouldn't may go quiet — re-tune

Recommended action before upgrading:

  1. Decide what each dashboard panel is asking. If it's "rate of line-changing activity over time," keep `.count`. If it's "what is the current state of this session," switch to `.total`.
  2. Re-tune alerts on `.count` — expect numbers roughly 1/N where N is the average messages per session, and expect them not to reflect reverts.
  3. There is no clean way to reconcile pre/post-upgrade samples within a single time series for `.count`. Consider dropping pre-upgrade data or recording-rule-renaming the new series if you need a hard cut.

Notes on the new gauge

  • Emits both `type=added` and `type=removed` on every `session.diff`, even when values are 0. That's intentional: a gauge needs to reflect "reverted to baseline" explicitly.
  • Because it can drop to 0 mid-session, panels that track "cumulative work done" may see what looks like an outage. If that matters, prefer `sum by (session.id) (opencode_lines_of_code_count_total)` for monotonic "work done."
  • Opt out via the existing `OPENCODE_DISABLE_METRICS` env var: `OPENCODE_DISABLE_METRICS=lines_of_code.total`.

Known limitation (carried over from #43)

Even with this fix, both metrics only see what opencode publishes via `session.diff` — opencode recomputes the summary against the session baseline, not step-by-step, so rewrites that happen and get reverted within a single message don't generate events and aren't visible to either metric. Capturing intra-message churn would require hooking `tool.execute.after` and is out of scope here.

Other changes worth noting

  • Drive-by fix in `handleSessionError`. Captured `sessionTotals.get(rawID)` before deleting it, so the `AGENT_NAME` attribute is correctly populated on error-terminated session spans. Pre-existing latent bug on `main` (`.get()` ran after `.delete()` and always returned `undefined`); surfaced when the new `sessionDiffTotals.delete` was added in the same block. Caught by CodeRabbit during review — thanks.

Tests

Added cases to `tests/handlers/activity.test.ts` covering:

  • Cumulative `4 → 9 → 9 → 11` produces three counter samples (`4`, `5`, `2`, summing to `11`); the `9 → 9` event yields a zero delta and is dropped, so the counter does not record `4 + 9 + 9 + 11 = 33`.
  • Negative deltas (full revert-to-baseline) skip the counter.
  • Partial revert / mixed transition `+10/-0 → +5/-5` — counter ends at `added=10, removed=5` (gross), gauge tracks `added=[10, 5]`, `removed=[0, 5]` (live cumulative). Pins down the semantics gap @dialupdisaster called out.
  • Gauge records cumulative totals on every event, including `0` after revert.
  • Per-session isolation — one session's deltas don't affect another's tracking.
  • `tests/helpers.ts` updated: new `SpyGauge` instrument spy, `linesTotalGauge` wired into `Instruments`, `sessionDiffTotals: new Map()` added to the `HandlerContext` fixture.

Also fixed a `setBoundedMap` eviction bug in an earlier draft (calling it on every `.set` could evict an unrelated active session, and that session's next `session.diff` would be treated as first-seen — reintroducing the cumulative double-count). Updates to existing keys now go through plain `.set()`; `setBoundedMap` only fires on insert. Thanks @dialupdisaster.

Local verification:

  • `bun run lint` — clean
  • `bun run typecheck` — clean
  • `bun test` — 244 pass / 1 fail. The one failure is `probeEndpoint > returns error for malformed URL (no scheme)` in `tests/probe.test.ts`, unrelated to this PR — reproduced on a clean clone of `main` at `a2ac3e5` (no files in `src/probe.ts` are touched here).

… gauge

opencode publishes session.diff with the cumulative session diff (first
snapshot -> latest), not a per-event delta. The previous handler added
fileDiff.additions/deletions on every event, so the counter accumulated
cumulative totals and over-reported lines on any multi-message session.

The handler now tracks the last-seen cumulative per session and emits
only the positive delta to opencode.lines_of_code.count. Summing the
counter across a session yields the true session net (no double
counting). Negative deltas (revert-to-baseline) are skipped.

A new opencode.lines_of_code.total gauge records the current cumulative
total for the session, refreshed on every session.diff and dropping
back to 0 if opencode reports a revert. Use it for "what has this
session produced so far" panels; use the counter for "rate of lines
added over time".

Cleanup of the new sessionDiffTotals map is wired into session.idle
and session.error, mirroring the existing sessionTotals lifecycle.

Closes DEVtheOPS#43

BREAKING CHANGE: opencode.lines_of_code.count semantics have changed.
Dashboards that sum() the counter previously saw inflated numbers; they
will now see the correct net session totals. Existing queries do not
need to change, but the numeric results will be smaller (and correct).
@coderabbitai
Copy link
Copy Markdown

coderabbitai Bot commented May 6, 2026

📝 Walkthrough

Walkthrough

The PR fixes metric overcounting by tracking per-session cumulative line diffs and emitting only the delta to the counter, while adding a new gauge for current cumulative totals; session diff state is wired into context and cleaned on idle/error.

Changes

Session Diff Delta Tracking

Layer / File(s) Summary
Type Definitions
src/types.ts
Instruments extended with linesTotalGauge: Gauge; HandlerContext extended with sessionDiffTotals: Map<string, { additions: number; deletions: number }> (imports updated).
Instrumentation Setup
src/otel.ts
New lines_of_code.total gauge added; lines_of_code.count counter description updated to document gross-positive-delta semantics.
Context Initialization
src/index.ts
sessionDiffTotals Map instantiated and wired into the plugin HandlerContext.
Activity Handler (imports & docs)
src/handlers/activity.ts
Imports updated (setBoundedMap), docblock added describing new delta/gauge semantics.
Core Delta Computation
src/handlers/activity.ts
handleSessionDiff computes cumulative totals from event diffs, calculates deltas against prior snapshot in sessionDiffTotals, emits positive deltas to linesCounter, records cumulative totals to linesTotalGauge, and updates sessionDiffTotals using setBoundedMap when first-seen.
State Cleanup
src/handlers/session.ts
sessionDiffTotals entry deleted when session becomes idle or encounters an error, alongside sessionTotals cleanup.
Tests & Helpers
tests/helpers.ts, tests/handlers/activity.test.ts
Test helpers add SpyGauge and makeGauge(); mocks include linesTotalGauge and sessionDiffTotals; new tests cover repeated diffs, revert behavior, cumulative gauge semantics, and per-session isolation.
Documentation
README.md
Metrics appendix updated: new opencode.lines_of_code.total entry, revised opencode.lines_of_code.count description, and an example to disable the new gauge.

Sequence Diagram

sequenceDiagram
  participant Bus
  participant Handler
  participant SessionState
  participant Instruments
  Bus->>Handler: session.diff (cumulative)
  Handler->>SessionState: lookup prev totals
  Handler->>Handler: compute delta = current - prev
  Handler->>Instruments: linesCounter.add(delta > 0)
  Handler->>Instruments: linesTotalGauge.record(current)
  Handler->>SessionState: set updated totals
Loading

Estimated code review effort

🎯 3 (Moderate) | ⏱️ ~23 minutes

Possibly related PRs

🚥 Pre-merge checks | ✅ 4 | ❌ 1

❌ Failed checks (1 warning)

Check name Status Explanation Resolution
Docstring Coverage ⚠️ Warning Docstring coverage is 50.00% which is insufficient. The required threshold is 80.00%. Write docstrings for the functions missing them to satisfy the coverage threshold.
✅ Passed checks (4 passed)
Check name Status Explanation
Title check ✅ Passed The title accurately summarizes the main changes: fixing over-counting in lines_of_code.count by emitting session deltas and adding a new .total gauge metric.
Linked Issues check ✅ Passed All primary objectives from issue #43 are met: track per-session cumulative diff, emit only positive deltas to counter, add lines_of_code.total gauge, cleanup on session end/error, and provide breaking-change documentation.
Out of Scope Changes check ✅ Passed All code changes are directly scoped to issue #43 requirements: handler logic for delta tracking, type extensions for sessionDiffTotals and linesTotalGauge, documentation updates, and comprehensive test coverage for the new metrics behavior.
Description Check ✅ Passed Check skipped - CodeRabbit’s high-level summary is enabled.

✏️ Tip: You can configure your own custom pre-merge checks in the settings.

✨ Finishing Touches
🧪 Generate unit tests (beta)
  • Create PR with unit tests

Tip

💬 Introducing Slack Agent: The best way for teams to turn conversations into code.

Slack Agent is built on CodeRabbit's deep understanding of your code, so your team can collaborate across the entire SDLC without losing context.

  • Generate code and open pull requests
  • Plan features and break down work
  • Investigate incidents and troubleshoot customer tickets together
  • Automate recurring tasks and respond to alerts with triggers
  • Summarize progress and report instantly

Built for teams:

  • Shared memory across your entire org—no repeating context
  • Per-thread sandboxes to safely plan and execute work
  • Governance built-in—scoped access, auditability, and budget controls

One agent for your entire SDLC. Right inside Slack.

👉 Get started


Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out.

❤️ Share

Comment @coderabbitai help to get the list of available commands and usage tips.

Copy link
Copy Markdown

@coderabbitai coderabbitai Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Caution

Some comments are outside the diff and can’t be posted inline due to platform limitations.

⚠️ Outside diff range comments (1)
src/handlers/session.ts (1)

148-164: ⚠️ Potential issue | 🟠 Major | ⚡ Quick win

Regression: sessionTotals deleted before reading for error span's AGENT_NAME attribute.

Lines 149–150 delete ctx.sessionTotals and ctx.sessionDiffTotals for rawID, but line 157 then attempts ctx.sessionTotals.get(rawID) to populate the AGENT_NAME attribute on the closing session span. After the delete, totals is always undefined, so AGENT_NAME is not set on error-terminated session spans. This is an unintended side effect of mirroring sessionDiffTotals cleanup in the lines_of_code tracking PR.

Capture sessionTotals before deletion:

🔧 Proposed fix
   const error = errorSummary(e.properties.error)
+  const totals = rawID ? ctx.sessionTotals.get(rawID) : undefined
    if (rawID) {
      ctx.sessionTotals.delete(rawID)
      ctx.sessionDiffTotals.delete(rawID)
    }
    sweepSession(sessionID, ctx)

    if (rawID) {
      const sessionSpan = ctx.sessionSpans.get(rawID)
      if (sessionSpan) {
-       const totals = ctx.sessionTotals.get(rawID)
        if (totals) sessionSpan.setAttribute(AGENT_NAME, totals.agent)
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@src/handlers/session.ts` around lines 148 - 164, The code deletes
ctx.sessionTotals (and ctx.sessionDiffTotals) before reading totals for the
error span, so totals is always undefined; to fix, read/capture const totals =
ctx.sessionTotals.get(rawID) (and any needed diff totals) before calling
ctx.sessionTotals.delete(rawID) / ctx.sessionDiffTotals.delete(rawID) or move
those delete calls to after the sessionSpan handling; update the block around
sweepSession, sessionSpan, and AGENT_NAME so
sessionSpan.setAttribute(AGENT_NAME, totals.agent) uses the captured totals
value rather than reading after deletion.
🧹 Nitpick comments (3)
src/otel.ts (1)

148-155: 💤 Low value

Counter description vs revert semantics — minor wording nit.

linesCounter's description says "sum over a session yields the net cumulative total". Strictly speaking, after a revert (e.g. cumulative goes 5 → 0 → 3), the handler skips the negative delta and emits another +3, so the counter sum (5 + 3 = 8) overstates the net (3). The gauge captures the true post-revert value. Consider tightening the wording to e.g. "sum yields the gross additions/removals across non-revert deltas; use lines_of_code.total for the live cumulative value", to set correct expectations for dashboards.

🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@src/otel.ts` around lines 148 - 155, The counter description for linesCounter
is inaccurate about semantics after reverts; update the description passed to
meter.createCounter (symbol: linesCounter) to clarify that its sum reflects
gross additions/removals across non-revert deltas and may overstate net after
reverts, and mention that linesTotalGauge (symbol: linesTotalGauge) should be
used for the live cumulative value; keep the change to the description strings
only in the meter.createCounter and meter.createGauge calls.
README.md (1)

34-35: ⚡ Quick win

Document the revert limitation alongside the new semantics.

The PR objective explicitly calls out documenting "that rewrites reverted within a single message are not captured by session.diff". Two callouts would help users avoid surprises:

  1. The counter description ("summing the counter yields the true cumulative total without double-counting") is slightly optimistic — when opencode reports a revert (cumulative drops), the negative delta is skipped, so the counter sum can overstate the net. Recommend pointing readers at lines_of_code.total for the authoritative live total.
  2. Note the within-message revert limitation explicitly so dashboard users understand the floor on accuracy.
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@README.md` around lines 34 - 35, Update the metric docs for
opencode.lines_of_code.count and opencode.lines_of_code.total to call out the
revert limitation: change the counter description to note that when opencode
reports a cumulative drop the negative delta is skipped (so summing
opencode.lines_of_code.count can overstate net) and direct readers to
opencode.lines_of_code.total as the authoritative live total; add an explicit
sentence stating that rewrites reverted within a single session.diff message are
not captured and therefore per-message reverts can cause transient inaccuracy in
the counter.
src/handlers/activity.ts (1)

25-28: 💤 Low value

Consider setBoundedMap for sessionDiffTotals to match the sessionTotals pattern.

sessionTotals is populated via setBoundedMap (see handleSessionCreated) to bound memory if cleanup ever lags. sessionDiffTotals.set here is the only writer and shares the same lifecycle, so applying setBoundedMap keeps the codebase consistent and provides the same defensive bound. The coding guideline explicitly mandates setBoundedMap for pendingToolSpans/pendingPermissions, but the project pattern has been extended to per-session totals — worth aligning here too.

♻️ Proposed change
-import { isMetricEnabled } from "../util.ts"
+import { isMetricEnabled, setBoundedMap } from "../util.ts"
@@
-  ctx.sessionDiffTotals.set(sessionID, { additions: totalAdded, deletions: totalRemoved })
+  setBoundedMap(ctx.sessionDiffTotals, sessionID, { additions: totalAdded, deletions: totalRemoved })
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@src/handlers/activity.ts` around lines 25 - 28, The sessionDiffTotals map
should be managed with the same bounded helper used for sessionTotals to prevent
unbounded growth; replace direct writes to sessionDiffTotals (the
ctx.sessionDiffTotals.set call in the code handling sessionID and delta
calculation) by initializing sessionDiffTotals via setBoundedMap where
sessionTotals is initialized (see handleSessionCreated) and then use the bounded
map instance for updates instead of raw set, ensuring the existing delta
computation (prev = ctx.sessionDiffTotals.get(...), deltaAdded/deltaRemoved)
remains unchanged but the underlying map is created through setBoundedMap so
writes are subject to the same retention bounds.
🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

Outside diff comments:
In `@src/handlers/session.ts`:
- Around line 148-164: The code deletes ctx.sessionTotals (and
ctx.sessionDiffTotals) before reading totals for the error span, so totals is
always undefined; to fix, read/capture const totals =
ctx.sessionTotals.get(rawID) (and any needed diff totals) before calling
ctx.sessionTotals.delete(rawID) / ctx.sessionDiffTotals.delete(rawID) or move
those delete calls to after the sessionSpan handling; update the block around
sweepSession, sessionSpan, and AGENT_NAME so
sessionSpan.setAttribute(AGENT_NAME, totals.agent) uses the captured totals
value rather than reading after deletion.

---

Nitpick comments:
In `@README.md`:
- Around line 34-35: Update the metric docs for opencode.lines_of_code.count and
opencode.lines_of_code.total to call out the revert limitation: change the
counter description to note that when opencode reports a cumulative drop the
negative delta is skipped (so summing opencode.lines_of_code.count can overstate
net) and direct readers to opencode.lines_of_code.total as the authoritative
live total; add an explicit sentence stating that rewrites reverted within a
single session.diff message are not captured and therefore per-message reverts
can cause transient inaccuracy in the counter.

In `@src/handlers/activity.ts`:
- Around line 25-28: The sessionDiffTotals map should be managed with the same
bounded helper used for sessionTotals to prevent unbounded growth; replace
direct writes to sessionDiffTotals (the ctx.sessionDiffTotals.set call in the
code handling sessionID and delta calculation) by initializing sessionDiffTotals
via setBoundedMap where sessionTotals is initialized (see handleSessionCreated)
and then use the bounded map instance for updates instead of raw set, ensuring
the existing delta computation (prev = ctx.sessionDiffTotals.get(...),
deltaAdded/deltaRemoved) remains unchanged but the underlying map is created
through setBoundedMap so writes are subject to the same retention bounds.

In `@src/otel.ts`:
- Around line 148-155: The counter description for linesCounter is inaccurate
about semantics after reverts; update the description passed to
meter.createCounter (symbol: linesCounter) to clarify that its sum reflects
gross additions/removals across non-revert deltas and may overstate net after
reverts, and mention that linesTotalGauge (symbol: linesTotalGauge) should be
used for the live cumulative value; keep the change to the description strings
only in the meter.createCounter and meter.createGauge calls.

ℹ️ Review info
⚙️ Run configuration

Configuration used: Path: .coderabbit.yaml

Review profile: CHILL

Plan: Pro

Run ID: 1c00440b-7793-45d9-9362-e7920b81133c

📥 Commits

Reviewing files that changed from the base of the PR and between a2ac3e5 and 9eaefc7.

📒 Files selected for processing (8)
  • README.md
  • src/handlers/activity.ts
  • src/handlers/session.ts
  • src/index.ts
  • src/otel.ts
  • src/types.ts
  • tests/handlers/activity.test.ts
  • tests/helpers.ts

- handleSessionError: read sessionTotals before deleting, so the AGENT_NAME
  attribute on the error span is actually populated (pre-existing bug the
  reviewer surfaced in this diff's adjacent block).
- handleSessionDiff: write sessionDiffTotals through setBoundedMap, matching
  the repo convention for sessionTotals / sessionSpans / pendingToolSpans
  (MAX_PENDING = 500 cap against unbounded growth).
- otel.ts / README.md: spell out the counter's "sums gross additions across
  non-revert deltas" semantics and direct readers to lines_of_code.total as
  the authoritative live cumulative; note that intra-message reverts are
  not captured.
@dialupdisaster
Copy link
Copy Markdown
Contributor

I think there are two separate issues here.

First, lines_of_code.count still does not match the PR description in cases where the cumulative diff shrinks without going all the way back to zero. Right now negative deltas are dropped:

  • event 1: additions=10,deletions=0
  • event 2: additions=5,deletions=5

This records +10 added and +5 removed, while the current cumulative session diff is only +5/-5.

So the repeated double-counting bug is fixed, but the metric is still not a "true net total emitted as deltas." A monotonic Counter cannot represent decreases unless rollback magnitudes are emitted separately and then subtracted in queries. Otherwise the new lines_of_code.total gauge is the only authoritative current total.

Second, sessionDiffTotals is being updated with setBoundedMap(). Because that helper evicts whenever map.size >= MAX_PENDING, it can evict an unrelated active session even when we are only updating an existing key. After eviction, the next diff for that session is treated as first-seen again and its full cumulative total gets counted again.

Suggested code change for the bounded-map part:

const nextTotals = { additions: totalAdded, deletions: totalRemoved }
if (ctx.sessionDiffTotals.has(sessionID)) {
  ctx.sessionDiffTotals.set(sessionID, nextTotals)
} else {
  setBoundedMap(ctx.sessionDiffTotals, sessionID, nextTotals)
}

Before suggesting a concrete fix for the metric semantics, I want to check the intended behavior I’m assuming:

  1. Is the goal for lines_of_code.count to remain queryable as a monotonic series that can be reconstructed back to net with dashboard math?
  2. Or is the goal for lines_of_code.count to mean "gross positive churn only," with lines_of_code.total as the authoritative live cumulative value?

If it’s the first, I think the counter should emit rollback magnitudes as separate positive samples instead of dropping negative deltas. If it’s the second, the docs/comments/PR text should probably say that explicitly, because "true net total" is not what the current counter reports.

I’d also suggest adding a regression test for a partial revert / mixed add-remove transition like +10/-0 -> +5/-5, since that’s the case that currently exposes the semantics gap.

- activity.ts: guard sessionDiffTotals writes so setBoundedMap only fires on
  insert; updates to an existing key now use plain .set(). Prevents evicting an
  unrelated active session when the map is at MAX_PENDING, which would cause
  that session's next session.diff to be treated as first-seen and re-emit its
  full cumulative diff as a delta — reintroducing the double-count bug for the
  evicted session.
- docs: stop calling the counter "net". It is gross positive churn — negative
  deltas (additions/deletions shrinking, e.g. cumulative +10/-0 -> +5/-5) are
  dropped, so sums do not reconcile to net after any revert, full or partial.
  lines_of_code.total gauge is the authoritative live cumulative. Counter
  description, gauge description, handler docstring, and README all updated.
- tests: add regression for the partial-revert case (+10/-0 -> +5/-5) that
  exercises the semantics gap the maintainer flagged — asserts counter emits
  gross (added=10, removed=5) and gauge tracks live cumulative (added=[10,5],
  removed=[0,5]).
@Jeevan1351
Copy link
Copy Markdown
Contributor Author

Good catches on both, thanks for the careful read.

Pushed 7f0802a addressing both points.

On semantics — going with Option 2 (gross churn + gauge as authority).

You're right that the PR description and docs were overclaiming. My original intent here was always "two separate signals: a delta counter and a live gauge" — the gauge was meant to be the authoritative answer to "what does this session currently look like," and the counter was a rate-of-churn feed that dashboards could sum or rate() over. The "true net total" wording drifted in during the writeup and doesn't match the code. I've rewritten the counter description, the gauge description, the handler docstring, and the README metrics row to say this plainly: the counter is gross positive churn across forward transitions, negative deltas (including the partial +10/-0 → +5/-5 case you called out) are dropped, and it does not reconcile to net after any revert. lines_of_code.total is the single source of truth for live cumulative.

I considered your Option 1 (emit rollback magnitudes as separate positive samples with a distinct attribute so net is reconstructible from the counter alone) and decided against it for this PR — it's a materially different metric shape, adds a second attribute dimension that doubles cardinality on the counter, and the use case is already covered by the gauge. If there's demand for net-reconstructible-from-counter later I'd rather do that as a follow-up with its own design discussion.

On setBoundedMap — you're right, that's a real bug I introduced. Guarded the call so setBoundedMap only fires on insert and updates to an existing key go through plain .set(). Your snippet essentially verbatim.

Tests — added the partial-revert regression (+10/-0 → +5/-5) asserting the counter ends at added=10, removed=5 (gross) and the gauge tracks added=[10,5], removed=[0,5] (live cumulative), so the semantics gap is pinned down in tests.

bun run lint / typecheck clean, bun test passes 244/245 (the one failure is the unrelated probeEndpoint > returns error for malformed URL case — reproduces on clean main, not touched here).

@gagbo
Copy link
Copy Markdown

gagbo commented May 11, 2026

Hello!

Thanks for having written and published this plugin, first

I’ll try to make it short for all of our sanities: what’s the plan for the PR? I just encountered the same issue (#43), and really all I want is something that matches claude_code semantics so that the data is comparable across AI harnesses.

So if re-focusing the PR (on just the cumulative vs. delta recording) will help merging faster and have it in 0.10.0, I’ll gladly make a branch that has only this fix!

I’m not really blocked since I can just fork the plugin and use it locally by symlinking files, but it’d be cool to have the fix upstream too!

Cheers,
Gerry

@dialupdisaster
Copy link
Copy Markdown
Contributor

The latest commit does address the two implementation issues I called out:

  • sessionDiffTotals no longer evicts unrelated active sessions on update
  • the code/docs/tests now consistently define lines_of_code.count as gross positive churn and lines_of_code.total as the authoritative live cumulative value

One thing still looks out of sync, though: the PR description still says lines_of_code.count now yields the “true net total” / “correct net total”. That no longer matches the implementation or the new tests. The current code explicitly drops negative deltas, so .count does not reconcile back to net after partial or full reverts; only .total does.

Related to gagbo’s question: if the goal is specifically “match Claude Code semantics so data is comparable across AI harnesses,” I’m not yet convinced this PR gets all the way there. It definitely fixes the cumulative-overcounting bug, but it also makes opencode.lines_of_code.count mean “gross forward churn” and introduces a new .total gauge as the authoritative current-state metric. That may be a reasonable design, but it is not obviously the same semantic contract as Claude Code’s published claude_code.lines_of_code.count.

I think it would help to clarify which of these is the intended merge target:

  1. fix the overcounting bug while preserving the closest possible claude_code.lines_of_code.count compatibility story
  2. intentionally define a slightly different opencode-specific model (.count = churn, .total = live cumulative) and document that divergence clearly

At minimum, I’d update the PR body before merge so it matches the current implementation and doesn’t keep claiming “net total” for .count.

@Jeevan1351
Copy link
Copy Markdown
Contributor Author

Jeevan1351 commented May 16, 2026

Just updated the PR description to close out the open questions from review.

@dialupdisaster: the description now leads with the merge target explicitly: option (2), gross-positive-churn counter + authoritative gauge. Option (1) is named and discarded up front, so it's no longer buried in the body. The "net total" framing is gone throughout. The two implementation fixes (read-before-delete on handleSessionError, guarded setBoundedMap insert-only) are both called out. Should be a clean re-read now, happy to hear if anything still looks off.

@gagbo: no need to spin a slimmer branch. The PR is in good shape and there's no intent to scope it down further. On the claude_code parity question: this PR intentionally does not match claude_code.lines_of_code.count semantics, opencode's .count is gross positive churn, which is a different contract. For cross-harness comparison the right metric is lines_of_code.total (the new gauge), which is a live cumulative and the more portable comparison point. That's now stated directly in the description. If exact .count parity with Claude Code becomes a requirement, that's a clean follow-up with its own design discussion.

@dialupdisaster dialupdisaster merged commit 174dba2 into DEVtheOPS:main May 18, 2026
2 checks passed
@dialupdisaster
Copy link
Copy Markdown
Contributor

After doing some testing, this is looking good. Thanks for the contribution!

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

opencode.lines_of_code.count over-counts on multi-message sessions (adds cumulative totals instead of deltas)

3 participants