Memberlist cas error code false positive #7408

Merged
SungJin1212 merged 2 commits into cortexproject:master from SungJin1212:memberlist-cas-error-code-false-positive
Apr 10, 2026
Conversation

@SungJin1212
Member

@SungJin1212 SungJin1212 commented Apr 9, 2026

getCasErrorCode used a direct type assertion to detect IsOperationAborted(), which fails when the error is wrapped by memberlist's trySingleCas:

  fmt.Errorf("fn returned error: %w", err)          // 1st wrap
  fmt.Errorf("failed to CAS-update key %s: %w", ...) // 2nd wrap

As a result, ReplicasNotMatchError, which is a normal outcome of HA deduplication, was reported as status_code="500" in cortex_kv_request_duration_seconds when memberlist was used as the HA tracker KV store, triggering false-positive CortexKVStoreFailure alerts. This PR switches getCasErrorCode to errors.As so that the check also matches wrapped errors.

Note: consul and etcd are unaffected because they return the callback error directly without wrapping (return err), so the direct type assertion worked correctly for those backends.

Which issue(s) this PR fixes:
Fixes #

Checklist

  • Tests updated
  • Documentation added
  • CHANGELOG.md updated - the order of entries should be [CHANGE], [FEATURE], [ENHANCEMENT], [BUGFIX]

@SungJin1212 SungJin1212 force-pushed the memberlist-cas-error-code-false-positive branch from 963e067 to 52d8f76 April 9, 2026 05:27
Signed-off-by: SungJin1212 <tjdwls1201@gmail.com>
@SungJin1212 SungJin1212 force-pushed the memberlist-cas-error-code-false-positive branch from 52d8f76 to 1dea7bd April 9, 2026 05:32
Signed-off-by: SungJin1212 <tjdwls1201@gmail.com>
@SungJin1212 SungJin1212 force-pushed the memberlist-cas-error-code-false-positive branch from 53f5194 to ebae696 April 9, 2026 05:37
Member

@friedrichg friedrichg left a comment

Thanks!

@dosubot dosubot Bot added the lgtm label (This PR has been approved by a maintainer) Apr 9, 2026
@SungJin1212 SungJin1212 merged commit 6fac9f4 into cortexproject:master Apr 10, 2026
35 checks passed
friedrichg pushed a commit that referenced this pull request Apr 16, 2026
* use errors.As in getCasErrorCode to unwrap memberlist errors

Signed-off-by: SungJin1212 <tjdwls1201@gmail.com>

* fix test

Signed-off-by: SungJin1212 <tjdwls1201@gmail.com>

---------

Signed-off-by: SungJin1212 <tjdwls1201@gmail.com>
Signed-off-by: Friedrich Gonzalez <1517449+friedrichg@users.noreply.github.com>
friedrichg added a commit that referenced this pull request Apr 17, 2026
* Memberlist cas error code false positive (#7408)

* use errors.As in getCasErrorCode to unwrap memberlist errors

Signed-off-by: SungJin1212 <tjdwls1201@gmail.com>

* fix test

Signed-off-by: SungJin1212 <tjdwls1201@gmail.com>

---------

Signed-off-by: SungJin1212 <tjdwls1201@gmail.com>
Signed-off-by: Friedrich Gonzalez <1517449+friedrichg@users.noreply.github.com>

* Fix nil when ingesterQueryMaxAttempts > 1 (#7369)

* Trigger nil with test

Signed-off-by: Friedrich Gonzalez <1517449+friedrichg@users.noreply.github.com>

* Fix nil results

Signed-off-by: Friedrich Gonzalez <1517449+friedrichg@users.noreply.github.com>

* fix changelog

Signed-off-by: Friedrich Gonzalez <1517449+friedrichg@users.noreply.github.com>

---------

Signed-off-by: Friedrich Gonzalez <1517449+friedrichg@users.noreply.github.com>

* fix: alertmanager user config disappearing when ring is unreachable  (#7372)

* Fix multitenant alertmanager user config disappearing when ring is unreachable

Signed-off-by: Kishore K G <kishorekg@google.com>

* Add change log

Signed-off-by: Kishore K G <kishorekg@google.com>

* format multitenant

Signed-off-by: Kishore K G <kishorekg@google.com>

* fix pr number

Signed-off-by: Kishore K G <kishorekg@google.com>

* use ErrNotFound for error validation in unit test

Signed-off-by: kishorekg1999 <kishorekg.github@gmail.com>

---------

Signed-off-by: Kishore K G <kishorekg@google.com>
Signed-off-by: kishorekg1999 <kishorekg@google.com>
Signed-off-by: kishorekg1999 <kishorekg.github@gmail.com>
Signed-off-by: Friedrich Gonzalez <1517449+friedrichg@users.noreply.github.com>

* Clean Symbol Tables (#7373)

Signed-off-by: SungJin1212 <tjdwls1201@gmail.com>
Signed-off-by: Friedrich Gonzalez <1517449+friedrichg@users.noreply.github.com>

* Fix root cause of nil return in queryWithRetry and labelsWithRetry (#7375)

Signed-off-by: Friedrich Gonzalez <1517449+friedrichg@users.noreply.github.com>

* fix regex resolver match 0 or 1 tenant bug (#7424)

* fix regex resolver match 0 or 1 tenant bug

Signed-off-by: SungJin1212 <tjdwls1201@gmail.com>

* fix test

Signed-off-by: SungJin1212 <tjdwls1201@gmail.com>

---------

Signed-off-by: SungJin1212 <tjdwls1201@gmail.com>
Signed-off-by: Friedrich Gonzalez <1517449+friedrichg@users.noreply.github.com>

* skip nil values in Memberlist WatchPrefix (#7429)

* skip nil values in Memberlist WatchPrefix

Signed-off-by: SungJin1212 <tjdwls1201@gmail.com>

* fix lint

Signed-off-by: SungJin1212 <tjdwls1201@gmail.com>

---------

Signed-off-by: SungJin1212 <tjdwls1201@gmail.com>
Signed-off-by: Friedrich Gonzalez <1517449+friedrichg@users.noreply.github.com>

* Remove duplicate CHANGELOG entry for #7373

Signed-off-by: Friedrich Gonzalez <1517449+friedrichg@users.noreply.github.com>

* Fix integration test flag name for release-1.21

The cherry-pick of #7424 brought the master flag name
-limits.query-ingesters-within, but release-1.21 still uses
-querier.query-ingesters-within (renamed in #7160, master-only).

Signed-off-by: Friedrich Gonzalez <1517449+friedrichg@users.noreply.github.com>

---------

Signed-off-by: SungJin1212 <tjdwls1201@gmail.com>
Signed-off-by: Friedrich Gonzalez <1517449+friedrichg@users.noreply.github.com>
Signed-off-by: Kishore K G <kishorekg@google.com>
Signed-off-by: kishorekg1999 <kishorekg@google.com>
Signed-off-by: kishorekg1999 <kishorekg.github@gmail.com>
Co-authored-by: SungJin1212 <tjdwls1201@gmail.com>
Co-authored-by: kishorekg1999 <kishorekg.github@gmail.com>
Labels

component/memberlist, lgtm (This PR has been approved by a maintainer), size/M, type/bug

2 participants