Skip to content

Expire stale refs in ExpireSnapshots#3246

Open
alessandro-nori wants to merge 1 commit intoapache:mainfrom
alessandro-nori:anori/expire_refs
Open

Expire stale refs in ExpireSnapshots#3246
alessandro-nori wants to merge 1 commit intoapache:mainfrom
alessandro-nori:anori/expire_refs

Conversation

@alessandro-nori
Copy link
Copy Markdown
Contributor

@alessandro-nori alessandro-nori commented Apr 16, 2026

Changes

  • New remove_expired_refs(default_max_ref_age_ms) method on ExpireSnapshots: removes branches/tags whose snapshot age exceeds their max_ref_age_ms (per-ref setting, with fallback to the method parameter); main is never removed
  • older_than() is now lazy — threshold is evaluated in _commit() so that call order with remove_expired_refs() does not affect the result
  • Updated _get_protected_snapshot_ids() to exclude refs marked for removal, so their orphaned snapshots become eligible for age-based expiry

Motivation

Mirrors Java's RemoveSnapshots.computeRetainedRefs() behavior. Previously, ExpireSnapshots never removed stale refs even when max-ref-age-ms was configured on a branch or tag.


This PR was AI-assisted.

@alessandro-nori alessandro-nori force-pushed the anori/expire_refs branch 3 times, most recently from 73e24a0 to 8373d34 Compare April 16, 2026 08:37
MIN_SNAPSHOTS_TO_KEEP = "history.expire.min-snapshots-to-keep"
MIN_SNAPSHOTS_TO_KEEP_DEFAULT = 1

MAX_REF_AGE_MS = "history.expire.max-ref-age-ms"
Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

this is not used in this PR but I thought it could be useful for callers of ExpireSnapshot to get the default value for max-ref-age-ms from table properties (like in the Java implementation)

for snapshot in self._transaction.table_metadata.snapshots:
if snapshot.timestamp_ms < expire_from and snapshot.snapshot_id not in protected_ids:
self._snapshot_ids_to_expire.add(snapshot.snapshot_id)
self._expire_older_than_ms = datetime_to_millis(dt)
Copy link
Copy Markdown
Contributor Author

@alessandro-nori alessandro-nori Apr 16, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

moved most of the logic to _commit() to avoid order-dependence between older_than() and
remove_expired_refs() calls

table_v2.catalog.commit_table.return_value = mock_response

# Remove any refs that protect the snapshots to be expired
table_v2.metadata = table_v2.metadata.model_copy(
Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

removed because it duplicates old line 205?

@alessandro-nori alessandro-nori force-pushed the anori/expire_refs branch 5 times, most recently from b45c6fe to dcfdd04 Compare April 16, 2026 08:59
@alessandro-nori alessandro-nori marked this pull request as ready for review April 16, 2026 12:19
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

1 participant