Add opt-in unversioned fallback for versioned task dispatch#731
Add opt-in unversioned fallback for versioned task dispatch#731torosent wants to merge 15 commits into
Conversation
Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
The original test design had a ~100ms margin between the second item write and the silent-disconnect timer expiry (500ms timeout, 400ms inter-item delay, ~150ms thread scheduling). Under CI load, thread-pool scheduling jitter could push the second item write past the timer expiry, causing the consumer to return SilentDisconnect instead of GracefulDrain. Scale the timings up so both the with-reset and without-reset margins are ~500ms instead of ~100ms, and synchronize the test on the second item actually being dequeued before completing the channel. This removes the secondary race between writing the second item and the consumer dequeuing it. The test still verifies the same behavior: that ArmSilentDisconnectTimer is called per item so the original timer does not fire while items are still arriving. Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
The previous doc for 'Never' described when an unversioned registration IS used, but the enum name 'Never' reads as 'never used' — making the text confusing. Reframe both members in terms of fallback semantics (matching the enum name 'UnversionedFallbackMode') and make 'Never' explicitly say what's preserved (unversioned requests still served, fallback still applies when no versioned registration exists for the name). Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
There was a problem hiding this comment.
Pull request overview
Adds an opt-in worker setting to allow unversioned task registrations to serve as a catch-all for unmatched versioned orchestration/activity dispatch, enabling more practical long-running version migration patterns while preserving the existing closed-set behavior by default.
Changes:
- Introduces
DurableTaskWorkerOptions.UnversionedFallbackMode+VersioningOptions.UnversionedFallbackand propagates it viaUseVersioning(...). - Updates
DurableTaskFactorydispatch rules andDurableTaskWorkerWorkItemFiltersgeneration to support the fallback behavior (including backend filter widening only when applicable). - Adds warning logging and expands unit/integration tests plus a new end-to-end sample.
Reviewed changes
Copilot reviewed 19 out of 19 changed files in this pull request and generated no comments.
Show a summary per file
| File | Description |
|---|---|
| test/Worker/Grpc.Tests/WorkItemStreamConsumerTests.cs | Makes heartbeat reset test more robust against CI timing jitter. |
| test/Worker/Grpc.Tests/GrpcDurableTaskWorkerTests.cs | Adds a strict-versioning guard test ensuring mismatch rejects before factory dispatch (even with fallback enabled). |
| test/Worker/Core.Tests/DurableTaskFactoryVersioningTests.cs | Verifies orchestrator unknown-version dispatch falls back to unversioned when opt-in is enabled. |
| test/Worker/Core.Tests/DurableTaskFactoryActivityVersioningTests.cs | Verifies activity unknown-version dispatch falls back to unversioned when opt-in is enabled. |
| test/Worker/Core.Tests/DependencyInjection/UseWorkItemFiltersTests.cs | Validates filter wildcard emission for mixed registrations when fallback is enabled, and strict versioning behavior. |
| test/Worker/Core.Tests/DependencyInjection/DefaultDurableTaskWorkerBuilderTests.cs | Ensures enabling fallback emits a warning log during worker build. |
| src/Worker/Core/Logs.cs | Adds warning log (EventId 606) for unversioned fallback enablement and replay-safety caveat. |
| src/Worker/Core/DurableTaskWorkerWorkItemFilters.cs | Emits wildcard version lists for names that can accept unknown versions via unversioned fallback. |
| src/Worker/Core/DurableTaskWorkerOptions.cs | Adds UnversionedFallbackMode enum and VersioningOptions.UnversionedFallback with replay-compat guidance. |
| src/Worker/Core/DurableTaskRegistryExtensions.cs | Builds factories with awareness of the configured unversioned fallback mode. |
| src/Worker/Core/DurableTaskFactory.cs | Enables optional unversioned catch-all dispatch for unmatched versioned requests. |
| src/Worker/Core/DependencyInjection/DurableTaskWorkerBuilderExtensions.cs | Propagates UnversionedFallback through UseVersioning(...). |
| src/Worker/Core/DependencyInjection/DefaultDurableTaskWorkerBuilder.cs | Logs warning when fallback is enabled and passes worker options into factory construction. |
| src/Abstractions/TaskOptions.cs | Updates documentation to reflect opt-in unversioned fallback behavior. |
| samples/UnversionedFallbackSample/UnversionedFallbackSample.csproj | Adds a new runnable sample project demonstrating the feature. |
| samples/UnversionedFallbackSample/README.md | Documents how to run the sample and expected behavior/output. |
| samples/UnversionedFallbackSample/Program.cs | Implements end-to-end DTS emulator demonstration of exact-match vs fallback dispatch. |
| README.md | Adds the new sample to the repository’s versioning sample list. |
| Microsoft.DurableTask.sln | Adds the new sample project to the solution. |
Remove three pre-existing warnings on the two files this PR touches: - DurableTaskWorkerWorkItemFilters.cs: disambiguate the cref to UseWorkItemFilters by specifying the parameterless overload (was CS0419 ambiguous reference between the two extension method overloads). - DurableTaskWorkerOptions.cs: drop the extra blank line before ApplyTo (was SA1507 multiple-blank-lines). - DurableTaskWorkerOptions.cs: scope a #pragma warning disable CS0618 around the internal OrchestrationFilter forwarding in ApplyTo. The obsolete-annotated property still needs to be carried across options instances, but suppressing the warning is correct because ApplyTo is the experimental property's owner-side forwarding path. Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
|
Posted on my behalf after a review pass with GitHub Copilot CLI. The lens here is API ergonomics and customer confusion — not correctness. The implementation itself looks sound; these are concerns about how the surface will land with users in the wild. A few things I'd like the author and reviewers to consider before this ships, roughly in order of likely user impact: 1. The
|
The dts-emulator image binds to port 8080 by default — confirmed by samples/DistributedTracingSample/docker-compose.yml which runs the same image without ASPNETCORE_URLS. Drop the redundant '-e ASPNETCORE_URLS=http://+:8080' so this sample's instructions stay minimal. Addresses review feedback at PR #731#discussion_r3284373466. Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
…per dispatch Addresses review feedback on the unversioned-fallback API surface: 1. Rename UnversionedFallbackMode.WhenNoExactMatch -> CatchAll. The original name was technically accurate but did not capture the actual delta from the existing implicit fallback (which also fires 'when no exact match' for unversioned-only names). CatchAll matches the user mental model — the unversioned registration is used as the catch-all for unmatched versioned requests on mixed names. 2. Add a scenario table in <remarks> on UnversionedFallbackMode making the boundary cases between Never and CatchAll unmistakable (unversioned-only / mixed-with-match / mixed-without-match / versioned-only). 3. Split VersioningOptions.UnversionedFallback into OrchestratorUnversionedFallback and ActivityUnversionedFallback (both default Never). Replay risk is overwhelmingly orchestrator-side; the split lets users enable activity fallback (safer) without committing to orchestrator fallback (risky). Adding this split later would be a breaking change, so doing it before the feature ships. 4. Add per-dispatch Debug logs in DurableTaskFactory: EventId 608 for orchestrator dispatch to the unversioned registration, EventId 609 for activity dispatch. Pluming an optional ILoggerFactory through the new BuildFactory(workerOptions, loggerFactory) overload; existing overloads forward. 5. Split the existing EventId 606 warning into two: 606 for orchestrator fallback enabled, 607 for activity fallback enabled. Each fires only when the corresponding side is on; both fire when both are on. 6. Add <remarks> blocks on both new properties documenting interactions with MatchStrategy (Strict / CurrentOrOlder / None) and clarifying that the setting applies regardless of MatchStrategy (unlike FailureStrategy). Plumbing changes: - DurableTaskFactory ctor takes the two split flags + optional ILoggerFactory. - DurableTaskRegistryExtensions.BuildFactory gains a loggerFactory overload. - DefaultDurableTaskWorkerBuilder resolves ILoggerFactory and passes it to BuildFactory; emits one or both warnings depending on which flags are on. - DurableTaskWorkerWorkItemFilters reads each flag independently for its respective filter set (orchestrator/activity), so widening one side does not implicitly widen the other. - UseVersioning extension propagates both new fields. Sample updates: - samples/UnversionedFallbackSample enables both flags (it specifically demos orchestrator catch-all) but the README calls out activity-only as the safer starting point. Tests: - Renamed existing tests to use CatchAll + the new split properties. - Added orchestrator-fallback-only test (verifies activity dispatch stays closed-set for mixed names) and the symmetric activity-only test. - Added per-dispatch Debug log assertions for both orchestrator and activity fallback paths. - Refactored the CapturingLoggerFactory test helper out of DefaultDurableTaskWorkerBuilderTests into a shared file so it can be used by the factory tests too. - Added Build_WithActivityUnversionedFallback_LogsActivityWarning and Build_WithBothFallbacksEnabled_LogsBothWarnings to exercise the split warning paths. All Worker.Tests (144) and Worker.Grpc.Tests (136) pass locally. Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
1. DurableTaskWorkerWorkItemFilters.cs: the class doc said callers can pass 'either explicit filters or auto-generated filters' but the <see cref> only pointed at the parameterless overload. Reference both UseWorkItemFilters overloads so the cref matches the prose. 2. WorkItemStreamConsumerTests.cs (PerItem_HeartbeatReset_KeepsTimerAlive): the timing comment said '1000ms + 1500ms = 2500ms total before the 2nd item is written,' but the test also awaits firstItemProcessed between the two delays, which adds variable processing time. Clarify the comment to say 'at least 2500ms (plus the time to process the 1st item).' Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
TaskOptions lives in the client-side Abstractions layer (Worker references Abstractions, not the reverse), so referencing worker-side dispatch rules in this XML doc is both a layering smell and impossible to keep accurate as worker dispatch evolves (e.g. the new OrchestratorUnversionedFallback / ActivityUnversionedFallback options). The pre-existing wording even predates the unversioned-fallback feature — the leak is older than this PR and just got slightly worse when this PR added 'unless unversioned fallback is explicitly enabled on the worker.' Keep only the genuinely client-side scheduling semantics: - null = inherit from the scheduling orchestration instance (unchanged). - non-null = explicit override of the inherited version (unchanged). - Replace the dispatch-contract sentence with an honest 'the worker resolves the scheduled version according to its own versioning configuration; that resolution is not visible to the scheduling client.' Addresses review point #7 on PR #731 (client-side visibility / TaskOptions.Version leak). Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
The previous commit removed the misleading worker-side dispatch contract from TaskOptions.Version, but Chris's drift concern was broader: the same TaskOptions.Version value can dispatch to different registrations in different deployments depending on each worker's UnversionedFallback configuration, and there is no schedule-time signal. Add a fourth <para> framing this as a deployment-time policy decision and pointing users at the worker's startup and per-dispatch diagnostic logs (EventIds 606/607 startup, 608/609 per-dispatch — kept textual so the client-side doc doesn't reference worker-side constants) for verification. Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
There was a problem hiding this comment.
Pull request overview
Copilot reviewed 20 out of 20 changed files in this pull request and generated 1 comment.
Comments suppressed due to low confidence (1)
src/Worker/Core/DurableTaskRegistryExtensions.cs:12
- The PR description calls out
DurableTaskRegistryExtensions.BuildFactory(options, loggerFactory?)as part of the public API surface, but the containing typeDurableTaskRegistryExtensionsisinternal(no access modifier), so external consumers cannot call this overload. Either make the extension classpublic(if intended to be public API) or update the PR description/docs to reflect that this is internal-only plumbing.
/// <summary>
/// Extensions for <see cref="DurableTaskRegistry" />.
/// </summary>
static class DurableTaskRegistryExtensions
{
The previous doc claimed that with MatchStrategy = Strict, the fallback setting 'has no observable effect.' That's too strong: Strict only rejects instance versions that don't equal the worker's configured Version. For instances whose version passes the gate, factory dispatch still occurs, and CatchAll can still affect the outcome if the registry has no exact-version match for that (name, version) pair but does have an unversioned registration. Worked example showing the gap: - Worker version = 1.0, MatchStrategy = Strict, OrchestratorUnversionedFallback = CatchAll - Registry: [X v=2.0] + [X] (unversioned) - Instance arrives with version '1.0' - Strict gate passes (instance version == worker version) - Factory lookup '(X, 1.0)' misses; with CatchAll the unversioned [X] is used. The new wording matches the existing CurrentOrOlder bullet's framing: 'Fallback does not bypass the gate, but can still apply post-gate to instances that pass it but lack an exact-version registration.' Addresses Copilot review #4348162532 inline comment. Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
Addresses review point #8 on PR #731 (Never* asterisk). The original Never name was misleading because the long-standing implicit unversioned-only fallback continued to fire under that mode; Never only disabled the new opt-in catch-all behavior on mixed names. This commit makes the asterisk go away in two ways: 1. Rename Never -> Implicit. Implicit honestly describes the actual behavior: 'fall back implicitly when the name has no versioned siblings.' Numeric value 0 is preserved. An [Obsolete] alias keeps the old Never name source-compatible for anyone building against the preview branch. 2. Add StrictExactOnly = 2. This mode disables EVERY fallback path including the implicit one — versioned requests must have an exact (name, version) registration. Use case: callers want stale or bogus version values from upstream clients to fail loudly instead of silently landing on the unversioned implementation. Final dispatch matrix for versioned requests: | Registry shape | Implicit | CatchAll | StrictExactOnly | | -------------- | -------- | -------- | --------------- | | Only unversioned | unversioned | unversioned | not found | | Mixed, exact match | exact | exact | exact | | Mixed, no exact match | not found | unversioned | not found | | Only versioned, no exact match | not found | not found | not found | Unversioned requests are unaffected by the mode — they always dispatch to the unversioned registration when one exists (exact-match path with the empty-version key). Implementation: - DurableTaskFactory stores the full UnversionedFallbackMode per side (orchestrator/activity) instead of two bools. A small static helper ShouldUseUnversionedFallback(mode, versionedNames, requestedName) centralizes the dispatch decision. - DurableTaskRegistryExtensions defaults changed from Never -> Implicit. - DurableTaskWorkerWorkItemFilters.GetFilterVersions now takes the full mode. Under StrictExactOnly the filter emits the concrete registered version set instead of widening to a wildcard for unversioned-only names — otherwise the backend would deliver versioned work items the factory will reject after the fact. - DefaultDurableTaskWorkerBuilder warnings still gated on CatchAll only; StrictExactOnly is an opt-in tightening, not a risky relaxation. Known limitation (documented in the OrchestratorUnversionedFallback remarks and covered by WorkItemFilters_StrictMatchOverridesStrictExactOnly_KnownLimitation): when combined with MatchStrategy = Strict, the existing strict-override emits the worker's configured Version for every name regardless of whether the factory can resolve it. Under StrictExactOnly this can result in the backend delivering work items the worker will reject. A proper fix requires per-name dispatch-capability analysis and is out of scope here. The behavior is captured by the new test so it doesn't regress silently. Rubber-duck review consulted before commit; per-side independence test strengthened (asymmetric explicit modes + mixed activity registry) per that feedback. Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
Add explicit bullets describing Implicit / CatchAll / StrictExactOnly so readers seeing the sample understand the full mode space, not just the CatchAll one this sample exercises. Tighten the UseWorkItemFilters note to reflect the new per-mode filter widening (Implicit and CatchAll widen for the relevant shapes; StrictExactOnly emits the concrete version set). Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
| /// explicit filters or auto-generated filters from the <see cref="DurableTaskRegistry"/>. | ||
| /// <see cref="DurableTaskWorkerBuilderExtensions.UseWorkItemFilters(IDurableTaskWorkerBuilder)"/> for the | ||
| /// auto-generated filters from the worker's <see cref="DurableTaskRegistry"/>, or | ||
| /// <see cref="DurableTaskWorkerBuilderExtensions.UseWorkItemFilters(IDurableTaskWorkerBuilder, DurableTaskWorkerWorkItemFilters?)"/> |
| public enum UnversionedFallbackMode | ||
| { | ||
| /// <summary> | ||
| /// Preserve the long-standing implicit fallback: the unversioned registration serves versioned requests | ||
| /// only when the task name has no versioned siblings. Once a name has at least one versioned | ||
| /// registration, an unmatched versioned request returns "not found" rather than dispatching to the | ||
| /// unversioned registration. This is the default and matches behavior prior to per-task versioning. | ||
| /// </summary> | ||
| Implicit = 0, | ||
|
|
||
| /// <summary> | ||
| /// Use the unversioned registration as a catch-all when no exact versioned match exists, even when | ||
| /// the task name has versioned siblings. An exact versioned match still wins. Use only when the | ||
| /// unversioned implementation is replay-compatible with every version it may receive. | ||
| /// </summary> | ||
| CatchAll = 1, | ||
|
|
||
| /// <summary> | ||
| /// Require an exact <c>(name, version)</c> registration for every versioned request. Versioned | ||
| /// requests for names without an exact registration return "not found" even when an unversioned | ||
| /// registration for the same name exists. Use this mode when stale or bogus version values from | ||
| /// upstream clients should fail loudly instead of landing on the unversioned registration. | ||
| /// </summary> | ||
| StrictExactOnly = 2, | ||
|
|
||
| /// <summary> | ||
| /// Obsolete alias for <see cref="Implicit"/>. Prefer <see cref="Implicit"/> — the original | ||
| /// <c>Never</c> name was misleading because it did not actually disable the long-standing implicit | ||
| /// unversioned-only fallback. Only <see cref="StrictExactOnly"/> disables every fallback path. | ||
| /// </summary> | ||
| [Obsolete("Use UnversionedFallbackMode.Implicit instead. The original name was misleading; only StrictExactOnly actually disables every fallback path.")] | ||
| Never = Implicit, | ||
| } |
The previous commit added an [Obsolete] Never = Implicit alias on the rubber-duck's source-compat reasoning. That reasoning doesn't apply here: the Never name only ever existed on the unmerged unversioned-fallback branch, never in a shipped NuGet, so there is no external consumer to protect. Keeping the alias would permanently encode a pre-release name and ship two names for the same value from day one, both of which make the public surface harder to read for every future user. The remaining commit messages and the draft reply to Chris still describe the rename Never -> Implicit accurately — Never simply never became a public symbol. Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
Intent
Per-task versioning today is a closed set for mixed registrations: once a task name has any
[DurableTask(Version = "...")]registration, a request for a version that has no exact match returns "not found" instead of dispatching to a co-registered unversioned class. That makes the natural migration pattern — one current unversioned implementation that handles every live version, plus a small number of pinned legacy classes for specific old versions — impossible to express without registering every observed version explicitly, which is impractical for long-running orchestrations whose histories span many versions in the wild.This PR introduces an
UnversionedFallbackModeenum with three modes (configured independently for orchestrators and activities) that lets each worker choose how the unversioned registration participates in versioned dispatch:Implicit(default) — the long-standing implicit fallback: the unversioned registration serves versioned requests only when the task name has no versioned siblings. Matches pre-versioning behavior.CatchAll— opt-in catch-all: the unversioned registration also serves unmatched versioned requests on mixed names. Exact match still wins.StrictExactOnly— strict mode: every versioned request requires an exact(name, version)registration. Stale or bogus version values fail loudly instead of landing on the unversioned implementation.Dispatch matrix for a versioned request
[X]Unversioned requests (no version specified) always dispatch to the unversioned registration when one exists; they're an exact-match on the empty-version key and the mode does not affect them.
What changes
Public API surface
DurableTaskWorkerOptions.UnversionedFallbackMode(new enum):Implicit(default),CatchAll(opt-in catch-all),StrictExactOnly(no fallback ever). A 4-row dispatch table appears on the enum's<remarks>.DurableTaskWorkerOptions.VersioningOptions.OrchestratorUnversionedFallback(new property): per-side mode for orchestrator dispatch. Replay risk is highest on this side; the property's<remarks>lays out interactions withMatchStrategy(StrictandCurrentOrOlderare pre-dispatch gates; this setting governs how instances that pass the gate are resolved).DurableTaskWorkerOptions.VersioningOptions.ActivityUnversionedFallback(new property): per-side mode for activity dispatch. Lower risk because activities are stateless; main concern is input contract compatibility.Internal wiring
DurableTaskRegistryExtensions.BuildFactory(options, loggerFactory?)(new internal overload): plumbs an optionalILoggerFactoryinto the factory so per-dispatch fallback diagnostics can be emitted. Called byDefaultDurableTaskWorkerBuilder; not part of the public API surface.DurableTaskFactory: ctor takes both per-side modes + optionalILoggerFactory; orchestrator and activity dispatch each consult their own mode via a smallShouldUseUnversionedFallbackhelper.DurableTaskWorkerWorkItemFilters: orchestration filter readsOrchestratorUnversionedFallback; activity filter readsActivityUnversionedFallback. UnderStrictExactOnlythe filter emits the concrete registered version set instead of widening to a wildcard for unversioned-only names — otherwise the backend would deliver versioned work items the factory rejects after the fact.UseVersioningextension propagates both new fields.Logging (EventIds 606–609)
OrchestratorUnversionedFallback = CatchAllat worker buildActivityUnversionedFallback = CatchAllat worker buildLegacyImplorCatchAllImpl?"StrictExactOnlyis an opt-in tightening and does not get a warning at startup.Sample
UnversionedFallbackSampleenables bothCatchAllflags (it specifically demos catch-all dispatch), but the README explicitly calls out activity-only as the safer starting point and explains the asymmetric replay risk.What stays the same (compatibility)
Implicit(numeric value 0), preserving the long-standing implicit-fallback behavior that existed before this PR.MatchStrategy = Strict: instance-version mismatches are rejected inEvaluateOrchestrationVersioningbefore the factory is consulted.CatchAllon, a request for an explicitly registered version dispatches to that exact implementation, never to the unversioned catch-all.Known limitation
When combined with
MatchStrategy = Strict, the existing strict-override emits the worker's configuredVersionfor every name regardless of whether the factory can resolve it. UnderStrictExactOnlythis can result in the backend delivering work items the worker rejects. A proper fix requires per-name dispatch-capability analysis and is out of scope here. The behavior is captured byWorkItemFilters_StrictMatchOverridesStrictExactOnly_KnownLimitationand documented in theOrchestratorUnversionedFallback<remarks>.Safety guidance
The orchestrator-side unversioned implementation must be replay-compatible with every version that may reach it. Replaying an existing orchestration history against an incompatible implementation can cause non-determinism faults or deserialization failures. Activity-side risk is limited to input-contract compatibility because activities are stateless. The startup warning logs (EventId 606 / 607) and per-property XML docs both call this out.
Validation
dotnet build samples/UnversionedFallbackSample/UnversionedFallbackSample.csproj --configuration Debug --verbosity minimalDURABLE_TASK_SCHEDULER_CONNECTION_STRING='Endpoint=http://localhost:8080;TaskHub=default;Authentication=None'dotnet test test/Worker/Core.Tests/Worker.Tests.csproj --configuration Debug --verbosity minimal→ 153/153 passdotnet test test/Worker/Grpc.Tests/Worker.Grpc.Tests.csproj --configuration Debug --verbosity minimal→ 136/136 passgit diff --check