Fix Apriel2 mask call for transformers 5.9.0 #524

Merged
jlamypoirier merged 1 commit into main from jlp_apriel2-mask-cache-position on May 21, 2026

@jlamypoirier
Collaborator

Summary

  • transformers 5.9.0 removed the `cache_position` kwarg from `create_causal_mask` and `create_sliding_window_causal_mask`. As a result, all 60 Apriel2 modeling tests failed on CI with `TypeError: ... unexpected keyword argument 'cache_position'`.
  • The kwarg had been deprecated and ignored since transformers 5.6, where it was annotated `# not used anymore but kept for BC`. It is still required on transformers v4, where it remains functional.
  • Gate the kwarg on the existing `_TRANSFORMERS_V4` flag at the single call site in `modeling_apriel2.py`: it is passed on v4 and dropped on v5+.
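
For readers unfamiliar with the pattern, the gating described above can be sketched roughly as follows. This is a minimal illustration, not the code from `modeling_apriel2.py`: the helpers `is_transformers_v4` and `build_mask_kwargs` are hypothetical names, and `mask_fn_v4` / `mask_fn_v5_9` are simplified stand-ins that only mimic the presence or absence of the `cache_position` kwarg, not the real transformers signatures.

```python
def is_transformers_v4(version: str) -> bool:
    """Hypothetical helper: True when the major version is below 5."""
    return int(version.split(".")[0]) < 5


def build_mask_kwargs(attention_mask, cache_position, transformers_v4: bool) -> dict:
    """Hypothetical helper: include `cache_position` only on v4."""
    kwargs = {"attention_mask": attention_mask}
    if transformers_v4:
        # v4 still requires the kwarg; newer releases reject it.
        kwargs["cache_position"] = cache_position
    return kwargs


# Simplified stand-ins (assumptions), NOT the real library functions.
def mask_fn_v4(*, attention_mask, cache_position):
    return "v4 mask"


def mask_fn_v5_9(*, attention_mask):
    return "v5 mask"


for version, mask_fn in (("4.57.1", mask_fn_v4), ("5.9.0", mask_fn_v5_9)):
    kwargs = build_mask_kwargs([1, 1, 1], [0, 1, 2], is_transformers_v4(version))
    # Each stand-in accepts exactly the kwargs built for its version.
    print(version, sorted(kwargs), mask_fn(**kwargs))
```

Building the keyword dictionary once and adding the version-specific entry conditionally keeps a single call site, which matches the "single call site" framing above; passing `cache_position` unconditionally to the newer stand-in would raise the same kind of `TypeError` seen on CI.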

Test plan

Ran `fast_llm_external_models/tests/test_apriel2/` against three transformers releases; all three match the pre-breakage baseline.

  • transformers==4.57.1: 2109 passed, 42 skipped
  • transformers==5.8.1: 2109 passed, 42 skipped
  • transformers==5.9.0 (the CI-resolved version): 2109 passed, 42 skipped (previously 60 failed)

🤖 Generated with Claude Code

`create_causal_mask` and `create_sliding_window_causal_mask` removed the
`cache_position` kwarg in transformers 5.9.0 (deprecated and ignored
since 5.6). Gate the kwarg on the existing `_TRANSFORMERS_V4` flag:
pass it on v4 (where it's still required), drop it on v5+.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
@jlamypoirier jlamypoirier merged commit 10d2906 into main May 21, 2026
2 checks passed
@jlamypoirier jlamypoirier deleted the jlp_apriel2-mask-cache-position branch May 21, 2026 15:12