Fix type for model_init_kwargs when passed as CLI JSON string#5230
Conversation
There was a problem hiding this comment.
Cursor Bugbot has reviewed your changes and found 1 potential issue.
Bugbot Autofix is OFF. To automatically fix reported issues with cloud agents, enable autofix in the Cursor dashboard.
|
I remember that you have to avoid mixing types in configurations because of the parser. For example, |
qgallouedec
left a comment
There was a problem hiding this comment.
lgtm.
https://github.com/huggingface/trl/pull/5230/changes#r2896039161 needs to be addressed but it can be done in a follow-up PR
Fix type for model_init_kwargs when passed as CLI JSON string.
transformersThis PR fixes the type annotation for the
model_init_kwargsfield across multiple configuration classes in the codebase. The change expands the allowed types to includestrin addition todict[str, Any]andNone, taking into account how model initialization arguments can be provided via CLI JSON string.Updates to type annotations for model initialization:
Changed
model_init_kwargstype fromdict[str, Any] | Nonetodict[str, Any] | str | Nonein the following configuration classes to support string input:DPOConfigGRPOConfigRewardConfigRLOOConfigSFTConfigAnd experimental:
BCOConfigCPOConfigKTOConfigOnlineDPOConfigORPOConfigNote
Low Risk
Low risk type/arg-parsing change limited to configuration dataclasses; main impact is how
model_init_kwargscan be provided via CLI and may affect downstream code expecting a dict if strings are passed programmatically.Overview
Expands CLI support for
model_init_kwargsby updating multiple trainer config dataclasses (core andexperimental/*) to typemodel_init_kwargsasdict[str, Any] | str | None, matchingtransformersbehavior for JSON-string CLI inputs.Adds
_VALID_DICT_FIELDSsupport toOnlineDPOConfigand tightens typing inGRPOConfig/RLOOConfigso JSON-string values are recognized/parsed consistently across configs.Written by Cursor Bugbot for commit da17956. This will update automatically on new commits. Configure here.