
Conversation

zucchini-nlp
Member

What does this PR do?

Fixes #41020 and ensures consistency in the Qwen-VL text config. Side note: previously Qwen used a flat dict structure, and for backward compatibility kwargs were passed both to super() and to the text_config. This caused confusion in TRL, which apparently resets some text attributes manually when training.

In this PR, Qwen sets/gets text-related attributes only through the text config. The attributes are reachable both from the nested config as config.text_config.vocab_size and from the root as config.vocab_size (for BC).
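For context, the delegation pattern described above can be sketched roughly as follows. All class and attribute names here are illustrative, not the actual transformers implementation:

```python
# Sketch: text attributes live on the nested text config, while the root
# config keeps backward-compatible flat access via __getattr__.

class TextConfig:
    def __init__(self, vocab_size=152064, hidden_size=3584):
        self.vocab_size = vocab_size
        self.hidden_size = hidden_size


class VLConfig:
    # keys that should be read from the text subconfig (illustrative)
    _text_keys = {"vocab_size", "hidden_size"}

    def __init__(self, text_config=None):
        self.text_config = text_config or TextConfig()

    def __getattr__(self, key):
        # only invoked when normal attribute lookup fails,
        # i.e. for the delegated text keys
        if key in self._text_keys:
            return getattr(self.text_config, key)
        raise AttributeError(key)


config = VLConfig()
# BC: flat access and nested access return the same value
assert config.vocab_size == config.text_config.vocab_size
```

The point is that there is a single source of truth (the nested config), so tools like TRL that mutate text attributes cannot end up with two diverging copies.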


@zucchini-nlp
Member Author

Test failures are not related

@Cyrilvallez
Member


Arf, very sad we did not use subconfigs when the model was added 🥲

Comment on lines +316 to +317
super().__init__(**kwargs)

Member


I believe this should stay after setting the subconfigs, no? Otherwise __setattr__ won't work, as it won't see the subconfigs.

Member Author


Ah, I added it for the extra kwargs that users can pass in a flat dict. Otherwise those get serialized into the text config, because we pass all kwargs to the text config.

For normal usage it has no effect; it only matters if someone passes a non-common kwarg that isn't supposed to live in the text config. I think supporting correct attention is more important than that rare edge case, so I'll revert it.
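To illustrate the ordering point in the review above: if __setattr__ forwards text keys to the subconfig, the subconfig must already exist when super().__init__ starts assigning kwargs. A minimal sketch, with hypothetical names (QwenLikeConfig, _text_keys) that do not mirror the real transformers classes:

```python
# Sketch: why super().__init__(**kwargs) must run AFTER the subconfig
# is created when __setattr__ delegates to it.

class NestedTextConfig:
    def __init__(self):
        self.vocab_size = 152064


class Base:
    def __init__(self, **kwargs):
        for key, value in kwargs.items():
            # goes through the subclass __setattr__
            setattr(self, key, value)


class QwenLikeConfig(Base):
    _text_keys = {"vocab_size"}

    def __init__(self, **kwargs):
        # the subconfig must exist BEFORE super().__init__ triggers
        # __setattr__ for any delegated key; swapping these two lines
        # would raise AttributeError (no self.text_config yet)
        self.text_config = NestedTextConfig()
        super().__init__(**kwargs)

    def __setattr__(self, key, value):
        if key in self._text_keys:
            # forward text attributes to the nested config
            setattr(self.text_config, key, value)
        else:
            super().__setattr__(key, value)


cfg = QwenLikeConfig(vocab_size=1000)
assert cfg.text_config.vocab_size == 1000
```

Note that the delegated key never lands on the root instance itself, which is exactly the "single source of truth" behavior the PR is after.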

Comment on lines 341 to 343
return setattr(text_config, key, value)

return super().__setattr__(key, value)
Member


no return for setattr!

@zucchini-nlp zucchini-nlp enabled auto-merge (squash) September 30, 2025 11:19
Contributor

[For maintainers] Suggested jobs to run (before merge)

run-slow: glm4v, glm4v_moe, qwen2_5_vl, qwen2_vl

@zucchini-nlp zucchini-nlp merged commit f22cb1e into huggingface:main Sep 30, 2025
25 checks passed
zucchini-nlp added a commit to zucchini-nlp/transformers that referenced this pull request Sep 30, 2025
* fix qwen text config

* fix tests

* fix one more test

* address comments
Successfully merging this pull request may close these issues.

Config rope_scaling and text_config.rope_scaling might be the same or different dict objects