System Info
- transformers version: 4.57.0.dev0
- Platform: Linux-5.15.0-153-generic-x86_64-with-glibc2.31
- Python version: 3.12.9
- Huggingface_hub version: 0.34.4
- Safetensors version: 0.6.2
- Accelerate version: 1.4.0
- Accelerate config: not found
- DeepSpeed version: not installed
- PyTorch version (accelerator?): 2.8.0+cu128 (NA)
- Tensorflow version (GPU?): not installed (NA)
- Flax version (CPU?/GPU?/TPU?): not installed (NA)
- Jax version: not installed
- JaxLib version: not installed
- Using distributed or parallel set-up in script?: no
Who can help?
No response
Information
- The official example scripts
- My own modified scripts
Tasks
- An officially supported task in the examples folder (such as GLUE/SQuAD, ...)
- My own task or dataset (give details below)
Reproduction
After investigating an issue in trl, I found a weird behavior of the transformers config rope_scaling: config.rope_scaling (at the root config level) and config.text_config.rope_scaling (under text_config) may be the same dict object or different dict objects, depending on whether we pass the text_config param to AutoConfig.from_pretrained:
- if we don't pass the text_config param, the two rope_scaling attributes point to the same dict object
- if we pass the text_config param, the two rope_scaling attributes are different dict objects
In [1]: from transformers import AutoConfig
In [2]: model_id = "Qwen/Qwen2.5-VL-3B-Instruct"
In [3]: config1 = AutoConfig.from_pretrained(model_id)
In [4]: config1.text_config.rope_scaling
Out[4]: {'type': 'default', 'mrope_section': [16, 24, 24], 'rope_type': 'default'}
In [5]: config1.rope_scaling
Out[5]: {'type': 'default', 'mrope_section': [16, 24, 24], 'rope_type': 'default'}
In [6]: id(config1.text_config.rope_scaling)
Out[6]: 140211029392000
In [7]: id(config1.rope_scaling)
Out[7]: 140211029392000
# Both are the same dict object
In [8]: config2 = AutoConfig.from_pretrained(model_id, text_config={})
In [9]: config2.text_config.rope_scaling
Out[9]: {'type': 'default', 'mrope_section': [16, 24, 24], 'rope_type': 'default'}
In [10]: config2.rope_scaling
Out[10]: {'type': 'default', 'mrope_section': [16, 24, 24], 'rope_type': 'default'}
In [11]: id(config2.text_config.rope_scaling)
Out[11]: 140210801100608
In [12]: id(config2.rope_scaling)
Out[12]: 140211029786688
# Both are different dict objects
Is this expected?
We discovered this while investigating why changing config.text_config.rope_scaling (after initialization) will or will not also change config.rope_scaling. See related comment in trl PR:
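To illustrate why the aliasing matters, here is a minimal pure-Python sketch (plain objects standing in for the transformers configs, no library calls): when the two attributes reference the same dict, an in-place update through one is visible through the other.

```python
from types import SimpleNamespace

# Case 1: both attributes alias the same dict (like config1 above).
shared = {"rope_type": "default", "mrope_section": [16, 24, 24]}
config1 = SimpleNamespace(
    rope_scaling=shared,
    text_config=SimpleNamespace(rope_scaling=shared),
)
config1.text_config.rope_scaling["factor"] = 2.0
print(config1.rope_scaling["factor"])  # 2.0: the root dict changed too

# Case 2: the attributes hold independent dicts (like config2 above).
config2 = SimpleNamespace(
    rope_scaling={"rope_type": "default"},
    text_config=SimpleNamespace(rope_scaling={"rope_type": "default"}),
)
config2.text_config.rope_scaling["factor"] = 2.0
print("factor" in config2.rope_scaling)  # False: the root dict is untouched
```

So the same post-init mutation silently behaves differently depending on how the config was constructed.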
Expected behavior
- Either they should always be the same dict object
- Or they should always be different dict objects
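If the aliasing turns out to be intended, callers that need to mutate one of the dicts safely can break the link themselves. A possible caller-side workaround sketch (detach_rope_scaling is a hypothetical helper, not a transformers API; demoed with plain namespaces instead of a real config):

```python
import copy
from types import SimpleNamespace

def detach_rope_scaling(config):
    """Hypothetical helper: give config.text_config.rope_scaling its own
    dict when it aliases the root config.rope_scaling."""
    text_cfg = getattr(config, "text_config", None)
    if text_cfg is None:
        return config
    if getattr(text_cfg, "rope_scaling", None) is getattr(config, "rope_scaling", None):
        text_cfg.rope_scaling = copy.deepcopy(text_cfg.rope_scaling)
    return config

# Demo: start from the aliased situation, then detach and mutate.
shared = {"rope_type": "default", "mrope_section": [16, 24, 24]}
config = SimpleNamespace(
    rope_scaling=shared,
    text_config=SimpleNamespace(rope_scaling=shared),
)
detach_rope_scaling(config)
config.text_config.rope_scaling["factor"] = 2.0
print("factor" in config.rope_scaling)  # False: root dict no longer affected
```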