Skip to content

Conversation

dakinggg
Copy link
Contributor

@dakinggg dakinggg commented Apr 16, 2025

In the latest transformers version, they added a py.typed file, which means that the type checker starts using types from transformers. There are a number of legitimitate type issues surfaced that I tried to correct, but the vast majority result from transformers not actually being a type checked library and a couple specific difficulties:
(1) BatchEncoding (which is what tokenizers return) has a getitem whose return type depends on the type of the index passed in (string vs int). The type checker can't handle this (correct answer would be to have overloads I believe).
(2) PeftModel forwards attr access to the underlying PreTrainedModel by overriding getattr. The type checker can't know this. The correct answer would be to do lots of unpacking of the model before accessing attrs, but I don't think its worth the energy.

Generally, if it wasn't clear to me what to do, I default to type ignore, since that is no worse of a state than before. For reviewers, probably best to skip reviewing all the type ignores, and focus on any places where I actually changed code.

Manual test run: tr-types-2-9qAwxu

@dakinggg dakinggg changed the title tight bump Update transformers to 4.51 Apr 16, 2025
@dakinggg dakinggg marked this pull request as ready for review April 16, 2025 20:54
@dakinggg dakinggg requested review from a team as code owners April 16, 2025 20:54
Copy link
Contributor

@irenedea irenedea left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

fun PR, thank you! just had a few nits

@dakinggg dakinggg merged commit 60e24b0 into mosaicml:main Apr 16, 2025
11 checks passed
dakinggg added a commit to databricks/compose-rl that referenced this pull request Apr 17, 2025
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants