
Conversation

younesbelkada (Collaborator)
What does this PR do?

The recent transformers release, which included a cache refactor (huggingface/transformers#26681), broke some internal assumptions in autoawq about the shapes of attention masks and input embeddings.

This PR fixes the issue by updating `layer_kwargs` through the `prepare_inputs_for_generation` method, which automatically puts the inputs and attention masks into the correct format (this should also be compatible with previous transformers versions).
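As a minimal sketch of the approach (not autoawq's actual code), `prepare_inputs_for_generation` can be called on any transformers causal LM to normalize the calibration inputs; the toy model and variable names below are assumptions for illustration only:

```python
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model = AutoModelForCausalLM.from_pretrained("gpt2")
tokenizer = AutoTokenizer.from_pretrained("gpt2")

input_ids = tokenizer("calibration sample", return_tensors="pt").input_ids

# prepare_inputs_for_generation normalizes the inputs (attention_mask,
# position_ids, cache objects, ...) into whatever format the installed
# transformers version expects, so downstream code does not need to
# hard-code mask or embedding shapes across releases.
layer_kwargs = model.prepare_inputs_for_generation(
    input_ids,
    attention_mask=torch.ones_like(input_ids),
)

# Drop the token ids themselves; only the auxiliary kwargs (attention
# mask, position ids, cache flags, ...) are forwarded to the layers.
layer_kwargs.pop("input_ids", None)
layer_kwargs.pop("inputs_embeds", None)
```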

cc @casper-hansen @TheBloke

casper-hansen merged commit 78b59d7 into main on Dec 11, 2023
younesbelkada deleted the fix-transformers-release branch on December 11, 2023 at 16:12