-
Notifications
You must be signed in to change notification settings - Fork 30.6k
[Modeling] Fix encoder CPU offloading for whisper #38994
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
[Modeling] Fix encoder CPU offloading for whisper #38994
Conversation
Signed-off-by: Kyle Sayers <[email protected]>
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks, I like that we removed direct access to weight.data
. Can you also un-skip offload tests in whisper and make sure they are green?
For ex:
transformers/tests/models/whisper/test_modeling_whisper.py
Lines 3357 to 3368 in 21cb353
@unittest.skip(reason="Some undefined behavior encountered with tiny versions of this model. Skip for now.") | |
def test_cpu_offload(self): | |
pass | |
@unittest.skip(reason="Some undefined behavior encountered with tiny versions of this model. Skip for now.") | |
def test_disk_offload_bin(self): | |
pass | |
@unittest.skip(reason="Some undefined behavior encountered with tiny versions of this model. Skip for now.") | |
def test_disk_offload_safetensors(self): | |
pass | |
Signed-off-by: Kyle Sayers <[email protected]>
@zucchini-nlp @SunMarc Tests unskipped and passing! |
Signed-off-by: Kyle Sayers <[email protected]>
Signed-off-by: Kyle Sayers <[email protected]>
run-slow: whisper |
This comment contains run-slow, running the specified jobs: models: ['models/whisper'] |
The docs for this PR live here. All of your documentation changes will be reflected on that endpoint. The docs are available until 30 days after the last update. |
@zucchini-nlp Does this test failure indicate something to fix, or is this test noisy?
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@kylesayrs Can you rebase/merge? The failing tests are expected, no worries :D
@vasqu Merged, thank to hear it :) |
Thanks @kylesayrs 🤗 |
* fix cpu offloading for whisper Signed-off-by: Kyle Sayers <[email protected]> * unskip offloading tests Signed-off-by: Kyle Sayers <[email protected]> * revert small change Signed-off-by: Kyle Sayers <[email protected]> * remove tests Signed-off-by: Kyle Sayers <[email protected]> --------- Signed-off-by: Kyle Sayers <[email protected]>
* fix cpu offloading for whisper Signed-off-by: Kyle Sayers <[email protected]> * unskip offloading tests Signed-off-by: Kyle Sayers <[email protected]> * revert small change Signed-off-by: Kyle Sayers <[email protected]> * remove tests Signed-off-by: Kyle Sayers <[email protected]> --------- Signed-off-by: Kyle Sayers <[email protected]>
* fix cpu offloading for whisper Signed-off-by: Kyle Sayers <[email protected]> * unskip offloading tests Signed-off-by: Kyle Sayers <[email protected]> * revert small change Signed-off-by: Kyle Sayers <[email protected]> * remove tests Signed-off-by: Kyle Sayers <[email protected]> --------- Signed-off-by: Kyle Sayers <[email protected]>
* fix cpu offloading for whisper Signed-off-by: Kyle Sayers <[email protected]> * unskip offloading tests Signed-off-by: Kyle Sayers <[email protected]> * revert small change Signed-off-by: Kyle Sayers <[email protected]> * remove tests Signed-off-by: Kyle Sayers <[email protected]> --------- Signed-off-by: Kyle Sayers <[email protected]>
* fix cpu offloading for whisper Signed-off-by: Kyle Sayers <[email protected]> * unskip offloading tests Signed-off-by: Kyle Sayers <[email protected]> * revert small change Signed-off-by: Kyle Sayers <[email protected]> * remove tests Signed-off-by: Kyle Sayers <[email protected]> --------- Signed-off-by: Kyle Sayers <[email protected]>
* fix cpu offloading for whisper Signed-off-by: Kyle Sayers <[email protected]> * unskip offloading tests Signed-off-by: Kyle Sayers <[email protected]> * revert small change Signed-off-by: Kyle Sayers <[email protected]> * remove tests Signed-off-by: Kyle Sayers <[email protected]> --------- Signed-off-by: Kyle Sayers <[email protected]>
* fix cpu offloading for whisper Signed-off-by: Kyle Sayers <[email protected]> * unskip offloading tests Signed-off-by: Kyle Sayers <[email protected]> * revert small change Signed-off-by: Kyle Sayers <[email protected]> * remove tests Signed-off-by: Kyle Sayers <[email protected]> --------- Signed-off-by: Kyle Sayers <[email protected]>
Purpose
Without this change, attempting to CPU offload the encoder layer raises a device error
Changes
embed_positions.weight
attribute directly, leverage the hf hooks attached to theembed_positions
module to onload the weight properly.F.embedding
must be called with an identity matrix, rather than grabbing the weight value directlyTesting
Use the following test script to verify that generation works with the device map
test_whisper_offload.py
Potential Reviewers