Conversation

Contributor

@Flakes342 Flakes342 commented Sep 25, 2025

What does this PR do?

Fixes #41164

Before submitting

  • This PR fixes a typo or improves the docs (you can dismiss the other checks if that's the case).
  • Did you read the contributor guideline,
    Pull Request section?
  • Was this discussed/approved via a GitHub issue or the forum? Please add a link
    to it if that's the case.
  • Did you make sure to update the documentation with your changes? Here are the
    documentation guidelines, and
    here are tips on formatting docstrings.
  • Did you write any new necessary tests?

Who can review?

@ArthurZucker @Rocketknight1

Description

This PR addresses a bug in StaticCache that can lead to torch.compile / TorchDynamo crashes when users pass:

  • Unexpected keyword arguments (**kwargs)
  • Misordered positional arguments
  • Non-boolean values for offloading or offload_only_non_sliding

Root causes / API issues:

  1. Positional flexibility vs. keyword expectations

    • StaticCache accepts config, max_cache_len as positional arguments, but offloading and offload_only_non_sliding were also sometimes passed positionally.
    • This can silently assign non-boolean types (e.g., a device object) to offloading, leading to CUDA/TorchDynamo crashes in update().
  2. Inconsistent argument handling across cache classes

    • Base Cache is strict: offloading and offload_only_non_sliding must be booleans.
    • StaticCache had a flexible signature with **kwargs, which could allow invalid arguments without immediate error.
    • Deprecated children sometimes passed extra arguments through **kwargs, increasing risk of silent misassignment.
  3. Potential user misuse

    • Users passing device/dtype arguments positionally (common in examples or from auto-generated code) could crash torch.compile unexpectedly.
    • Without clear error messages, this was non-obvious and difficult to debug.

Reproduction

[screenshot: error traceback] Using Qwen3-omni or similar models:
  • Here, device and compute_dtype were passed as positional arguments.
  • offloading is expected to be a bool, but instead receives the device string "cuda:0".
  • This triggers the crash: InternalTorchDynamoError: AttributeError: 'int' object has no attribute 'device'
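The misbinding itself is ordinary Python argument semantics. A minimal sketch with a hypothetical stand-in class (not the real StaticCache, whose signature is simplified here) shows how a device string lands in offloading:

```python
class StaticCacheSketch:
    """Hypothetical stand-in mirroring StaticCache's flexible signature."""

    def __init__(self, config=None, max_cache_len=None,
                 offloading=False, offload_only_non_sliding=False, **kwargs):
        # No type checks: whatever binds to `offloading` is stored as-is.
        self.offloading = offloading
        self.offload_only_non_sliding = offload_only_non_sliding

# The caller intends (config, max_cache_len, device, compute_dtype)
# positionally, but the third positional slot is `offloading`:
cache = StaticCacheSketch(None, 1024, "cuda:0", "bfloat16")

print(cache.offloading)  # "cuda:0" — a str, not a bool
```

No error is raised at construction time; the bad value only surfaces later inside update() under torch.compile, which is why the failure is hard to trace back to the call site.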

Solution

  1. Validate types in StaticCache.__init__:
    Ensure offloading and offload_only_non_sliding are bool.
    Raise a clear TypeError if invalid.

  2. Catch unknown kwargs:
    Raise TypeError if users pass unrecognized keyword arguments (**kwargs) to prevent silent misassignment.

  3. Forward **kwargs safely in children:
    Deprecated children like OffloadedHybridCache are untouched for now, but any kwargs they pass are safely handled in StaticCache.

Additional Notes

  • We could optionally enforce keyword-only arguments via * to fully prevent positional misassignment, but this may break backward compatibility.
  • Deprecated cache classes (OffloadedHybridCache, etc.) are left untouched since they will be removed in v4.59.
  • A more advanced fix could include runtime coercion of devices to offloading=False to fully prevent user mistakes, but raising errors is safer and explicit.
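For reference, the keyword-only variant mentioned in the first note would look like the sketch below. This is not what the PR does (it would break existing positional callers); it only illustrates the trade-off:

```python
class KeywordOnlyCacheSketch:
    """Hypothetical variant: flags after `*` cannot be passed positionally."""

    def __init__(self, config=None, max_cache_len=None, *,
                 offloading=False, offload_only_non_sliding=False):
        self.offloading = offloading
        self.offload_only_non_sliding = offload_only_non_sliding

# Positional misuse is rejected by the interpreter itself:
try:
    KeywordOnlyCacheSketch(None, 1024, "cuda:0")
except TypeError as e:
    print(e)  # positional overflow rejected by Python before __init__ runs
```

The upside is that no runtime type check is needed for positional misuse; the downside is the backward-compatibility break noted above.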

Development

Successfully merging this pull request may close these issues.

cache offloading check is incorrect