
Conversation

Contributor

@rithwik-db rithwik-db commented Apr 17, 2025

What does this PR do?

Modified the existing code to support _fsdp_wrap and fsdp_wrap_fn in FSDP2, similar to how FSDP1 handles them. One caveat: None is now a valid value, meaning the current module is skipped (not wrapped) and its descendants are checked instead.

Reworked module wrapping into a single recursive traversal with two phases. On the way down (pre-order), we legalize and validate the params to make sure that nothing that shouldn't be weight-tied is actually tied. On the way back up (post-order), we call fully_shard on the valid parameters.
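For readers who want a feel for the traversal, here is a minimal sketch of the idea in plain Python. It is not the code from this PR. ToyModule, param_counts, and plan_wrapping are hypothetical names, parameters are modeled as strings (a tied weight is the same string appearing in two modules), and the returned list stands in for the order in which fully_shard would be called. The sketch also assumes that an explicit _fsdp_wrap value takes precedence over the wrap function, that False prunes the whole subtree, and that the root module is handled separately; the PR description does not spell out any of these.

```python
from collections import Counter
from dataclasses import dataclass, field
from typing import Callable, Optional


@dataclass
class ToyModule:
    """Hypothetical stand-in for an nn.Module; params are plain string IDs."""
    name: str
    children: list["ToyModule"] = field(default_factory=list)
    params: list[str] = field(default_factory=list)
    _fsdp_wrap: Optional[bool] = None  # True: wrap, None: skip but check descendants


def param_counts(module: ToyModule) -> Counter:
    """Count how often each parameter ID is used inside a subtree."""
    counts = Counter(module.params)
    for child in module.children:
        counts += param_counts(child)
    return counts


def plan_wrapping(
    root: ToyModule,
    wrap_fn: Optional[Callable[[ToyModule], Optional[bool]]] = None,
) -> list[str]:
    """Return module names in the order a fully_shard-like call would be applied."""
    total = param_counts(root)
    order: list[str] = []

    def visit(module: ToyModule) -> None:
        # Assumption: an explicit attribute wins over the wrap function.
        decision = module._fsdp_wrap
        if decision is None and wrap_fn is not None:
            decision = wrap_fn(module)
        if decision is False:
            return  # Assumption: False prunes this subtree entirely.
        if decision:
            # Pre-order validation: a parameter that is also used outside
            # this subtree is tied across the wrapping boundary.
            local = param_counts(module)
            leaked = sorted(p for p in local if total[p] > local[p])
            if leaked:
                raise ValueError(f"{module.name} shares tied params {leaked}")
        for child in module.children:
            visit(child)
        if decision:
            # Post-order: children are recorded before their parent.
            order.append(module.name)

    # Assumption: the root itself is handled separately by the caller.
    for child in root.children:
        visit(child)
    return order


def wrap_blocks(module: ToyModule) -> Optional[bool]:
    """Example wrap function: wrap blocks, defer on everything else."""
    return True if module.name.startswith("block") else None


def build_model(tie_head: bool = False) -> ToyModule:
    blocks = [ToyModule(f"block{i}", params=[f"b{i}.w"]) for i in range(2)]
    return ToyModule("root", children=[
        ToyModule("embed", params=["wte"], _fsdp_wrap=True),
        ToyModule("layers", children=blocks),
        ToyModule("head", params=["wte" if tie_head else "head.w"], _fsdp_wrap=False),
    ])
```

In this sketch, layers returns None from the wrap function, so it is not wrapped itself but its blocks still are. Calling build_model(tie_head=True) ties the head weight to the embedding, and because embed is marked for wrapping the pre-order check raises before anything is recorded.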

@rithwik-db rithwik-db requested a review from bowenyang008 April 17, 2025 23:56
@rithwik-db rithwik-db changed the title [WIP] Add check for _fsdp_wrap for FSDP2 Add check for _fsdp_wrap for FSDP2 Apr 23, 2025
@rithwik-db rithwik-db requested a review from dakinggg April 23, 2025 03:29
Contributor

@bowenyang008 bowenyang008 left a comment


Amazing work, thanks @rithwik-db!

@rithwik-db rithwik-db changed the title Add check for _fsdp_wrap for FSDP2 Support submodule wrapping for FSDP2 according to model definition (with _fsdp_wrap and fsdp_wrap_fn) Apr 29, 2025
@rithwik-db rithwik-db enabled auto-merge (squash) April 29, 2025 00:12
@rithwik-db rithwik-db merged commit 3c29f1a into main Apr 29, 2025
14 checks passed
@rithwik-db rithwik-db deleted the add_fsdp_wrapping branch April 29, 2025 00:32
3 participants