@mvpatel2000 commented Jun 20, 2023

What does this PR do?

In-lines the group to avoid holding an extra reference to it. This avoids a spike in peak GPU memory usage during wrapping. @sashaDoubov will test.
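
For context, here is a minimal sketch of why an extra reference can matter. It is not the diff in this PR: `Params`, `wrap`, `named`, and `inlined` are hypothetical names, and a plain Python object stands in for GPU tensors. It assumes CPython, where an object is freed as soon as its last reference goes away.

```python
import weakref


class Params:
    """Hypothetical stand-in for a group's large GPU tensors."""


def wrap(group, probe):
    # Simulated wrapping step: drop the old params from the group and
    # record whether doing so actually freed them.
    old = weakref.ref(group["params"])
    group["params"] = None
    probe.append(old() is None)


def named(probe):
    params = Params()  # extra local reference outlives the swap inside wrap()
    wrap({"params": params}, probe)


def inlined(probe):
    wrap({"params": Params()}, probe)  # the group dict holds the only reference


probe = []
named(probe)
inlined(probe)
print(probe)
```

Under CPython this prints `[False, True]`: with the named local, the old object is still alive at the point where wrapping tries to release it, so the old and new copies would coexist; with the in-lined version it is freed immediately.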

What issue(s) does this change relate to?

CO-2195

@sashaDoubov

LGTM. We get the same loss curve when using multiple param groups with and without this change. Memory usage is hard to debug, though.
(Two images attached.)

@sashaDoubov merged commit 501c4ae into mosaicml:dev Jun 30, 2023
@mvpatel2000 deleted the mvpatel2000/patch-optimizer branch July 1, 2023 00:07