@coryMosaicML coryMosaicML commented Sep 12, 2022

This PR includes a few updates/improvements to EMA along with code cleanup.

Changes:

  • Removed the old ShadowModel class, which changes to Composer have since made unnecessary.
  • The EMA model weights are now checkpointed automatically whenever a checkpoint is saved.
  • Added an option to specify smoothing directly instead of half_life, for compatibility with other implementations.
  • Added an ema_start hyperparameter to specify when EMA should start running.
  • Removed the train_with_ema_weights option.
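
For readers unfamiliar with the terms, here is a minimal plain-Python sketch of what a smoothing coefficient and a delayed EMA start mean. It is not Composer's implementation: the function names `ema_update` and `run_ema`, the dict-of-floats weights, and the use of a fraction of training for the start point are all assumptions made for illustration.

```python
from typing import Dict, List, Optional


def ema_update(ema: Dict[str, float], model: Dict[str, float], smoothing: float) -> None:
    """One in-place EMA step: ema = smoothing * ema + (1 - smoothing) * model."""
    for name, value in model.items():
        ema[name] = smoothing * ema[name] + (1.0 - smoothing) * value


def run_ema(
    weights_per_step: List[Dict[str, float]],
    smoothing: float,
    ema_start_fraction: float,
) -> Optional[Dict[str, float]]:
    """Track an EMA of the weights, starting only after a fraction of training.

    `ema_start_fraction` is a hypothetical stand-in for a setting such as
    ema_start="0.5dur"; it is not a Composer parameter.
    """
    total_steps = len(weights_per_step)
    ema: Optional[Dict[str, float]] = None
    for step, model in enumerate(weights_per_step):
        if step < ema_start_fraction * total_steps:
            continue  # EMA has not started yet
        if ema is None:
            ema = dict(model)  # initialize from the current model weights
        else:
            ema_update(ema, model, smoothing)
    return ema


# Toy run: four steps of a single scalar "weight", EMA starting halfway through.
history = [{"w": 0.0}, {"w": 0.0}, {"w": 1.0}, {"w": 3.0}]
result = run_ema(history, smoothing=0.5, ema_start_fraction=0.5)
print(result)
```

In this toy run the first two steps are skipped, the EMA is initialized from the third step, and the fourth step is blended in with weight 1 - smoothing.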

Test runs:
  • ResNet50 Medium (current dev): 79.55%
  • ResNet50 Medium (after changes): 79.48%
  • ResNet50 Medium (after changes), smoothing=0.87 (should behave similarly to half_life=100ba): 79.36%
  • ResNet50 Medium (after changes), ema_start="0.5dur": 79.49%
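
The correspondence between smoothing=0.87 and half_life=100ba can be checked with a short calculation. The sketch below assumes the usual half-life definition (the weight of an old value halves every `half_life` batches) and assumes the EMA weights are updated once every 20 batches. That update interval is not stated in this PR description; it is the value for which the two settings line up. The helper names are hypothetical.

```python
import math


def smoothing_from_half_life(half_life_batches: float, update_interval_batches: float = 1.0) -> float:
    """Per-update smoothing such that the weight of an old value halves every half_life_batches."""
    return 2.0 ** (-update_interval_batches / half_life_batches)


def half_life_from_smoothing(smoothing: float, update_interval_batches: float = 1.0) -> float:
    """Inverse of smoothing_from_half_life, returning the half-life in batches."""
    return -update_interval_batches * math.log(2.0) / math.log(smoothing)


# Assumed update interval of 20 batches (not given in the PR text).
s = smoothing_from_half_life(100, update_interval_batches=20)
print(round(s, 2))  # 0.87

# Updating every batch instead would need a much larger coefficient for the same half-life.
print(round(smoothing_from_half_life(100, update_interval_batches=1), 4))  # 0.9931
```

Under these assumptions, a smoothing of 0.87 applied every 20 batches gives roughly the same decay as a 100-batch half-life, which matches the small accuracy difference between the two runs.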

@coryMosaicML coryMosaicML marked this pull request as ready for review September 12, 2022 21:54
@coryMosaicML coryMosaicML requested review from a team and dskhudia as code owners September 12, 2022 21:54

@growlix growlix left a comment

LGTM! Thanks for updating the recipes!

@Landanjs Landanjs left a comment

LGTM! Mostly nits. The biggest question is whether we should have defaults for half_life or smoothing. I'm in favor of a default for half_life that gets overridden if smoothing is set, possibly with a warning. I'm not convinced this is the best approach, though.

@coryMosaicML coryMosaicML enabled auto-merge (squash) September 15, 2022 20:52
@coryMosaicML coryMosaicML merged commit ca6bb5e into mosaicml:dev Sep 15, 2022
bandish-shah pushed a commit that referenced this pull request Sep 19, 2022
bcui19 pushed a commit to bcui19/composer that referenced this pull request Sep 22, 2022