-
Notifications
You must be signed in to change notification settings - Fork 454
Quality of life updates to EMA #1524
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM! Thanks for updating the recipes!
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM! Mostly nits. The biggest thing is if we should have defaults for half_life or smoothing. I'm in favor of defaults for half_life
, but gets overridden if smoothing is used. There could be a warning thrown. Not convinced this is the best though
This PR includes a few updates/improvements to EMA along with code cleanup.
Changes:
ShadowModel
class as changes to composer have since made it unnecessary.smoothing
directly instead ofhalf_life
for compatibility with other implementationsema_start
hyperparameter to specify when ema should start running.train_with_ema_weights
option.Test runs:
ResNet50 Medium (current dev) 79.55%
ResNet50 Medium (after changes) 79.48%
ResNet50 Medium (after changes) smoothing=0.87 (should be similar to half_life=100ba behavior) 79.36%
ResNet50 Medium (after changes) ema_start="0.5dur" 79.49%