
Conversation

ravi-mosaicml
Contributor

Implemented issue #135.
Also renamed `total_batch_size` to `train_batch_size`. Updated hparams.

@ravi-mosaicml
Contributor Author

@anisehsani -- it would be awesome if each evaluator could support subset num batches

```python
# but the getter will always return a Precision enum
precision: Union[str, types.Precision]  # type: ignore
_precision: types.Precision = field(init=False)  # but store an enum internally
steps_per_epoch: int = -1  # type: ignore
```
Contributor

If `steps_per_epoch` is a property, then it shouldn't need to be defined here too, right?

Contributor Author
Yeah... it's nonstandard.

Here, I defined it as a field so you can set `steps_per_epoch` as part of `__init__`, but it's optional since it has a default value.

`__init__` implicitly calls the property setter, which then sets the private variable `_steps_per_epoch`. The public getter `steps_per_epoch` always returns an int -- the actual length of the dataloader (if `_steps_per_epoch` is None), or the artificially reduced length (if `_steps_per_epoch` is not None).
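
For illustration, here is a minimal, self-contained sketch of the field-plus-property pattern being described. It is not the actual Composer `State` code: `dataloader_len` is a stand-in for the real dataloader, and the defaults are made up for the example.

```python
from dataclasses import dataclass, field
from typing import Optional


@dataclass
class State:
    dataloader_len: int = 100  # stand-in for len(self.train_dataloader)
    steps_per_epoch: Optional[int] = None  # type: ignore (shadowed by the property below)
    _steps_per_epoch: Optional[int] = field(init=False, repr=False, default=None)

    @property  # type: ignore
    def steps_per_epoch(self) -> int:
        # the getter always returns an int: the real dataloader length,
        # or the artificially reduced length if one was set
        if self._steps_per_epoch is None:
            return self.dataloader_len
        return self._steps_per_epoch

    @steps_per_epoch.setter
    def steps_per_epoch(self, value: Optional[int]) -> None:
        # the generated __init__ assigns to steps_per_epoch, which routes through
        # this setter; when no value is passed, the dataclass default is the
        # class-level property object itself, so treat that as "not set"
        if isinstance(value, property):
            value = None
        self._steps_per_epoch = value


state = State()                    # falls back to the dataloader length
assert state.steps_per_epoch == 100
state = State(steps_per_epoch=10)  # or override it via __init__
assert state.steps_per_epoch == 10
```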

Contributor

It seems like there isn't a reason so far to pass `steps_per_epoch` to `__init__` when constructing `State`, so what about removing it for now and adding it back if a need arises? Just because it's confusing as is.

Contributor

@ajaysaini725 left a comment

LGTM 👍


@ravi-mosaicml merged commit 6d8498c into mosaicml:dev on Dec 8, 2021
@ravi-mosaicml deleted the ravi/i135 branch on December 8, 2021 00:13
coryMosaicML pushed a commit to coryMosaicML/composer that referenced this pull request Feb 23, 2022
…ize` (mosaicml#137)

1. Remove `subset_num_batches` from the dataset hparams. Synthetic datasets should instead use the length of the real dataset as their size, or have a configurable size.
2. Add `train_subset_num_batches` and `eval_subset_num_batches` to the trainer hparams.
3. Add a check in the trainer that, if this field is set, ensures `DatasetHparams.shuffle` is `False`, or otherwise emits a warning that every epoch may use a different subset of samples (see the sketch below).
4. Rename `total_batch_size` to `train_batch_size` and update the hparams.
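
As an illustration of item 3, here is a hypothetical sketch of such a check. The function and argument names are made up for this example and are not Composer's actual trainer code.

```python
import warnings
from typing import Optional


def warn_if_shuffled_subset(subset_num_batches: Optional[int], shuffle: bool) -> None:
    # hypothetical helper: training on only a subset of batches per epoch while the
    # dataset is shuffled means each epoch may draw a different subset of samples
    if subset_num_batches is not None and shuffle:
        warnings.warn("subset_num_batches is set and the dataset is shuffled; "
                      "every epoch may use a different subset of samples.")
```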