
Releases: mosaicml/composer

v0.32.1

26 Jul 00:26

What's Changed

Full Changelog: v0.32.0...v0.32.1

v0.32.0

15 Jul 21:58

What's Changed

New Contributors

Full Changelog: v0.31.0...v0.32.0

v0.31.0

28 May 17:30

What's New

1. PyTorch 2.7.0 Compatibility (#3850)

We've added support for PyTorch 2.7.0 and created a Dockerfile to support PyTorch 2.7.0 + CUDA 12.8. The current Composer image supports PyTorch 2.7.0 + CUDA 12.6.3.

2. Experimental FSDP2 support has been added to Trainer (#3852)

Experimental FSDP2 support was added to Trainer with:

  • auto_wrap based on _fsdp_wrap_fn and/or _fsdp_wrap attributes within the model (#3826); see the sketch below
  • Activation checkpointing and CPU offloading (#3832)
  • Meta initialization (#3852)

Note: Not all features are supported yet (e.g., automicrobatching, monolithic checkpointing).
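
Below is a minimal sketch of how a model might opt submodules into auto-wrapping. It assumes the FSDP2 auto_wrap path reads the same _fsdp_wrap and _fsdp_wrap_fn hooks named above; the class names are placeholders and exact semantics may differ.

import torch.nn as nn

class Block(nn.Module):
    def __init__(self, dim):
        super().__init__()
        self.linear = nn.Linear(dim, dim)
        # Mark this submodule so auto_wrap shards it as its own unit.
        self._fsdp_wrap = True

    def forward(self, x):
        return self.linear(x)

class MyModel(nn.Module):
    def __init__(self, dim=128, n_blocks=4):
        super().__init__()
        self.blocks = nn.Sequential(*[Block(dim) for _ in range(n_blocks)])

    # Alternatively, a wrap function on the model decides per submodule.
    def _fsdp_wrap_fn(self, module):
        return isinstance(module, Block)

    def forward(self, x):
        return self.blocks(x)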

Usage:

Add FSDP_VERSION=2 as an environment variable and set your FSDP2 config (parallelism_config) as desired. The full set of available attributes can be found in the FSDP2 configuration documentation.
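
A minimal sketch of the opt-in flow, assuming the environment variable is set before the Trainer is constructed; the parallelism_config keys shown are illustrative and model is a placeholder ComposerModel.

import os

os.environ['FSDP_VERSION'] = '2'  # opt in to the experimental FSDP2 path

from composer import Trainer

trainer = Trainer(
    model=model,  # placeholder: your ComposerModel
    parallelism_config={
        'fsdp': {
            'activation_checkpointing': True,  # illustrative keys; see the
            'activation_cpu_offload': True,    # FSDP2 config for the full set
        },
    },
)
trainer.fit()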

Bug Fixes

  • Resolve a memory hang issue in the MLflow monitor process (#3830)

What's Changed

New Contributors

Full Changelog: v0.30.0...v0.31.0

v0.30.0

04 Apr 20:22

What's New

1. Python 3.12 Bump (#3783)

We've added support for Python 3.12 and deprecated Python 3.9 support.

What's Changed

New Contributors

Full Changelog: v0.29.0...v0.30.0

v0.29.0

25 Feb 00:24

Deprecations

1. device_transforms param in DataSpec has been deprecated (#3770)

Composer no longer supports the device_transforms parameter in DataSpec. Instead, DataSpec supports batch_transforms for batch-level transformations on CPU and microbatch_transforms for microbatch-level transformations on the target device.
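
A minimal sketch of the replacement, assuming each transform is a callable that receives and returns a batch (on CPU) or a microbatch (on the target device); train_dataloader is a placeholder for your existing dataloader.

from composer.core import DataSpec

def normalize_batch(batch):
    # Runs on CPU, once per batch, before the batch is split into microbatches.
    inputs, targets = batch
    return inputs.float() / 255.0, targets

def augment_microbatch(microbatch):
    # Runs once per microbatch, after it has been moved to the target device.
    inputs, targets = microbatch
    return inputs, targets

train_data_spec = DataSpec(
    dataloader=train_dataloader,  # placeholder: your existing DataLoader
    batch_transforms=normalize_batch,
    microbatch_transforms=augment_microbatch,
)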

What's Changed

New Contributors

Full Changelog: v0.28.0...v0.29.0

v0.28.0

04 Dec 15:51

Deprecations

1. DeepSpeed Deprecation (#3732)

Composer no longer supports the DeepSpeed deep learning library. Support has shifted exclusively to PyTorch-native solutions such as FSDP and DDP. Please use Composer v0.27.0 or earlier to continue using DeepSpeed!

What's Changed

New Contributors

Full Changelog: v0.27.0...v0.28.0

v0.27.0

14 Nov 19:35

What's New

1. Torch 2.5.1 Compatibility (#3701)

We've added support for torch 2.5.1, including checkpointing bug fixes from PyTorch.

2. Add batch/microbatch transforms (#3703)

Sped up device transformations by performing batch transforms on CPU and microbatch transforms on GPU.

Deprecations and Breaking Changes

1. MLflow Metrics Deduplication (#3678)

We added a metric deduplication feature to the MLflow logger in Composer. Metrics that are unchanged since the last step are skipped unless certain conditions are met; by default, an unchanged metric is still logged once every 100 duplicated steps. This reduces redundant entries and logging storage while balancing detailed sampling with efficiency.

Example:

MLFlowLogger(..., log_duplicated_metric_every_n_steps=100)
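
For context, a sketch of wiring the logger into a Trainer; experiment_name and model are placeholder values.

from composer import Trainer
from composer.loggers import MLFlowLogger

mlflow_logger = MLFlowLogger(
    experiment_name='my-experiment',  # placeholder value
    log_duplicated_metric_every_n_steps=100,  # still log an unchanged metric every 100 steps
)

trainer = Trainer(
    model=model,  # placeholder: your ComposerModel
    loggers=[mlflow_logger],
)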

What's Changed

Full Changelog: v0.26.1...v0.27.0

v0.26.1

01 Nov 06:07

What's Changed

Full Changelog: v0.26.0...v0.26.1

v0.26.0

25 Oct 21:36

What's New

1. Torch 2.5.0 Compatibility (#3609)

We've added support for torch 2.5.0, including necessary patches to Torch.

Deprecations and Breaking Changes

1. FSDP Configuration Changes (#3681)

We no longer support passing fsdp_config and fsdp_auto_wrap directly to Trainer.

If you'd like to specify an FSDP config and configure FSDP auto wrapping, use parallelism_config instead:

trainer = Trainer(
    parallelism_config={
        'fsdp': {
            'auto_wrap': True,
            # ...
        },
    },
)

2. Removal of Pytorch Legacy Sharded Checkpoint Support (#3631)

PyTorch briefly used a different sharded checkpoint format than the current one, which was quickly deprecated by PyTorch. We have removed support for this format. We initially removed support for saving in this format in #2262, and the original feature was added in #1902. Please reach out if you have concerns or need help converting your checkpoints to the new format.

What's Changed

New Contributors

Full Changelog: v0.25.0...v0.26.0

v0.25.0

24 Sep 20:56

What's New

1. Torch 2.4.1 Compatibility (#3609)

We've added support for torch 2.4.1, including necessary patches to Torch.

Deprecations and breaking changes

1. Microbatch device movement (#3567)

Instead of moving the entire batch to device at once, we now move each microbatch to device. This saves memory for large inputs, e.g. multimodal data, when training with many microbatches.

This change may affect certain callbacks that run operations on the batch and require it to be on an accelerator ahead of time, such as the two changed in this PR. There shouldn't be many such callbacks, so we anticipate this change will be relatively safe.
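
A minimal sketch of the memory effect (not Composer's internal code): only one microbatch's inputs need to be resident on the device at a time.

import torch

device = torch.device('cuda' if torch.cuda.is_available() else 'cpu')

batch = torch.randn(64, 3, 224, 224)         # full batch stays on CPU
microbatches = torch.chunk(batch, chunks=8)  # e.g. 8 microbatches of 8 samples

# Previously the whole batch was moved to the accelerator up front; now each
# microbatch is transferred individually, so only one microbatch's inputs
# occupy device memory at a time.
for microbatch in microbatches:
    microbatch = microbatch.to(device)
    # ... forward/backward on this microbatch ...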

2. DeepSpeed deprecation version (#3634)

We have updated the Composer version in which DeepSpeed support will be removed to v0.27.0. Please reach out on GitHub if you have any concerns about this.

3. PyTorch legacy sharded checkpoint format

PyTorch briefly used a different sharded checkpoint format than the current one, which was quickly deprecated by PyTorch. We have continued to support loading legacy format checkpoints for a while, but we will likely be removing support for this format entirely in an upcoming release. We initially removed support for saving in this format in #2262, and the original feature was added in #1902. Please reach out if you have concerns or need help converting your checkpoints to the new format.

What's Changed

Full Changelog: v0.24.1...v0.25.0