v0.5.6: Enhancements, Fixes, and Expanded Support (Paligemma, DyT, XPU, Llava, GRPO, and More!)
What's Changed
- [JSD] JSD fixes by @kashif in #609
- Paligemma support by @eljandoubi in #608
- Fix hidden size by @eljandoubi in #612
- Add loss_utils for rewriting lce_forward methods by @Tcc0403 in #614
- Update Star History URL by @ryankert01 in #616
- Update README.md by @shivam15s in #617
- Use Gemma 1 as the language model for PaliGemma 1 by @eljandoubi in #613
- Update README to reflect recent changes by @helloworld1 in #619
- Support Dynamic Tanh (DyT) by @Tcc0403 in #618
- Fix incorrect module name when monkey_patch applied to instantiated model by @vaibhavjindal in #629
- [chunked loss] align teacher and student logit shape by @yundai424 in #634
- Fix incorrect condition comment in log_target calculation by @p81sunshine in #633
- Add huggingface llava by @jp1924 in #524
- Fix Llava test-bwd failure by @jp1924 in #639
- Fix GRPO to conform with TRL: Fix loss, make tests accurate, correct metrics computation by @shivam15s and @mRSun15 in #628
- Add XPU tuning to CE by @mgrabban in #645
- Add XPU tuning to FLJSD by @mgrabban in #647
- Switch tests to ROCm 6.3 and adjust tolerances so Liger runs on AMD by @shivam15s in #646
- Update pyproject.toml by @shivam15s in #648
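
For readers new to Dynamic Tanh (DyT, added in #618), the operation is an element-wise `gamma * tanh(alpha * x) + beta`, used as a drop-in replacement for a normalization layer. The snippet below is a minimal plain-Python sketch of that forward computation only. It is not the kernel shipped in this release, and the function name `dyt_forward` and the example value `alpha=0.5` are illustrative assumptions.

```python
import math

def dyt_forward(x, alpha, gamma, beta):
    """Reference DyT forward for one feature vector.

    x, gamma, beta: lists of equal length (one entry per channel).
    alpha: a single scalar shared across all channels.
    """
    # Squash each scaled input with tanh, then apply the per-channel affine.
    return [g * math.tanh(alpha * xi) + b for xi, g, b in zip(x, gamma, beta)]

# Hypothetical inputs, chosen only to show the saturating behaviour.
hidden = [-4.0, 0.0, 4.0]
out = dyt_forward(hidden, alpha=0.5, gamma=[1.0, 1.0, 1.0], beta=[0.0, 0.0, 0.0])
print(out)
```

With unit `gamma` and zero `beta`, every output stays strictly inside (-1, 1), which is the bounded behaviour DyT relies on instead of computing activation statistics.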
New Contributors
- @eljandoubi made their first contribution in #608
- @p81sunshine made their first contribution in #633
Full Changelog: v0.5.5...v0.5.6