v0.6.1
What's Changed
- Fix gemma3 forward with skip_logits by @BitPhinix in #795
- Update README.md by @PKUWZP in #808
- Fix minor typo by @hugoabonizio in #809
- Update README.md by @PKUWZP in #811
- Fix embedding benchmarks for backward pass by @Manan17 in #799
- Add an option to update benchmark results for previous commits by @Manan17 in #791
- [Model] Liger support for SmolLM3 by @edbeeching in #798
- FusedAddRMSNorm: Fused residual addition and RMS Norm by @vaibhavjindal in #812
- Skip smollm3 tests in tests-bwd by @vaibhavjindal in #821
- Layernorm enhancement by @Manan17 in #815
- Update README.md by @PKUWZP in #823
- Update index.md by @PKUWZP in #824
- Remove smollm3 import at top of file by @vaibhavjindal in #825
- Fix illegal memory access in Triton RMSNorm kernel by casting program_id to int64 by @vvvdwbvvv in #804
- fix(benchmark): move chunked loss module init out of measurements by @Tcc0403 in #643
- [XPU] Fix multiple num_warps parameters being passed in by @YangKai0616 in #831
- Automate benchmarking for every release by @Manan17 in #828
- Revert "Bug Fix: name patching for modules" by @vaibhavjindal in #833
- Bug fixes in patching module by @vaibhavjindal in #834
- docs(README): fix gpumode discord badge by @Tcc0403 in #835
- Update pyproject.toml version to 0.6.1 by @shimizust in #838
New Contributors
- @BitPhinix made their first contribution in #795
- @PKUWZP made their first contribution in #808
- @hugoabonizio made their first contribution in #809
- @edbeeching made their first contribution in #798
Full Changelog: v0.6.0...v0.6.1