Remove unexpected kwargs passing to flce #651
Conversation
Signed-off-by: Tcc0403 <[email protected]>
Investigating why the gemma3 multimodal test failed. @eljandoubi should we relax the tolerance further?
target,
reduction=reduction,
ignore_index=ignore_index,
**kwargs,
Overall looks good; I don't understand why we are not passing kwargs to liger flce anymore, though.
Mainly because `**kwargs` might contain `FlashAttentionKwargs`, we might accidentally pass them to flce when users set `_attn_implementation` to `flash_attention_2`.

We already capture the required args by declaring their names explicitly in the function signature, so the rest isn't needed. flce has many features (weight, label_smoothing, z-loss, ...) that I thought were good to have, but most of them aren't supported in transformers, so I'm just removing them for now in case future changes to transformers break things again.
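
A minimal, self-contained sketch of the failure mode and the fix described above; the function names and signatures are illustrative stand-ins, not the actual Liger-Kernel or transformers code:

```python
# Illustrative sketch only: stand-ins for the real APIs discussed in this PR.

def flce(hidden, weight, target, ignore_index=-100, reduction="mean"):
    """Stand-in for a fused linear cross entropy (flce) call with a fixed signature."""
    return 0.0


def loss_forwarding_kwargs(hidden, weight, target, ignore_index=-100, **kwargs):
    # Risky: with _attn_implementation="flash_attention_2", **kwargs may carry
    # FlashAttentionKwargs entries (e.g. cu_seq_lens_q), which flce does not accept.
    return flce(hidden, weight, target, ignore_index=ignore_index, **kwargs)


def loss_explicit_args(hidden, weight, target, ignore_index=-100, **kwargs):
    # Safer: only explicitly named arguments reach flce; everything else in
    # **kwargs is absorbed here and never forwarded.
    return flce(hidden, weight, target, ignore_index=ignore_index)


# loss_forwarding_kwargs(h, w, t, cu_seq_lens_q=...)  -> TypeError: unexpected keyword argument
# loss_explicit_args(h, w, t, cu_seq_lens_q=...)      -> runs fine; the extra kwarg is dropped
```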
Cool, thanks for the explanation
lgtm!
Summary
Resolves #650

`**kwargs` might contain `FlashAttentionKwargs`, so we should avoid passing it to flce directly. This PR also adopts the changes to `ForCausalLMLoss` and `fixed_cross_entropy` from huggingface/transformers, and renames `softcap` to `final_logit_softcapping` to match the naming.
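
As a hedged illustration of the rename (the helper name and signature below are assumptions, not the actual Liger-Kernel code), the renamed argument carries the usual tanh soft-capping value:

```python
import torch

# Assumed sketch: the argument formerly called `softcap` is now
# `final_logit_softcapping`, matching the Gemma-style config naming.
def soft_cap_logits(logits: torch.Tensor, final_logit_softcapping: float | None) -> torch.Tensor:
    if final_logit_softcapping is None:
        return logits
    # Standard tanh soft-capping: squashes logits into (-cap, +cap).
    return final_logit_softcapping * torch.tanh(logits / final_logit_softcapping)
```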
Testing Done
- `make test` to ensure correctness
- `make checkstyle` to ensure code style
- `make test-convergence` to ensure convergence