Conversation

reeselevine (Collaborator):
In my previous PR (#16357), I missed cherry-picking a commit and ended up without the code that actually enables soft_max (oops). This PR adds it to supports_op and encode_node, and also fixes a potential bug in rms_norm where it was using the wrong offset into the tensor.

github-actions bot added the "ggml" label (changes relating to the ggml tensor library for machine learning) on Oct 3, 2025.
CISC (Collaborator) commented Oct 3, 2025:

Mind you, there appears to be an issue with SOFT_MAX:
https://github.com/ggml-org/llama.cpp/actions/runs/18212653903/job/51856062582?pr=16400#step:7:31662

reeselevine (Collaborator, Author):
Fixed the soft_max shader and added a temporary fix that blocks on each WebGPU queue submission until I figure out a more efficient way to wait for futures to complete.

@reeselevine reeselevine merged commit 3526657 into ggml-org:master Oct 5, 2025
66 of 68 checks passed