Commit 0d1b1fa

Fridge003 authored and lifuhuang committed
Update batch size limitation of dsv3_router_gemm kernel to 16 (#8051)
1 parent f0ecff7 commit 0d1b1fa

1 file changed (+1, -2 lines)

python/sglang/srt/models/deepseek_v2.py

Lines changed: 1 addition & 2 deletions
@@ -252,8 +252,7 @@ def forward(self, hidden_states):
         # NOTE: For some unknown reason, router_gemm seems degrade accept length.
         if (
             _is_cuda
-            and not self.is_nextn
-            and hidden_states.shape[0] < 4
+            and hidden_states.shape[0] <= 16
             and hidden_states.shape[1] == 7168
             and self.weight.shape[0] == 256
             and _device_sm >= 90
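
The effect of this change on kernel selection can be sketched as a pair of standalone predicates. This is an illustrative sketch only: `uses_router_gemm_before` and `uses_router_gemm_after` are hypothetical helper names that do not appear in the repository, their parameters stand in for `hidden_states.shape`, `self.weight.shape[0]`, `_is_cuda`, `_device_sm`, and `self.is_nextn`, and they cover only the conditions visible in the hunk above (the real `if` may continue past the last line shown).

```python
def uses_router_gemm_before(num_tokens, hidden_size, num_experts,
                            is_cuda, device_sm, is_nextn):
    """Hypothetical mirror of the visible condition before this commit."""
    return (
        is_cuda
        and not is_nextn            # NextN layers were excluded
        and num_tokens < 4          # batch size limited to 1..3
        and hidden_size == 7168
        and num_experts == 256
        and device_sm >= 90
    )


def uses_router_gemm_after(num_tokens, hidden_size, num_experts,
                           is_cuda, device_sm):
    """Hypothetical mirror of the visible condition after this commit."""
    return (
        is_cuda
        and num_tokens <= 16        # batch size limit raised to 16
        and hidden_size == 7168
        and num_experts == 256
        and device_sm >= 90
    )


if __name__ == "__main__":
    # Compare both predicates for a few batch sizes on an SM90 CUDA device.
    for n in (1, 3, 4, 16, 17):
        before = uses_router_gemm_before(n, 7168, 256, True, 90, False)
        after = uses_router_gemm_after(n, 7168, 256, True, 90)
        print(n, before, after)
```

Under these assumptions, batch sizes 4 through 16 now satisfy the visible condition where they previously did not, and the `is_nextn` flag no longer affects the outcome.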
