
Commit 41617a7

Updates Gemma3n MLP layer
Updates the Gemma3n MLP layer to use the intermediate size specified for each layer, fixing an issue where the whole per-layer intermediate_size sequence, rather than the value for the given layer, was passed to every layer's MLP. Also renames Gemma3nMLP to Gemma3nTextMLP.

Signed-off-by: Xinyuan Tong <[email protected]>
1 parent 69183f8 commit 41617a7

File tree

1 file changed (+4, -3 lines)


python/sglang/srt/models/gemma3n_causal.py

Lines changed: 4 additions & 3 deletions
@@ -62,7 +62,7 @@ class Gemma3nTextScaledWordEmbedding(Gemma3TextScaledWordEmbedding):
     pass
 
 
-class Gemma3nMLP(nn.Module):
+class Gemma3nTextMLP(nn.Module):
     def __init__(
         self,
         hidden_size: int,
@@ -514,10 +514,11 @@ def __init__(
             prefix=add_prefix("self_attn", prefix),
         )
 
+        intermediate_size = config.intermediate_size[layer_id]
         activation_sparsity = config.activation_sparsity_pattern[layer_id]
-        self.mlp = Gemma3nMLP(
+        self.mlp = Gemma3nTextMLP(
             hidden_size=self.hidden_size,
-            intermediate_size=config.intermediate_size,
+            intermediate_size=intermediate_size,
             hidden_activation=config.hidden_activation,
             activation_sparsity=activation_sparsity,
             quant_config=quant_config,
