Pull requests: NVIDIA/TransformerEngine
[JAX] Load modules during initialize for Norm and Act primitives
#2219
opened Sep 30, 2025 by
jberchtold-nvidia
8 of 13 tasks
[JAX] Fix rng_state shape in fused attention
#2217
opened Sep 30, 2025 by
phu0ngng
7 of 13 tasks
[Draft][PyTorch][MOE] Support NVFP4 Grouped Linear
#2215
opened Sep 30, 2025 by
zhongbozhu
1 of 17 tasks
[JAX][Draft] Async issuing D2H memcpy for grouped_gemm group_sizes array
#2213
opened Sep 29, 2025 by
huanghua1994
Draft
6 of 13 tasks
[Build] fix: TE installation failed to find uv-installed cuDNN libraries
#2207
opened Sep 28, 2025 by
KivenChen
8 tasks done
Test to see if SWA and Causal compute can be removed from seqlens and…
#2201
opened Sep 25, 2025 by
KshitijLakhani
Draft
13 tasks
Honor COMPACT data_format for FP8 blockwise scales in MoE up-projection path to remove 5× redundant rowwise_scale_inv.T.contiguous() passes
#2199
opened Sep 24, 2025 by
xiaoxi-wangfj
2 of 13 tasks
[PyTorch] fix int32 overflow in permute kernels
#2196
opened Sep 23, 2025 by
hxbai
1 of 13 tasks
[PyTorch] Add max_score support for MuonClip
2.9.0
#2195
opened Sep 22, 2025 by
cyanguwa
8 of 13 tasks
[Feature] Enable rope application with offsets for training
2.9.0
#2188
opened Sep 19, 2025 by
sudhakarsingh27
1 of 13 tasks
Context Parallel integration tests with a transformer layer: BSHD and THD + CP
2.9.0
#2176
opened Sep 16, 2025 by
jomitchellnv
7 of 13 tasks
blockwise fp8 weight memory optimization: on-demand columnwise fp8 weight creation
#2168
opened Sep 10, 2025 by
skydoorkai
7 of 13 tasks
[Pytorch] Support for Swiglu Activation used in GPT OSS
#2161
opened Sep 8, 2025 by
vthumbe1503
8 of 13 tasks
[Common][Pytorch] Add support for the FP8 Block Scaling (ie. Deepseek) recipe on Blackwell
#2157
opened Sep 5, 2025 by
janekb04
5 of 13 tasks
[Common][PyTorch][Rework] PDL for Quantization
#2150
opened Sep 4, 2025 by
yaox12
1 of 13 tasks
[main][feature][under updating]adapt for offload activation
#2145
opened Sep 2, 2025 by
GeYuhong
1 of 13 tasks
[PyTorch] Add record_stream and untyped_storage func op in QuantizedTensor
#2144
opened Sep 2, 2025 by
xiaoxi-wangfj
1 of 13 tasks