Background
Last week we issued an RFC for supporting frameworks other than PyTorch, with priority on supporting MindSpore on Ascend NPU: #7941.
After some discussion, we are publishing a tentative roadmap for 2025 H2.
Overall Design
- [July] Proof of concept of PyTorch/MindSpore coexistence
- [July] Design Documentation
Inter-Framework Compatibility
- [August] Tensor memory sharing through DLPack
- [August] Compatibility of PyTorch/MindSpore distributed environment
- [August] Resource sharing of PyTorch/MindSpore (stream reuse, memory pool, etc.)
MindSpore Model Support
- [August] Radix Attention support on NPU
- [August] Qwen3 dense models
- [September] Qwen3 MoE model
- [September] DeepSeek V3/R1 model family
SGLang Features
- [September] Combinations of data/tensor/pipeline/expert parallelism
- [September] Speculative Decoding
- [September] PD disaggregation
- [September] Quantization
- [September] LoRA
CI on Ascend NPU
- [September] CI tests
User / Developer Experience
- [August] Benchmark results and profiling tools
- [August] Docker image
- [September] Documentation (installation, quickstart, tutorials, etc.)
Long-Term Plans
- [Q4] Further optimizations and more MindSpore models
Comments and suggestions are welcome!