WoosukKwon

Should be merged after #53

This PR adds support for the bfloat16 data type, which is used by some LLMs, including Dolly V2.
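
For readers unfamiliar with the format: bfloat16 keeps float32's sign bit and 8 exponent bits but only 7 mantissa bits, so a bfloat16 value corresponds to the upper 16 bits of a float32. The sketch below illustrates that layout using only the Python standard library. It is not code from this PR; the helper names are hypothetical, and it truncates the low bits for simplicity, whereas real conversions typically round.

```python
import struct


def float32_to_bfloat16_bits(x: float) -> int:
    """Return the 16-bit bfloat16 pattern for x (hypothetical helper).

    Simplification: the low 16 bits of the float32 encoding are
    truncated rather than rounded.
    """
    (bits,) = struct.unpack("<I", struct.pack("<f", x))
    return bits >> 16


def bfloat16_bits_to_float(b: int) -> float:
    """Expand a 16-bit bfloat16 pattern back to a Python float (hypothetical helper)."""
    (value,) = struct.unpack("<f", struct.pack("<I", (b & 0xFFFF) << 16))
    return value


if __name__ == "__main__":
    for x in (1.0, 3.14159265, 65504.0, 1e30):
        b = float32_to_bfloat16_bits(x)
        print(f"{x!r:>14} -> 0x{b:04X} -> {bfloat16_bits_to_float(b)!r}")
```

Because the exponent field is as wide as float32's, a value such as 1e30 stays finite in bfloat16, whereas it would overflow in float16; the cost is reduced precision.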

WoosukKwon merged commit e070829 into main on May 3, 2023
WoosukKwon deleted the support-bfloat16 branch on May 3, 2023 21:09
hongxiayang pushed a commit to hongxiayang/vllm that referenced this pull request Feb 13, 2024
yukavio pushed a commit to yukavio/vllm that referenced this pull request Jul 3, 2024
SUMMARY
* `yapf` format a couple of test files

TEST PLAN:
ran `yapf` in-place locally to get the files updated.
dllehr-amd pushed a commit to dllehr-amd/vllm that referenced this pull request Jul 22, 2024
* adds wvSpltK optimization for skinny gemm.


---------

Co-authored-by: Hashem Hashemi <[email protected]>
JHLEE17 pushed a commit to JHLEE17/vllm that referenced this pull request Aug 1, 2024
alixiaodi mentioned this pull request Aug 2, 2024
heheda12345 pushed a commit to heheda12345/vllm that referenced this pull request Sep 29, 2025
* Pad flashmla_sparse to 128 on blackwell

* adjust get_max_prefill_buffer_size

* change comments