Feat: Apertus model implementation #15852
Open: pwilkin wants to merge 36 commits into ggml-org:master from pwilkin:apertus-implementation
+1,081 −7
64add82 First attempt (pwilkin)
86cfc18 No permute during convert (fixes qk tensors), proper norm application. (pwilkin)
8c762a3 RoPE = NeoX (pwilkin)
ffdfd1d Coherence! (pwilkin)
74dcf89 Merge branch 'ggml-org:master' into apertus-implementation (pwilkin)
ab11d94 Migrate xielu params from tensors to hyperparameters (pwilkin)
1b18472 Simple CUDA kernel (pwilkin)
1606a3c Revert stupid LLM refactorings (pwilkin)
eec384f Chat template support (pwilkin)
b2a92d0 configchecker / flake8 errors (pwilkin)
ef9ef66 Reorder unary.cu (pwilkin)
d009194 I do conclude that LLMs are, in fact, stupid. (pwilkin)
73bd64f Merge branch 'master' into apertus-implementation (pwilkin)
28f9086 Fix after merge (pwilkin)
2f68c03 Final newline (pwilkin)
4294dbf Make xIELU an UNARY_OP (pwilkin)
b2aa4fb Final newline (pwilkin)
86a239c Correctly account for parameter shift (pwilkin)
bd19026 Argh. (pwilkin)
13cc3be Update ggml/src/ggml-cpu/unary-ops.cpp (pwilkin)
40f2f80 Refactor: remove unused methods, inline and factorize softplus, add c… (pwilkin)
dbe0ccb Merge branch 'master' into apertus-implementation (pwilkin)
0d36139 Revert CUDA changes, implement xIELU as a separate OP (pwilkin)
fc58d3b Pesky newline (pwilkin)
db9eb29 Add float2half / half2float for F16 inputs/outputs (pwilkin)
dc1e4d5 CUDA variants, attempt 2 (pwilkin)
6c843ce Actually, attempt 3 (pwilkin)
f11ab3c Update ggml/src/ggml-cuda/unary.cu (pwilkin)
57e2263 Missing convert header (pwilkin)
62401d8 Proper formula and reference for xIELU in the comments. (pwilkin)
58e6e0f Modify unary-ops.cpp to add the functor-based logic besides the templ… (pwilkin)
5a02bd4 Apply suggestions from code review (pwilkin)
90f052d Add tensor mappings for Apertus to global list instead (pwilkin)
6972404 Fix lazy on scalars (pwilkin)
d29cb45 Merge remote-tracking branch 'pwilkin/master' into apertus-implementa… (pwilkin)
2ca353b Update ggml/src/ggml-cuda/unary.cu (pwilkin)
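Several commits above revolve around the xIELU activation (softplus factoring, the parameter shift, moving the xIELU parameters into hyperparameters). For orientation, here is a minimal scalar sketch of xIELU in Python. It is not taken from this PR's diff: the formula is an assumption based on the commonly cited xIELU definition, and the names `alpha_n_raw` and `alpha_p_raw` as well as the default `beta` and `eps` values are hypothetical illustration choices.

```python
import math


def softplus(x: float) -> float:
    # Numerically stable log(1 + exp(x)).
    return max(x, 0.0) + math.log1p(math.exp(-abs(x)))


def xielu(x: float, alpha_n_raw: float, alpha_p_raw: float,
          beta: float = 0.5, eps: float = -1e-6) -> float:
    # ASSUMPTION: the raw parameters are mapped through softplus, and the
    # negative-side coefficient is additionally shifted by beta.
    alpha_p = softplus(alpha_p_raw)
    alpha_n = beta + softplus(alpha_n_raw)
    if x > 0.0:
        # Positive branch: quadratic term plus linear term.
        return alpha_p * x * x + beta * x
    # Non-positive branch: expm1 of x clamped to at most eps, minus x,
    # scaled by alpha_n, plus the same linear term.
    return alpha_n * (math.expm1(min(x, eps)) - x) + beta * x
```

Under these assumptions the two branches nearly meet at x = 0, where the function is approximately zero; the actual kernels in the PR should be checked against the reference the author cites in the code comments.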
Why add this now, didn't you work around this already?
Probably belongs in a separate PR.

No, it turns out the workaround that I had works if you use --remote, but not if you convert from a local repo.

It's a bugfix and I think it's too tiny for a separate PR anyway.

No such thing. :) It's better for visibility to have separate PRs, but since it's an issue specifically for this model it's probably ok to have it here.