* ggerganov/master: (86 commits)
server : fix building and simplify lib deps on Windows (ggml-org#1772)
talk-llama : sync llama.cpp
talk-llama : llama.cpp
sync : ggml
metal : correctly set SIMD support flags on iOS (llama/4923)
2-bit quantizations (llama/4897)
scripts : sync-ggml-am.sh add option to skip commits
talk-llama : sync llama.cpp
sync : ggml
examples : adapt to metal API
ggml: cache sin/cos for RoPE (llama/4908)
metal : remove old API (llama/4919)
metal : disable log for loaded kernels (llama/4794)
gguf : fix potential infinite for-loop (llama/4600)
metal : refactor kernel loading code (llama/4794)
CUDA: faster q8_0 -> f16 dequantization (llama/4895)
talk-llama : add optional CLI arg to set the bot name (ggml-org#1764)
examples : add python example for transcription (ggml-org#1744)
whisper : load the model into multiple buffers of max size 1GB (ggml-org#1763)
talk-llama : sync llama.cpp
...