feat: use the best compute layer available by default #175
Conversation
LGTM
LGTM
🎉 This PR is included in version 3.0.0-beta.13 🎉 The release is available on:
Your semantic-release bot 📦🚀
🎉 This PR is included in version 3.0.0 🎉 The release is available on:
Your semantic-release bot 📦🚀
Description of change
- `inspect` command
- `GemmaChatWrapper`
- `TemplateChatWrapper` - easier method to create simple chat wrappers, see the type docs for more info
- `llama.cpp` breaking change

Fixes #160
Fixes #169
How to use `node-llama-cpp` after this change
`node-llama-cpp` will now detect the available compute layers on the system and use the best one by default. If the best one fails to load, it'll try the next best option until it manages to load the bindings.
To use this logic, just use `getLlama` without specifying the compute layer:
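A minimal sketch of that usage, assuming an ESM module where top-level await is available:

```typescript
import {getLlama} from "node-llama-cpp";

// No `gpu` option is passed, so node-llama-cpp detects the compute layers
// available on this machine and loads the best one it can.
const llama = await getLlama();
```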
To force it to load a specific compute layer, you can use the `gpu` parameter on `getLlama`:
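A sketch of forcing a specific layer; `"cuda"` is only an illustrative value here, and loading will fail if that layer isn't available on the machine:

```typescript
import {getLlama} from "node-llama-cpp";

// Force a specific compute layer instead of relying on automatic detection.
const llama = await getLlama({
    gpu: "cuda"
});
```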
To inspect what compute layers are detected in your system, you can run this command:
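The exact invocation isn't shown here, but based on the new `inspect` command listed above it would presumably look something like this:

```shell
# List the compute layers node-llama-cpp detects on this machine
npx node-llama-cpp inspect gpu
```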
Pull-Request Checklist
- Code is up-to-date with the `master` branch
- `npm run format` to apply eslint formatting
- `npm run test` passes with this change
- This pull request links relevant issues as `Fixes #0000`