[Feature] Allow async model loading and cancellation #699

In production environments, especially desktop apps, it's common to provide a button (or some other way) that lets users abort model loading. Fortunately, llama.cpp has already added support for this in https://github.com/ggerganov/llama.cpp/pull/4462. I think we should introduce this feature in LLamaSharp.

Similarly, async model loading is important for applications built on LLamaSharp, since it avoids blocking the main thread for a long time while a large model loads. I've found similar work in the node.js binding of llama.cpp: https://github.com/withcatai/node-llama-cpp/pull/178. We could also implement it by polling the progress callback.
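To make the idea concrete, here is a rough, language-agnostic sketch in Python. It is not LLamaSharp or llama.cpp code. `fake_load_model`, `load_model_async`, and `LoadCancelled` are hypothetical names, and the loader is simulated. It assumes the native loader reports progress through a callback and aborts when that callback returns false, which is my understanding of the llama.cpp PR above.

```python
import threading
import time
from concurrent.futures import ThreadPoolExecutor


class LoadCancelled(Exception):
    """Raised when loading stops because the callback asked for an abort."""


def fake_load_model(path, progress_callback, steps=20):
    # Hypothetical stand-in for the native loader: reports progress in (0, 1]
    # and aborts as soon as the callback returns False.
    for i in range(1, steps + 1):
        time.sleep(0.01)  # pretend to read a chunk of the weights
        if not progress_callback(i / steps):
            raise LoadCancelled(path)
    return f"model:{path}"


def load_model_async(executor, path, cancel_event, on_progress=None):
    # Runs the blocking load on a worker thread. The returned future
    # yields the model, or raises LoadCancelled if the event was set.
    def callback(progress):
        if on_progress is not None:
            on_progress(progress)
        return not cancel_event.is_set()  # False tells the loader to abort

    return executor.submit(fake_load_model, path, callback)


if __name__ == "__main__":
    with ThreadPoolExecutor(max_workers=1) as pool:
        cancel = threading.Event()
        future = load_model_async(pool, "model.gguf", cancel)
        cancel.set()  # e.g. the user clicked "Abort"
        try:
            future.result()
        except LoadCancelled:
            print("loading cancelled")
```

The calling thread never blocks on the load itself. It only sets the event, and the worker notices it on the next progress report. In C# the same shape would presumably map onto a `Task` plus a `CancellationToken`, but that mapping is an assumption and not an existing LLamaSharp API.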
