Today, truncation only limits the number of tokens. It does not enforce the hard character limit (250*n_tokens). This limit seems high, but requests still fail on some datasets (NQ, for example, probably because of URLs contained in the documents).
Motivation
Handling the character limit as part of truncation would avoid having to keep truncation logic outside the endpoint.
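For illustration, here is a minimal sketch of the kind of client-side workaround this currently requires. It assumes the hard limit is 250 characters per allowed token, as described above; the function name `clip_to_char_limit` and the `max_tokens` parameter are hypothetical and not part of any existing API. It also assumes the server counts characters the same way Python does (Unicode code points), which may not hold.

```python
# Factor taken from the hard limit described in this issue (250 * n_tokens).
CHARS_PER_TOKEN_LIMIT = 250


def clip_to_char_limit(text: str, max_tokens: int) -> str:
    """Cut `text` so that it never exceeds 250 * max_tokens characters.

    Hypothetical helper: `max_tokens` is assumed to be the token limit
    that the endpoint's truncation already applies.
    """
    max_chars = CHARS_PER_TOKEN_LIMIT * max_tokens
    # Slicing is a no-op when the text is already short enough.
    return text[:max_chars]
```

Each document would be passed through this helper before being sent with truncation enabled, so the token-level truncation on the server only ever sees inputs under the character limit.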
Your contribution
I could try to open a PR, but sadly I'm not fluent in Rust.