
Conversation

@krammnic (Collaborator) commented Jun 6, 2025

Context

What is the purpose of this PR? Is it to

  • add a new feature
  • fix a bug
  • update tests and/or documentation
  • other (please add here)

Please link to any issues this PR addresses.

Changelog

What are the changes made in this PR?

  • tool calling support + modifications for the HFBase/HFModel tokenizers

Test plan

Please make sure to do each of the following if applicable to your PR. If you're unsure about any one of these, just ask and we will happily help. We also have a contributing page for guidance on contributing.

  • run pre-commit hooks and linters (make sure you've first installed via pre-commit install)
  • add unit tests for any new functionality
  • update docstrings for any new or updated methods or classes
  • run unit tests via pytest tests
  • run recipe tests via pytest tests -m integration_test
  • manually run any new or modified recipes with sufficient proof of correctness
  • include relevant commands and any other artifacts in this summary (pastes of loss curves, eval results, etc.)

UX

If your function changed a public API, please add a dummy example of what the user experience will look like when calling it.
Here is a docstring example and a tutorial example.

  • I did not change any public API
  • I have added an example to docs or docstrings

pytorch-bot (bot) commented Jun 6, 2025

🔗 Helpful Links

🧪 See artifacts and rendered test results at hud.pytorch.org/pr/pytorch/torchtune/2794

Note: Links to docs will display an error until the docs builds have been completed.

This comment was automatically generated by Dr. CI and updates every 15 minutes.

@facebook-github-bot added the "CLA Signed" label (Jun 6, 2025)
@krammnic changed the title from "[WIP] Tool calling support in the torchtune" to "[WIP] Proper tool calling support in the torchtune" (Jun 6, 2025)
@kamalojasv181 commented:

Hey @krammnic, does this support tool calls for all formats (like openai, sharegpt, etc.)?

@krammnic (Collaborator, Author) replied:

It's still WIP, but yes, it will.

@ebsmothers (Contributor) left a review:

Thanks for working on this! I left a few small comments. We should also add a test to ensure that this actually works and generates the expected outputs on a tool-calling dataset.

token_ids = self.tokenizer.encode(text).ids
if add_bos and not self.hf_adds_bos and self.bos_token not in text:

# Both bos_id and eos_id might be None (null). Therefore, we need an additional check.
Contributor:

Is this related to tool-calling? Or a separate issue?

Collaborator (Author):

It's caused by a separate issue in HuggingFaceBaseTokenizer.

Comment on lines 93 to 97
try:
self.bos_token = self._get_token_from_config(self.config, "bos_token")
self.eos_token = self._get_token_from_config(self.config, "eos_token")
except ValueError:
pass
Contributor:

In this case I wonder whether we should just modify _get_token_from_config to directly return None (possibly logging a warning) rather than use this try/except
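A minimal sketch of that suggestion (hypothetical; it assumes the helper keeps its (config, key) signature and that tokens appear in the config either as plain strings or as {"content": ...} dicts):

import logging

logger = logging.getLogger(__name__)

def _get_token_from_config(self, config: dict, key: str):
    """Return the token string for `key`, or None (with a warning) if missing."""
    token = config.get(key)
    # HF tokenizer configs store special tokens either as plain strings
    # or as dicts like {"content": "<eos>", ...}
    if isinstance(token, dict):
        token = token.get("content")
    if token is None:
        logger.warning(f"Could not read '{key}' from tokenizer config; returning None.")
    return token

The try/except above would then collapse to two plain assignments.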

masked (bool): whether the message is masked in the sample. If True, do not use
in loss calculation. Default: False
ipython (bool): whether the message is a tool call. Default: False
tool_calls (Optional[list]): list of tool calls related to this message. Default: None
Contributor:

Should we also update the role "ipython" to "tool" to match what's done by Hugging Face?
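For reference, Hugging Face chat templates expect tool results under the role "tool"; an illustrative conversation (function name and payload invented) looks like:

messages = [
    {"role": "user", "content": "What's the weather in Paris?"},
    {
        "role": "assistant",
        "tool_calls": [
            {
                "type": "function",
                "function": {"name": "get_weather", "arguments": {"city": "Paris"}},
            }
        ],
    },
    # the tool output comes back as role "tool", not "ipython"
    {"role": "tool", "content": '{"temperature_c": 21}'},
]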

Collaborator (Author):

Yes, good catch, that argument seemed weird to me.

[project.optional-dependencies]
dev = [
"bitsandbytes>=0.43.0",
# "bitsandbytes>=0.43.0",
Collaborator:

Was this an intentional removal?

Collaborator (Author):

Nope :/ I have to comment it out on Mac (otherwise it won't install) and forgot to remove the change. I'll open a separate PR to address this.

masked=d.get("masked", False),
ipython=d.get("ipython", False),
tool_calls=d.get("tool_calls", []),
tool=d.get("tool", False),
Collaborator:

While I agree with this change, it currently breaks existing tokenizers. Repro:

from torchtune.datasets import alpaca_cleaned_dataset
from torchtune.models.qwen2_5 import qwen2_5_tokenizer

vocab_path = "/tmp/Qwen2.5-14B-Instruct/vocab.json"
merges_path = "/tmp/Qwen2.5-14B-Instruct/merges.txt"
tokenizer_json_path = "/tmp/Qwen2.5-14B-Instruct/tokenizer.json"
tokenizer_config_path = "/tmp/Qwen2.5-14B-Instruct/tokenizer_config.json"

tokenizer_qwen = qwen2_5_tokenizer(
    path=vocab_path,
    merges_file=merges_path,
    max_seq_len=512
)
dataset_qwen = alpaca_cleaned_dataset(tokenizer=tokenizer_qwen, packed=False)

Collaborator (Author):

Hmm, yep. We need to remove ipython everywhere.

@nathan-az (Collaborator) commented Jul 9, 2025

@krammnic This is great progress. It looks like a bit of work needs to be done on backward compatibility with existing tokenizers (unless the plan is to fully deprecate them). In addition, I'm seeing some issues with the jinja rendering; I think it may require explicitly passing tools to the renderer (hf ref).

Repro:

from torchtune.datasets import alpaca_cleaned_dataset
from torchtune.modules.transforms.tokenizers import HuggingFaceModelTokenizer


tokenizer_json_path = "/tmp/Qwen2.5-14B-Instruct/tokenizer.json"
tokenizer_config_path = "/tmp/Qwen2.5-14B-Instruct/tokenizer_config.json"

tokenizer_hf = HuggingFaceModelTokenizer(
    tokenizer_json_path=tokenizer_json_path,
    tokenizer_config_json_path=tokenizer_config_path,
    max_seq_len=512,
)
dataset_hf = alpaca_cleaned_dataset(tokenizer=tokenizer_hf, packed=True)

Basically, optionally propagating tools might solve this; see the basic sketch below. I haven't tested beyond seeing that the above repro runs, and haven't checked correctness.
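A sketch of that idea as a standalone jinja2 snippet (render_chat and its arguments are illustrative, not the actual torchtune code):

from jinja2 import Environment

def render_chat(template_str: str, messages: list, tools: list | None = None) -> str:
    """Render an HF-style chat template, forwarding `tools` only when given."""
    template = Environment().from_string(template_str)
    kwargs = {"messages": messages, "add_generation_prompt": False}
    if tools is not None:
        # Templates that branch on `tools` (e.g. Qwen's) see the real list;
        # templates that never reference it are unaffected.
        kwargs["tools"] = tools
    return template.render(**kwargs)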

"role": m.role,
"content": m.content[0]["content"],
"tool_calls": m.tool_calls,
}
Collaborator:

I had some issues specifically with the LLaMA 3(.3) tokenizer here, which didn't play nicely with empty tool calls [] or None. Ended up replacing with:

            current_messages = [
                {
                    "role": m.role,
                    "content": m.content[0]["content"],
                    **({"tool_calls": m.tool_calls} if m.tool_calls is not None else {})
                }
                for m in messages[: i + 1]
            ]

I'm not sure if this is the correct logic though (or if this plays nicely with all tokenizers)

Collaborator (Author):

This works fine with other tokenizers! Thanks

Collaborator:

For me this broke the Qwen coder tokenizer (sigh). I think removing strict undefined may have fixed it? I don't recall exactly. I kept trying components from the transformers jinja compilation until I found something that worked, but I don't know much about how jinja works; there may be a better solution.
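If it helps: transformers compiles chat templates with jinja2's StrictUndefined, which raises the moment a template touches a variable the caller never passed, even inside an {% if %} test, whereas the default Undefined just evaluates as falsy. That would explain why dropping strict undefined unblocked the template (illustrative snippet, not the actual fix):

from jinja2 import Environment, StrictUndefined
from jinja2.exceptions import UndefinedError

template_str = "{% if tools %}Tools: {{ tools }}{% endif %}Hi"

# Default Undefined: the missing `tools` is falsy, so the branch is skipped.
print(Environment().from_string(template_str).render())  # -> "Hi"

# StrictUndefined: even the boolean test on `tools` raises.
try:
    Environment(undefined=StrictUndefined).from_string(template_str).render()
except UndefinedError as exc:
    print(f"strict render failed: {exc}")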

@krammnic (Collaborator, Author) commented Aug 3, 2025

@nathan-az Hey! We might want to merge this. I will introduce some final changes tomorrow and then we will be able to merge. Thanks for the patience and comprehensive review.

@nathan-az (Collaborator) replied:

Sounds good. Given the issues I had with other tokenizers, I think it would be good to add tests with a couple of the popular models' tokenizers to confirm that:

  1. there are no issues with their templates, and
  2. the tokenization is correct (or correct enough)

Up to you whether we want to put their tokenizers directly in the torchtune source or add transformers as a test-only dependency. I'm not sure if we have a precedent for this (but it would also cut down the size of torchtune, which may be good, and only slightly slow the CI).
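A hypothetical sketch of such a test, assuming transformers as a test-only dependency (the HuggingFaceModelTokenizer arguments mirror the repros above; the exact assertion would likely need per-model tuning):

import pytest

transformers = pytest.importorskip("transformers")

from torchtune.data import Message
from torchtune.modules.transforms.tokenizers import HuggingFaceModelTokenizer


@pytest.mark.parametrize("model_id", ["Qwen/Qwen2.5-14B-Instruct"])
def test_matches_reference_tokenizer(tmp_path, model_id):
    # Materialize tokenizer.json / tokenizer_config.json for the torchtune side.
    ref = transformers.AutoTokenizer.from_pretrained(model_id)
    ref.save_pretrained(tmp_path)

    tok = HuggingFaceModelTokenizer(
        tokenizer_json_path=str(tmp_path / "tokenizer.json"),
        tokenizer_config_json_path=str(tmp_path / "tokenizer_config.json"),
        max_seq_len=512,
    )

    raw = [{"role": "user", "content": "What is 2 + 2?"}]
    tokens, _ = tok.tokenize_messages([Message(**m) for m in raw])

    # "Correct enough": the rendered prompt should match the reference template.
    expected = ref.apply_chat_template(raw, tokenize=True)
    assert tokens[: len(expected)] == expected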
