Chat Templates With Generation Tags Currently Fail #2762

@nyxkrage

Description

Please check that this issue hasn't been reported before.

  • I searched previous Bug Reports and didn't find any similar reports.

Expected Behavior

For the `generation` and `endgeneration` tags to be ignored, or for the assistant masks they produce to be used in place of the default turn-finding logic.

See huggingface/transformers#30650 for more information about the tags.

Current behaviour

Axolotl crashes when trying to preprocess any `chat_template` dataset if the tokenizer's chat template includes the `generation` and `endgeneration` tags:

  File "/home/carsten/axolotl/src/axolotl/prompt_strategies/jinja_template_analyzer.py", line 64, in __init__
    self.ast: nodes.Node = self.env.parse(template)
  File "/home/carsten/axolotl/.venv/lib/python3.10/site-packages/jinja2/environment.py", line 613, in parse
    self.handle_exception(source=source)
  File "/home/carsten/axolotl/.venv/lib/python3.10/site-packages/jinja2/environment.py", line 939, in handle_exception
    raise rewrite_traceback_stack(source=source)
  File "<unknown>", line 1, in template
jinja2.exceptions.TemplateSyntaxError: Encountered unknown tag 'generation'. Jinja was looking for the following tags: 'elif' or 'else' or 'endif'. The innermost block that needs to be closed is 'if'.
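The failure can be reproduced with `jinja2` alone, since the tags are not part of standard Jinja. The sketch below shows the parse error and one possible workaround: a small Jinja extension that accepts the tags and keeps only their body. `GenerationTagExtension` and the toy template are made up for illustration; this is not Axolotl's or transformers' actual implementation.

```python
# Standalone sketch: reproduce the parse failure, then tolerate the tags.
# GenerationTagExtension is a hypothetical name; consuming the
# {% generation %}...{% endgeneration %} markers and emitting only the body
# is one plausible workaround, not Axolotl's actual fix.
from jinja2 import Environment, TemplateSyntaxError
from jinja2.ext import Extension

# Toy chat template using the generation tags (illustrative, not Phi-4's).
TEMPLATE = (
    "{% for m in messages %}"
    "{% if m.role == 'assistant' %}"
    "{% generation %}{{ m.content }}{% endgeneration %}"
    "{% else %}{{ m.content }}{% endif %}"
    "{% endfor %}"
)

# A plain Environment rejects the non-standard tag, as in the traceback above.
parse_error = None
try:
    Environment().parse(TEMPLATE)
except TemplateSyntaxError as e:
    parse_error = e.message  # "Encountered unknown tag 'generation'. ..."

class GenerationTagExtension(Extension):
    """Accept {% generation %} blocks and emit their body unchanged.

    A real implementation could instead record the rendered spans here to
    build the assistant masks the tags are meant to describe.
    """

    tags = {"generation"}

    def parse(self, parser):
        next(parser.stream)  # consume the 'generation' token
        # Parse up to and including {% endgeneration %}, dropping the end tag.
        body = parser.parse_statements(("name:endgeneration",), drop_needle=True)
        return body  # keep the body, discard the markers

env = Environment(extensions=[GenerationTagExtension])
ast = env.parse(TEMPLATE)  # now parses without error
out = env.from_string(TEMPLATE).render(
    messages=[
        {"role": "user", "content": "hi "},
        {"role": "assistant", "content": "hello"},
    ]
)
print(out)  # "hi hello"
```

transformers itself handles these tags with its own Jinja extension when building assistant token masks (see the PR linked above); mirroring that, or stripping the tags before analysis, in `jinja_template_analyzer.py` would avoid the crash.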

Steps to reproduce

Try to preprocess the given config.

Config yaml

base_model: microsoft/Phi-4-reasoning
model_type: AutoModelForCausalLM

datasets:
  - path: PocketDoc/Dans-Assistantmaxx-sonnetorca-subset
    type: chat_template
    field_messages: conversations
    message_property_mappings:
      role: from
      content: value
dataset_prepared_path: prepared_test
val_set_size: 0.0
output_dir: ./output

sequence_len: 4096
sample_packing: true
pad_to_sequence_len: true
eval_sample_packing: true

plugins:
  - axolotl.integrations.liger.LigerPlugin
  - axolotl.integrations.cut_cross_entropy.CutCrossEntropyPlugin
liger_rope: true
liger_rms_norm: true
liger_glu_activation: true

gradient_accumulation_steps: 1
micro_batch_size: 1
num_epochs: 1
optimizer: adamw_bnb_8bit
lr_scheduler: cosine
learning_rate: 1e-6

bf16: true
gradient_checkpointing: true # offload
resume_from_checkpoint:
logging_steps: 1
flash_attention: true

warmup_ratio: 0.03 
evals_per_epoch:
saves_per_epoch: 8
weight_decay: 0.0
special_tokens:

Possible solution

No response

Which Operating Systems are you using?

  • Linux
  • macOS
  • Windows

Python Version

3.10

axolotl branch-commit

cb03c76

Acknowledgements

  • My issue title is concise, descriptive, and in title casing.
  • I have searched the existing issues to make sure this bug has not been reported yet.
  • I am using the latest version of axolotl.
  • I have provided enough information for the maintainers to reproduce and diagnose the issue.

Labels

bug (Something isn't working)
