mirror of
https://github.com/unslothai/unsloth
synced 2026-04-21 13:37:39 +00:00
BUG: fix _fix_chat_template for ChatML templates missing add_generation_prompt (#4426)
Fixes #4150.

Pre-PR, `_fix_chat_template` only patched templates where a trailing `{{ ... }}` expression followed the last `{% endfor %}`. ChatML templates (Hermes, Magnum, Phi-4, etc.) that end cleanly at `{% endfor %}` with no generation-prompt block were left unchanged, so the outer `fix_chat_template` raised:

```
RuntimeError: Unsloth: The tokenizer `...` does not have a {% if add_generation_prompt %} for generation purposes.
```

This commonly shows up when a downstream tool (LlamaFactory, Axolotl) re-serializes the tokenizer during LoRA save and strips the generation-prompt block.

This PR adds a second branch to `_fix_chat_template` that fires when:

- the content after the last `{% endfor %}` is empty modulo Jinja `{# ... #}` comments,
- the scrubbed template contains `<|im_start|>` and `<|im_end|>`,
- and the scrubbed template does not already mention `add_generation_prompt`.

The assistant-turn separator is inferred from the template itself (preferring an explicit `'<|im_start|>assistant<sep>'` literal, then the unique `message['role'] + '<sep>'` from role concatenations, then `<|im_sep|>` for Phi-4-mini mixed-separator templates, then `\n`), so Phi-4-style templates are not silently corrupted with the wrong separator.

Verified against the existing chat-template corpus:

- Hermes-3, Magnum-v2, Phi-4-mini, Phi-4 multi-sep, ChatML with trailing whitespace, ChatML with trailing Jinja comment, dot-access `message.role`, split-literal `'<|im_start|>assistant'`: all repaired with the correct assistant prefix.
- Already-fixed ChatML templates: idempotent NOP.
- Trap templates with `<|im_start|>` only inside a Jinja comment: correctly not rewritten.
- Llama-3, Gemma-3, Qwen2.5 (non-ChatML): byte-identical.
- Mistral family (5 models including Mistral-Nemo, Mistral-Small-24B, Mixtral): byte-identical, protected both by the structural guard (no ChatML tokens) and the existing name-based exemption in `load_correct_tokenizer`.
- Qwen family (14 models including Qwen2.5, Qwen3, Qwen3-Coder, QwQ, VL, Math, Qwen3-Guard): byte-identical.

End-to-end reproduction: Hermes-3 LoRA SFT, save with stripped chat_template, reload. Pre-PR code path raises the RuntimeError above. Post-PR reload loads cleanly, patches the template at load time, and `apply_chat_template(add_generation_prompt=True)` produces the correct `<|im_start|>assistant\n` prefix.
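The separator-inference order described above can be condensed into a standalone sketch. This is an illustrative reimplementation, not the patch itself; the function name `infer_assistant_separator` is invented for this example:

```python
import re

def infer_assistant_separator(chat_template: str) -> str:
    """Sketch of the heuristic: pick the assistant-turn separator for a
    ChatML template, trying the most specific evidence first."""
    # Scrub Jinja {# ... #} comments so commented-out tokens cannot vote.
    scrubbed = re.sub(r"\{#.*?#\}", "", chat_template, flags=re.DOTALL)
    # 1) An explicit '<|im_start|>assistant<sep>' string literal wins.
    assistant_match = re.search(
        r"""(['"])<\|im_start\|>assistant([^'"]*)\1""", scrubbed
    )
    if assistant_match is not None and assistant_match.group(2):
        return assistant_match.group(2)
    # 2) Else the unique literal concatenated after message['role'] / message.role.
    role_seps = [
        m.group(2)
        for m in re.finditer(
            r"""message(?:\[['"]role['"]\]|\.role)\s*\+\s*(['"])([^'"]*)\1""",
            scrubbed,
        )
    ]
    unique_role_seps = list(dict.fromkeys(role_seps))
    if len(unique_role_seps) == 1:
        return unique_role_seps[0]
    # 3) Phi-4-mini mixes '\n' (system) with '<|im_sep|>' (user/assistant),
    # so a bare '<|im_sep|>' occurrence picks that separator.
    if "<|im_sep|>" in scrubbed:
        return "<|im_sep|>"
    # 4) Default: a Jinja-escaped newline (literal backslash-n in the template).
    return "\\n"
```

A template using `{{ '<|im_start|>' + message['role'] + '\n' ... }}` yields `'\n'` via rule 2, while a Phi-4-mini-style template with no role concatenation falls through to `'<|im_sep|>'` via rule 3.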
This commit is contained in:
parent a4d4dfe4ac
commit 14ab6fbfae

1 changed file with 48 additions and 0 deletions
@@ -677,6 +677,54 @@ def _fix_chat_template(chat_template):
        )
        chat_template = chat_template[: where + len(chosen_end)] + after_endfor

    elif re.sub(r"\{#.*?#\}", "", after_endfor, flags = re.DOTALL).strip() == "":
        # GH#4150: ChatML templates ending at {% endfor %} without an
        # add_generation_prompt block. Scrub Jinja `{# ... #}` comments so
        # tokens inside comments cannot fool the guard below.
        scrubbed = re.sub(r"\{#.*?#\}", "", chat_template, flags = re.DOTALL)
        if (
            "<|im_start|>" in scrubbed
            and "<|im_end|>" in scrubbed
            and "add_generation_prompt" not in scrubbed
        ):
            # Infer the assistant-turn separator. Prefer an explicit
            # '<|im_start|>assistant<sep>' literal; else the unique
            # `message['role'] + '<sep>'` from role concatenations; else
            # '<|im_sep|>' if present (Phi-4-mini uses '\n' for system and
            # '<|im_sep|>' for user/assistant); else '\n'.
            assistant_match = re.search(
                r"""(['"])<\|im_start\|>assistant([^'"]*)\1""",
                scrubbed,
            )
            role_seps = [
                m.group(2)
                for m in re.finditer(
                    r"""message(?:\[['"]role['"]\]|\.role)\s*\+\s*(['"])([^'"]*)\1""",
                    scrubbed,
                )
            ]
            unique_role_seps = list(dict.fromkeys(role_seps))
            if assistant_match is not None and assistant_match.group(2):
                separator = assistant_match.group(2)
            elif len(unique_role_seps) == 1:
                separator = unique_role_seps[0]
            elif "<|im_sep|>" in scrubbed:
                separator = "<|im_sep|>"
            else:
                separator = "\\n"
            # Emit a double-quoted Jinja literal so a single quote in the
            # separator cannot break the block. Drop trailing whitespace/
            # comments after endfor: they would render as stray output
            # after the generation prefix.
            assistant_prefix = "<|im_start|>assistant" + separator
            generation_block = (
                "{%" + dash + " if add_generation_prompt %}"
                '{{ "' + assistant_prefix.replace('"', '\\"') + '" }}'
                "{%" + dash + " endif %}"
            )
            chat_template = chat_template[: where + len(chosen_end)] + generation_block

    return chat_template
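The guard-then-append flow of the new branch can be exercised end to end with a condensed, self-contained sketch. For brevity it hard-codes the `'\n'` separator instead of inferring it, and the function name `fix_chatml_template` is illustrative, not from the patch:

```python
import re

# A Hermes-style ChatML template with the generation-prompt block stripped.
CHATML = (
    "{% for message in messages %}"
    "{{ '<|im_start|>' + message['role'] + '\n' + message['content']"
    " + '<|im_end|>' + '\n' }}"
    "{% endfor %}"
)

def fix_chatml_template(chat_template: str, dash: str = "") -> str:
    endfor = "{% endfor %}"
    where = chat_template.rfind(endfor)
    if where == -1:
        return chat_template
    # The branch only fires when nothing but whitespace / Jinja comments
    # follows the last {% endfor %}.
    after_endfor = chat_template[where + len(endfor):]
    if re.sub(r"\{#.*?#\}", "", after_endfor, flags=re.DOTALL).strip() != "":
        return chat_template
    # Scrub comments so ChatML tokens inside {# ... #} cannot fool the guard.
    scrubbed = re.sub(r"\{#.*?#\}", "", chat_template, flags=re.DOTALL)
    if (
        "<|im_start|>" not in scrubbed
        or "<|im_end|>" not in scrubbed
        or "add_generation_prompt" in scrubbed
    ):
        return chat_template
    # Hard-coded '\n' separator (the real patch infers it from the template).
    generation_block = (
        "{%" + dash + " if add_generation_prompt %}"
        '{{ "<|im_start|>assistant\\n" }}'
        "{%" + dash + " endif %}"
    )
    # Drop anything trailing the endfor so nothing renders after the prefix.
    return chat_template[: where + len(endfor)] + generation_block
```

Running `fix_chatml_template(CHATML)` appends the `{% if add_generation_prompt %}` block; re-running it on the result is a NOP (the appended block makes the guard fail), and a trap template with `<|im_start|>` only inside a Jinja comment is left untouched.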