LLM Engineer Interview Questions: Fine-Tuning, LoRA, QLoRA, PEFT, and Instruction Tuning

How do you prepare data for fine-tuning a chat model?

Format data using the model's chat template, which structures messages as alternating user and assistant turns with role markers. Each example should be a complete conversation, not just a single response. Apply the same template at inference time so inputs match the training distribution. Hugging Face tokenizers provide an apply_chat_template method to handle this consistently across model families.
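As a sketch of what a chat template produces, the hand-rolled formatter below renders role-marked turns in a ChatML-style layout (the marker strings are an assumption for illustration; several model families use this convention, but in practice you would call the tokenizer's apply_chat_template so the template matches the model exactly):

```python
# Minimal sketch of chat-template formatting (ChatML-style role markers).
# Assumption: the <|im_start|>/<|im_end|> tokens shown here are illustrative;
# real fine-tuning should use tokenizer.apply_chat_template from transformers
# so the rendered string matches the model's own template.

def format_chatml(messages):
    """Render a list of {role, content} dicts into one training string."""
    parts = []
    for msg in messages:
        parts.append(f"<|im_start|>{msg['role']}\n{msg['content']}<|im_end|>\n")
    return "".join(parts)

# A complete conversation, not just a single response:
example = [
    {"role": "user", "content": "What is LoRA?"},
    {"role": "assistant", "content": "A parameter-efficient fine-tuning method."},
]
print(format_chatml(example))
```

At inference time the same formatter would be applied to the prompt (typically with a trailing assistant marker to cue generation), keeping inputs in the training distribution.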