LLM Engineer Interview: Tokenization, BPE, SentencePiece, and Token Counting in Production
Audio flashcard · 0:20
Why does tokenization matter for cost and latency?
Tokenization directly affects both cost and latency: LLM APIs charge per token, and inference time scales with token count. A poorly formatted prompt can tokenize into two or three times as many tokens as a well-structured one. Languages other than English often tokenize less efficiently, which is why Chinese or Arabic prompts tend to cost more per character than English prompts.
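One reason for the per-character cost gap is that byte-level BPE tokenizers (the family used by GPT-style models) operate on UTF-8 bytes, and non-Latin scripts need more bytes per character. The sketch below uses UTF-8 byte counts as a rough proxy for tokenizer input size; the sample strings are illustrative, and for exact counts you would use a real tokenizer library such as `tiktoken`.

```python
# Rough proxy for why non-English text often produces more tokens:
# byte-level BPE starts from UTF-8 bytes, and CJK characters take
# three bytes each versus one byte for ASCII. Byte counts are NOT
# exact token counts -- they only show the raw-material difference.

def utf8_bytes_per_char(text: str) -> float:
    """Average number of UTF-8 bytes per character in the string."""
    return len(text.encode("utf-8")) / len(text)

english = "Hello, how are you today?"
chinese = "你今天好吗？"

print(utf8_bytes_per_char(english))  # 1.0 -- ASCII is one byte per char
print(utf8_bytes_per_char(chinese))  # 3.0 -- each CJK char is three bytes
```

In practice you would measure actual token counts with the provider's tokenizer (e.g. `tiktoken` for OpenAI models) before sending a prompt, since billing and context-window limits are both denominated in tokens, not characters.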
platform.openai.com