LLM Engineer Interview Questions: Inference Optimization, KV Cache, Speculative Decoding, Quantization

What is prompt caching?
Prompt caching reuses the KV cache computed for the prefix of a previous request when a new request shares that same prefix. This skips redundant prefill computation, dramatically reducing time-to-first-token latency and cost for repeated system prompts or long shared contexts. OpenAI, Anthropic, and Google all support prompt caching in their APIs as of 2026.
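The prefix-reuse idea can be sketched as a toy cache keyed by token prefixes. This is a simplified illustration with hypothetical names, not any provider's implementation; production servers (e.g. vLLM's prefix caching) typically hash fixed-size token blocks rather than scanning whole prefixes.

```python
from dataclasses import dataclass, field


@dataclass
class PromptCache:
    """Toy model of prefix-based prompt caching (illustrative only).

    Maps a tuple of prompt tokens to a stand-in for its KV cache.
    On a new request, we find the longest cached prefix and only
    'prefill' (compute attention keys/values for) the remainder.
    """
    store: dict = field(default_factory=dict)

    def longest_cached_prefix(self, tokens):
        # Linear scan from longest to shortest prefix; real systems
        # use block hashing to make this lookup cheap.
        for end in range(len(tokens), 0, -1):
            key = tuple(tokens[:end])
            if key in self.store:
                return key
        return ()

    def prefill(self, tokens):
        """Return how many tokens actually required prefill compute."""
        prefix = self.longest_cached_prefix(tokens)
        computed = len(tokens) - len(prefix)
        # Store the full prompt's KV cache for future requests.
        self.store[tuple(tokens)] = object()
        return computed


cache = PromptCache()
system_prompt = list(range(1000))                 # long shared system prompt
cold = cache.prefill(system_prompt + [1, 2, 3])   # no cache hit: all tokens computed
warm = cache.prefill(system_prompt + [1, 2, 3, 4])  # prefix hit: only the new token
```

Here the second request computes prefill for just 1 token instead of 1004, which is why prompt caching pays off most for long, stable prefixes followed by short varying suffixes.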
---
docs.anthropic.com