How Many Tokens Do You Need for Claude Prompt Caching to Work on System Prompts?

What’s the Minimum Token Length to Cache a Claude System Prompt in the Anthropic API?

Learn the minimum token requirement for caching a Claude system prompt (1024 tokens for supported models) and why shorter prompts won’t cache even with cache_control.

Question

You want to cache your system prompt. What’s the minimum requirement for caching to work?

A. You must make at least 5 requests
B. You must use extended thinking
C. The content must be under 500 tokens
D. The content must be at least 1024 tokens long

Answer

D. The content must be at least 1024 tokens long

Explanation

Prompt caching only takes effect when the cacheable portion of your prompt meets the model's minimum cacheable length. If the content is shorter, setting cache_control has no effect: the request is processed normally, without caching, and no error is raised. That minimum is model-dependent; 1024 tokens is the threshold for most supported Claude models and the one reflected in the answer choices, though some models (such as the Haiku family) require more.
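As a concrete illustration, here is a minimal sketch of a Messages API request payload that marks a system prompt as cacheable with `cache_control`. The `"cache_control": {"type": "ephemeral"}` field is the real Anthropic API mechanism; the model name, prompt text, and user message below are placeholder assumptions, not values from this article.

```python
# Sketch: a request payload with a cacheable system prompt.
# Only a rough illustration -- the model id and prompt text are placeholders.

# A long system prompt; caching only applies once this block meets the
# model's minimum cacheable length (1024 tokens for most supported models).
LONG_SYSTEM_PROMPT = "You are a meticulous technical reviewer. " * 300

request = {
    "model": "claude-3-5-sonnet-20241022",  # placeholder model id
    "max_tokens": 1024,
    "system": [
        {
            "type": "text",
            "text": LONG_SYSTEM_PROMPT,
            # If this block is below the minimum cacheable length,
            # the request still succeeds -- it is simply not cached.
            "cache_control": {"type": "ephemeral"},
        }
    ],
    "messages": [
        {"role": "user", "content": "Review this paragraph for clarity."}
    ],
}
```

The same payload shape is what an SDK call like `client.messages.create(**request)` would send; the key point is that `cache_control` sits on the system content block itself, not at the top level of the request.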