Provider-Aware Prompt Caching
Superdav AI Agent v1.12.0 introduces provider-aware prompt caching, which reduces API costs and latency by using each LLM provider's own prompt-caching mechanism. Caching mechanisms and configuration options differ from provider to provider.
Overview
Prompt caching allows you to:
- Cache large, frequently-used prompts
- Reduce API costs by avoiding redundant processing
- Improve latency for cached requests
- Manage cache lifecycle explicitly
Different providers implement caching differently:
- Google Gemini: cachedContents API
- Azure OpenAI: Prompt caching with TTL
- OpenRouter: Provider-specific caching
- Vertex Anthropic: Prompt caching with cache control
Google Gemini: cachedContents API
Google Gemini provides explicit cache management via the cachedContents API.
Configuration
$config = [
    'provider' => 'google-gemini',
    'model' => 'gemini-2.0-flash',
    'caching' => [
        'enabled' => true,
        'ttl' => 3600, // 1 hour in seconds
        'max_tokens' => 1000000, // Max tokens to cache
    ],
];
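Because a mistyped `ttl` or `max_tokens` value may only surface once a request is sent, it can help to sanity-check the array first. The following is a minimal sketch, not part of the Superdav AI Agent API: `validateCachingConfig` is a hypothetical helper, and the rules it applies (boolean `enabled`, positive integer `ttl` and `max_tokens`) are assumptions inferred from the example above.

```php
<?php
declare(strict_types=1);

/**
 * Hypothetical helper (not provided by Superdav AI Agent).
 * Returns a list of problems found in the 'caching' section;
 * an empty list means the section passed these assumed checks.
 */
function validateCachingConfig(array $config): array
{
    $caching = $config['caching'] ?? null;
    if (!is_array($caching)) {
        return ["missing 'caching' section"];
    }

    $errors = [];
    if (!is_bool($caching['enabled'] ?? null)) {
        $errors[] = "'enabled' must be a boolean";
    }
    // Assumption: both values are expected to be positive integers.
    foreach (['ttl', 'max_tokens'] as $key) {
        $value = $caching[$key] ?? null;
        if (!is_int($value) || $value <= 0) {
            $errors[] = "'$key' must be a positive integer";
        }
    }
    return $errors;
}

$config = [
    'provider' => 'google-gemini',
    'model' => 'gemini-2.0-flash',
    'caching' => [
        'enabled' => true,
        'ttl' => 3600,           // seconds
        'max_tokens' => 1000000, // max tokens to cache
    ],
];

$errors = validateCachingConfig($config);
echo $errors === []
    ? "caching config looks valid\n"
    : implode("\n", $errors) . "\n";
```

Returning a list of messages rather than throwing on the first problem lets a caller report every issue in one pass.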