Provider-Aware Prompt Caching
Superdav AI Agent v1.12.0 introduces provider-aware prompt caching, which reduces API costs and latency by reusing prompt content that a provider has already processed. Each supported LLM provider exposes its own caching mechanism and configuration.
Overview
Prompt caching allows you to:
- Cache large, frequently-used prompts
- Reduce API costs by avoiding redundant processing
- Improve latency for cached requests
- Manage cache lifecycle explicitly
Different providers implement caching differently:
- Google Gemini: cachedContents API
- Azure OpenAI: Prompt caching with TTL
- OpenRouter: Provider-specific caching
- Vertex Anthropic: Prompt caching with cache control
Google Gemini: cachedContents API
Google Gemini provides explicit cache management via the cachedContents API.
Configuration
$config = [
    'provider' => 'google-gemini',
    'model'    => 'gemini-2.0-flash',
    'caching'  => [
        'enabled'    => true,
        'ttl'        => 3600,    // 1 hour in seconds
        'max_tokens' => 1000000, // Max tokens to cache
    ],
];
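Before handing this array to a provider, it can help to sanity-check the caching values. The following is a minimal sketch only: `validate_caching_config` is a hypothetical helper, not part of Superdav AI Agent, and it assumes that `ttl` and `max_tokens` must be positive integers, which the configuration above suggests but does not state.

```php
<?php
declare(strict_types=1);

/**
 * Hypothetical helper, not part of the Superdav AI Agent API.
 * Returns a list of problems found in the 'caching' section of a config array.
 * Assumption: 'ttl' and 'max_tokens' must be positive integers.
 */
function validate_caching_config( array $config ): array {
    $errors  = [];
    $caching = $config['caching'] ?? null;

    if ( ! is_array( $caching ) ) {
        return [ "Missing 'caching' section." ];
    }
    if ( ! is_bool( $caching['enabled'] ?? null ) ) {
        $errors[] = "'enabled' must be a boolean.";
    }
    foreach ( [ 'ttl', 'max_tokens' ] as $key ) {
        $value = $caching[ $key ] ?? null;
        if ( ! is_int( $value ) || $value <= 0 ) {
            $errors[] = "'{$key}' must be a positive integer.";
        }
    }
    return $errors;
}

$errors = validate_caching_config(
    [
        'provider' => 'google-gemini',
        'model'    => 'gemini-2.0-flash',
        'caching'  => [ 'enabled' => true, 'ttl' => 3600, 'max_tokens' => 1000000 ],
    ]
);
var_dump( $errors === [] ); // bool(true): the example config passes
```

Returning a list of messages rather than throwing lets the caller report every problem at once.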
Creating a Cached Prompt
use Superdav\AI\Providers\GoogleGemini;

$gemini = new GoogleGemini( $config );

$cached_content = $gemini->create_cached_content(
    [
        'system_prompt' => 'You are a helpful assistant...',
        'context'       => 'Large context document...',
        'ttl'           => 3600,
    ]
);
// Returns: ['cache_id' => 'abc123', 'expires_at' => timestamp]
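Because the result carries `expires_at`, a caller can check that a cache is still alive before sending its `cache_id`. The sketch below assumes `expires_at` is a Unix timestamp in seconds (the source only says "timestamp"). `is_cache_usable` and its safety margin are hypothetical and not part of the library.

```php
<?php
declare(strict_types=1);

/**
 * Hypothetical helper, not part of the Superdav AI Agent API.
 * Assumption: 'expires_at' is a Unix timestamp in seconds.
 * A cache counts as usable only if it outlives $now by at least $margin seconds,
 * so a request does not start against a cache that is about to expire.
 */
function is_cache_usable( array $cached_content, int $now, int $margin = 30 ): bool {
    if ( ! isset( $cached_content['cache_id'], $cached_content['expires_at'] ) ) {
        return false;
    }
    return ( (int) $cached_content['expires_at'] - $now ) >= $margin;
}

$now            = time();
$cached_content = [ 'cache_id' => 'abc123', 'expires_at' => $now + 3600 ];

var_dump( is_cache_usable( $cached_content, $now ) );        // bool(true)
var_dump( is_cache_usable( $cached_content, $now + 3590 ) ); // bool(false): only 10 s left
```

Passing `$now` in explicitly, instead of calling `time()` inside the helper, keeps the check easy to test.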
Using a Cached Prompt
$response = $gemini->generate(
    [
        'cache_id' => 'abc123',
        'prompt'   => 'User question here',
    ]
);
Cache Lifecycle
// List cached contents
$caches = $gemini->list_cached_contents();

// Get cache details
$cache = $gemini->get_cached_content( 'abc123' );

// Extend cache TTL
$gemini->update_cached_content(
    'abc123',
    ['ttl' => 7200] // Extend to 2 hours
);

// Delete cache
$gemini->delete_cached_content( 'abc123' );
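One way to combine these calls is to extend caches that are close to expiring. The sketch below only picks out the candidates. It assumes, without confirmation from the source, that `list_cached_contents()` returns an array of entries shaped like the `create_cached_content()` result (`cache_id` plus `expires_at` as a Unix timestamp in seconds). `caches_expiring_soon` is a hypothetical helper, not part of the library.

```php
<?php
declare(strict_types=1);

/**
 * Hypothetical helper, not part of the Superdav AI Agent API.
 * Assumption: each entry has 'cache_id' and 'expires_at' (Unix seconds).
 * Returns the IDs of caches that are still alive but expire within $window seconds.
 */
function caches_expiring_soon( array $caches, int $now, int $window ): array {
    $ids = [];
    foreach ( $caches as $cache ) {
        $remaining = (int) $cache['expires_at'] - $now;
        if ( $remaining > 0 && $remaining <= $window ) {
            $ids[] = $cache['cache_id'];
        }
    }
    return $ids;
}

$now    = 1700000000; // fixed timestamp so the example is deterministic
$caches = [
    [ 'cache_id' => 'abc123', 'expires_at' => $now + 120 ],  // expires in 2 minutes
    [ 'cache_id' => 'def456', 'expires_at' => $now + 3600 ], // expires in 1 hour
    [ 'cache_id' => 'ghi789', 'expires_at' => $now - 60 ],   // already expired
];

print_r( caches_expiring_soon( $caches, $now, 300 ) ); // only 'abc123' qualifies
```

Each returned ID could then be passed to `update_cached_content()` with a new `ttl`.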