* fix(responses): Add image generation support for Responses API
Fixes #16227
## Problem
When using Gemini 2.5 Flash Image with /responses endpoint, image generation
outputs were not being returned correctly. The response contained only text
with empty content instead of the generated images.
## Solution
1. Created new `OutputImageGenerationCall` type for image generation outputs
2. Modified `_extract_message_output_items()` to detect images in completion responses
3. Added `_extract_image_generation_output_items()` to transform images from
completion format (data URL) to responses format (pure base64)
4. Added `_extract_base64_from_data_url()` helper to extract base64 from data URLs
5. Updated `ResponsesAPIResponse.output` type to include `OutputImageGenerationCall`
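For illustration, a minimal sketch of the helper from step 4 (hypothetical body; assumes standard `data:<mime>;base64,<payload>` inputs):
```python
def _extract_base64_from_data_url(data_url: str) -> str:
    """Strip the 'data:<mime>;base64,' prefix, returning only the raw base64 payload."""
    if data_url.startswith("data:") and "," in data_url:
        # "data:image/png;base64,iVBORw0KGgo..." -> "iVBORw0KGgo..."
        return data_url.split(",", 1)[1]
    # Already bare base64 (no data URL wrapper) -- return unchanged
    return data_url
```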
## Changes
- litellm/types/responses/main.py: Added OutputImageGenerationCall type
- litellm/types/llms/openai.py: Updated ResponsesAPIResponse.output type
- litellm/responses/litellm_completion_transformation/transformation.py:
Added image detection and extraction logic
- tests/test_litellm/responses/litellm_completion_transformation/test_image_generation_output.py:
Added comprehensive unit tests (16 tests, all passing)
## Result
/responses endpoint now correctly returns:
```json
{
  "output": [
    {
      "type": "image_generation_call",
      "id": "..._img_0",
      "status": "completed",
      "result": "iVBORw0KGgo..." // Pure base64, no "data:" prefix
    }
  ]
}
```
This matches OpenAI Responses API specification where image generation
outputs have type "image_generation_call" with base64 data in "result" field.
* docs(responses): Add image generation documentation and tests
- Add comprehensive image generation documentation to response_api.md
- Include examples for Gemini (no tools param) and OpenAI (with tools param), as sketched below
- Document response format and base64 handling
- Add supported models table with provider-specific requirements
- Add unit tests for image generation output transformation
  - Test base64 extraction from data URLs
  - Test image generation output item creation
  - Test status mapping and integration scenarios
  - Verify proper transformation from completions to responses format
Related to #16227
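A hedged sketch of the two call styles the docs add (model names are illustrative; the `tools` entry follows the OpenAI Responses API image-generation convention):
```python
import litellm

# Gemini: image-capable model, no tools param needed
resp = litellm.responses(
    model="gemini/gemini-2.5-flash-image",  # illustrative model name
    input="Generate an image of a sunset over mountains",
)

# OpenAI: image generation is requested via the tools param
resp = litellm.responses(
    model="openai/gpt-4o",  # illustrative model name
    input="Generate an image of a sunset over mountains",
    tools=[{"type": "image_generation"}],
)
```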
* fix(responses): Correct status type for image generation output
- Add _map_finish_reason_to_image_generation_status() helper function
- Fix MyPy type error: OutputImageGenerationCall.status only accepts
['in_progress', 'completed', 'incomplete', 'failed'], not the full
ResponsesAPIStatus union which includes 'cancelled' and 'queued'
Fixes MyPy error in transformation.py:838
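A hypothetical sketch of the mapping (the narrowed return type is what satisfies MyPy):
```python
from typing import Literal

ImageGenerationStatus = Literal["in_progress", "completed", "incomplete", "failed"]

def _map_finish_reason_to_image_generation_status(
    finish_reason: str,
) -> ImageGenerationStatus:
    """Narrow a completion finish_reason to the statuses
    OutputImageGenerationCall.status accepts."""
    if finish_reason == "stop":
        return "completed"
    if finish_reason in ("length", "content_filter"):
        return "incomplete"
    return "in_progress"
```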
* fix: respect image token breakdown in cost calculation when text_tokens=0
When Gemini image generation models return `text_tokens=0` with `image_tokens > 0`,
the cost calculator was assuming no token breakdown existed and treating all
completion tokens as text tokens, resulting in ~10x underestimation of costs.
Changes:
- Fix cost calculation logic to respect token breakdown when image/audio/reasoning
tokens are present, even if text_tokens=0
- Add `output_cost_per_image_token` pricing for gemini-3-pro-image-preview models
- Add test case reproducing the issue
- Add documentation explaining image token pricing
Fixes #17410
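A simplified sketch of the corrected breakdown logic (hypothetical names; rates come from the model's pricing entry):
```python
def _completion_cost_from_breakdown(usage, pricing: dict) -> float:
    """Trust the token breakdown whenever any typed output tokens are
    reported -- previously text_tokens == 0 was misread as 'no breakdown',
    so image tokens were billed at the much cheaper text rate."""
    text = getattr(usage, "text_tokens", 0) or 0
    image = getattr(usage, "image_tokens", 0) or 0
    audio = getattr(usage, "audio_tokens", 0) or 0
    reasoning = getattr(usage, "reasoning_tokens", 0) or 0

    if not any([text, image, audio, reasoning]):
        # Genuinely no breakdown: bill all completion tokens at the text rate
        return usage.completion_tokens * pricing["output_cost_per_token"]

    text_rate = pricing["output_cost_per_token"]
    return (
        text * text_rate
        + image * pricing.get("output_cost_per_image_token", text_rate)
        + audio * pricing.get("output_cost_per_audio_token", text_rate)
        + reasoning * text_rate
    )
```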
* fix: resolve code quality issues from ruff linter
- Fix duplicate imports in anthropic guardrail handler
  - Remove duplicate AllAnthropicToolsValues import
  - Remove duplicate ChatCompletionToolParam import
- Remove unused variable 'tools' in guardrail handler
- Replace print statement with proper logging in json_loader
  - Use verbose_logger.warning() instead of print()
- Remove unused imports
  - Remove _update_metadata_field from team_endpoints
  - Remove unused ChatCompletionToolCallChunk imports from transformation
- Refactor update_team function to reduce complexity (PLR0915)
  - Extract budget_duration handling into _set_budget_reset_at() helper (sketched below)
  - Minimal refactoring to reduce function from 51 to 50 statements
All ruff linter errors resolved. Fixes F811, F841, T201, F401, and PLR0915 errors.
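An illustrative sketch of the extracted helper (hypothetical signature; assumes a "<n>d" duration format for brevity, where a fuller parser would handle more units):
```python
from datetime import datetime, timedelta, timezone
from typing import Optional

def _set_budget_reset_at(budget_duration: Optional[str], update_data: dict) -> None:
    """If a budget_duration is set, stamp when the budget next resets."""
    if budget_duration is None:
        return
    days = int(budget_duration.rstrip("d"))  # e.g. "30d" -> 30
    update_data["budget_reset_at"] = datetime.now(timezone.utc) + timedelta(days=days)
```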
* docs: add missing environment variables to documentation
Add 8 missing environment variables to the environment variables reference section:
- AIOHTTP_CONNECTOR_LIMIT_PER_HOST: Connection limit per host for aiohttp connector
- AUDIO_SPEECH_CHUNK_SIZE: Chunk size for audio speech processing
- CYBERARK_SSL_VERIFY: Flag to enable/disable SSL certificate verification for CyberArk
- LITELLM_DD_AGENT_HOST: Hostname or IP of DataDog agent for LiteLLM-specific logging
- LITELLM_DD_AGENT_PORT: Port of DataDog agent for LiteLLM-specific log intake
- WANDB_API_KEY: API key for Weights & Biases (W&B) logging integration
- WANDB_HOST: Host URL for Weights & Biases (W&B) service
- WANDB_PROJECT_ID: Project ID for Weights & Biases (W&B) logging integration
Fixes test_env_keys.py test that was failing due to undocumented environment variables.
* fix(generic_guardrail_api.py): add 'structured_messages' support
allows the guardrail provider to know whether text came from the system or the user
* fix(generic_guardrail_api.md): document 'structured_messages' parameter
gives the API provider a way to distinguish between user and system messages
* feat(anthropic/): return openai chat completion format structured messages when calls made via `/v1/messages` on Anthropic
* feat(responses/guardrail_translation): support 'structured_messages' param for guardrails
structured openai chat completion spec messages, for guardrail checks when using /v1/responses api
allows guardrail checks to work consistently across APIs
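For illustration, `structured_messages` carries standard OpenAI chat-completion-spec messages (example payload is hypothetical):
```python
structured_messages = [
    {"role": "system", "content": "You are a helpful assistant."},
    {"role": "user", "content": "What's the weather in Paris?"},
]
# The guardrail provider can apply different policies to system-authored
# vs. user-authored text instead of receiving one undifferentiated string.
```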
* fix(unified_guardrail.py): correctly map a v1/messages call to the anthropic unified guardrail
* fix: add more rigorous call type checks
* fix(anthropic_endpoints/endpoints.py): initialize logging object at the beginning of endpoint
ensures call id + trace id are emitted to guardrail api
* feat(anthropic/chat/guardrail_translation): support streaming guardrails
samples guardrail checks every 5 chunks
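A minimal sketch of the sampling approach (hypothetical names; checks the accumulated text every 5 chunks, with a final catch-up check):
```python
SAMPLE_EVERY_N_CHUNKS = 5

async def stream_with_guardrail(chunks, run_guardrail_check):
    """Yield chunks through while sampling guardrail checks."""
    collected = []
    count = 0
    async for chunk in chunks:
        collected.append(chunk)
        count += 1
        if count % SAMPLE_EVERY_N_CHUNKS == 0:
            # Sampled check: cheaper than calling the guardrail per chunk
            await run_guardrail_check(collected)
        yield chunk
    if count % SAMPLE_EVERY_N_CHUNKS != 0:
        # Cover any trailing chunks since the last sampled check
        await run_guardrail_check(collected)
```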
* fix(openai/chat/guardrail_translation): support openai streaming guardrails
* fix: initial commit fixing output guardrails for responses api
* feat(openai/responses/guardrail_translation): handler.py - fix output checks on responses api
* fix(openai/responses/guardrail_translation/handler.py): ensure responses api guardrails work on streaming
* test: update tests
* test: update tests
* fix: support multiple kinds of input to the guardrail api
* feat(guardrail_translation/handler.py): support extracting tool calls from openai chat completions for guardrail api's
* feat(generic_guardrail_api.py): support extracting + returning modified tool calls on generic_guardrails_api
allows the guardrail API to analyze tool calls before they are sent to the provider
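A hedged sketch of the extraction (hypothetical helper; assumes the standard `tool_calls` field on assistant messages):
```python
from typing import List

def extract_tool_calls(messages: List[dict]) -> List[dict]:
    """Collect tool calls so the guardrail API can inspect function
    names and arguments before they reach the provider."""
    tool_calls = []
    for message in messages:
        for call in message.get("tool_calls") or []:
            tool_calls.append(
                {
                    "id": call.get("id"),
                    "name": call.get("function", {}).get("name"),
                    "arguments": call.get("function", {}).get("arguments"),
                }
            )
    return tool_calls
```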
* fix(guardrails.py): support anthropic /v1/messages tool calls
* feat(responses_api/): extract tool calls for guardrail processing
* docs(generic_guardrail_api.md): document tools param support
* docs: generic_guardrail_api.md
improve documentation
* docs: add Agent Lightning framework
Add Agent Lightning, Microsoft's open-source framework for training
AI agents with RL, APO, and SFT. Uses LiteLLM Proxy for LLM routing
and trace collection.
Both frameworks integrate with LiteLLM:
- Google ADK uses LiteLLM for model-agnostic agent building
- Harbor uses LiteLLM for agent evaluation across providers
* docs: update getting started page
- Add Core Functions table with link to full list
- Add Responses API section
- Add Async section with acompletion() example
- Add "Switch Providers with One Line" example
- Clarify Basic Usage supports multiple endpoints
- Update models to current versions (openai/gpt-4o, anthropic/claude-sonnet-4)
- Use provider/model format throughout
- Fix deprecated import: from openai.error -> from openai
- Keep original structure: community key, More details links, observability env vars
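The added examples follow this shape (model strings are the ones the docs now use):
```python
import asyncio
import litellm

# Same call shape for every provider -- switching is a one-line model change
resp = litellm.completion(
    model="openai/gpt-4o",
    messages=[{"role": "user", "content": "Hello!"}],
)
resp = litellm.completion(
    model="anthropic/claude-sonnet-4",
    messages=[{"role": "user", "content": "Hello!"}],
)

# Async variant via acompletion()
async def main():
    return await litellm.acompletion(
        model="openai/gpt-4o",
        messages=[{"role": "user", "content": "Hello!"}],
    )

asyncio.run(main())
```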
* Cleanup: Remove orphan docs pages and Docusaurus template files
- Remove orphan getting_started.md (not linked in sidebar)
- Remove Docusaurus template intro.md
- Remove tutorial-basics/ directory (Docusaurus template)
- Remove tutorial-extras/ directory (Docusaurus template)
* docs vertex tts
* place vertex ai types in file
* use VertexAITextToSpeechConfig
* use vertex_voice_dict
* refactor docs
* docs vertex ai chirp
* TestVertexAITextToSpeechConfig
* new provider vertex ai chirp3
* test_litellm_speech_vertex_ai_chirp
* add vertex_ai/chirp cost tracking
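A hedged usage sketch for the new provider (voice name is illustrative; `litellm.speech()` mirrors the OpenAI audio.speech call shape):
```python
import litellm

response = litellm.speech(
    model="vertex_ai/chirp",           # provider route added here
    input="Hello from Chirp on Vertex AI",
    voice="en-US-Chirp3-HD-Charon",    # illustrative Chirp 3 voice name
)
response.stream_to_file("speech.mp3")  # write the returned audio bytes
```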
* docs: add Azure AI Foundry documentation for Claude models
Add documentation explaining how to use Claude models (Sonnet 4.5,
Haiku 4.5, Opus 4.1) deployed on Azure AI Foundry with LiteLLM.
Azure exposes Claude using Anthropic's native API, so users can use
the existing anthropic/ provider with their Azure endpoint.
Closes #17066
* docs: Add alternative method for Azure AI Foundry using anthropic/ provider
Document that users can use anthropic/ provider with Azure endpoint
as an alternative to the dedicated azure_ai/ provider.
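A sketch of the alternative method (endpoint and model names are placeholders):
```python
import litellm

response = litellm.completion(
    model="anthropic/claude-sonnet-4-5",                       # placeholder model name
    api_base="https://<your-resource>.services.ai.azure.com",  # placeholder Azure AI Foundry endpoint
    api_key="<azure-api-key>",
    messages=[{"role": "user", "content": "Hello!"}],
)
```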