4856 Commits

Author SHA1 Message Date
Cesar Garcia
87f94172a9 fix(responses): Add image generation support for Responses API (#16586)
* fix(responses): Add image generation support for Responses API

Fixes #16227

## Problem
When using Gemini 2.5 Flash Image with the /responses endpoint, image generation
outputs were not returned correctly: the response contained only text with
empty content instead of the generated images.

## Solution
1. Created new `OutputImageGenerationCall` type for image generation outputs
2. Modified `_extract_message_output_items()` to detect images in completion responses
3. Added `_extract_image_generation_output_items()` to transform images from
   completion format (data URL) to responses format (pure base64)
4. Added `_extract_base64_from_data_url()` helper to extract base64 from data URLs
5. Updated `ResponsesAPIResponse.output` type to include `OutputImageGenerationCall`

## Changes
- litellm/types/responses/main.py: Added OutputImageGenerationCall type
- litellm/types/llms/openai.py: Updated ResponsesAPIResponse.output type
- litellm/responses/litellm_completion_transformation/transformation.py:
  Added image detection and extraction logic
- tests/test_litellm/responses/litellm_completion_transformation/test_image_generation_output.py:
  Added comprehensive unit tests (16 tests, all passing)

## Result
/responses endpoint now correctly returns:
```json
{
  "output": [{
    "type": "image_generation_call",
    "id": "..._img_0",
    "status": "completed",
    "result": "iVBORw0KGgo..."  // Pure base64, no data: prefix
  }]
}
```

This matches the OpenAI Responses API specification, where image generation
outputs have type "image_generation_call" with base64 data in the "result" field.
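The data-URL-to-base64 step described above can be sketched as follows. This is an illustrative reimplementation, not LiteLLM's actual `_extract_base64_from_data_url()` code:

```python
def extract_base64_from_data_url(data_url: str) -> str:
    """Strip a 'data:<mime>;base64,' prefix, returning pure base64.

    Illustrative sketch of the helper the commit describes; the real
    implementation lives in transformation.py and may differ.
    """
    if data_url.startswith("data:") and "," in data_url:
        # Everything after the first comma is the base64 payload.
        return data_url.split(",", 1)[1]
    # Already bare base64 -- return unchanged.
    return data_url

print(extract_base64_from_data_url("data:image/png;base64,iVBORw0KGgo"))
```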

* docs(responses): Add image generation documentation and tests

- Add comprehensive image generation documentation to response_api.md
  - Include examples for Gemini (no tools param) and OpenAI (with tools param)
  - Document response format and base64 handling
  - Add supported models table with provider-specific requirements

- Add unit tests for image generation output transformation
  - Test base64 extraction from data URLs
  - Test image generation output item creation
  - Test status mapping and integration scenarios
  - Verify proper transformation from completions to responses format

Related to #16227

* fix(responses): Correct status type for image generation output

- Add _map_finish_reason_to_image_generation_status() helper function
- Fix MyPy type error: OutputImageGenerationCall.status only accepts
  ['in_progress', 'completed', 'incomplete', 'failed'], not the full
  ResponsesAPIStatus union which includes 'cancelled' and 'queued'

Fixes MyPy error in transformation.py:838
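The status-narrowing fix can be sketched like this. The mapping choices below are assumptions for illustration; the real helper is `_map_finish_reason_to_image_generation_status()`:

```python
from typing import Literal

# OutputImageGenerationCall.status accepts only these four values, a
# narrower set than the full ResponsesAPIStatus union.
ImageGenerationStatus = Literal["in_progress", "completed", "incomplete", "failed"]


def map_finish_reason_to_image_generation_status(
    finish_reason: str,
) -> ImageGenerationStatus:
    if finish_reason == "stop":
        return "completed"
    if finish_reason in ("length", "content_filter"):
        return "incomplete"
    # 'cancelled' and 'queued' are valid ResponsesAPIStatus values but not
    # valid image-generation statuses, so everything else maps to "failed".
    return "failed"
```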
2025-12-05 15:56:26 -08:00
Cesar Garcia
829b06f53f Fix: Gemini image_tokens incorrectly treated as text tokens in cost calculation (#17554)
When Gemini image generation models return `text_tokens=0` with `image_tokens > 0`,
the cost calculator was assuming no token breakdown existed and treating all
completion tokens as text tokens, resulting in ~10x underestimation of costs.

Changes:
- Fix cost calculation logic to respect token breakdown when image/audio/reasoning
  tokens are present, even if text_tokens=0
- Add `output_cost_per_image_token` pricing for gemini-3-pro-image-preview models
- Add test case reproducing the issue
- Add documentation explaining image token pricing

Fixes #17410
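The fix can be illustrated with a minimal sketch. Function name, usage shape, and prices are invented for the example; this is not the actual litellm cost-calculator code:

```python
def image_completion_cost(usage: dict, text_price: float, image_price: float) -> float:
    """Bill completion tokens, respecting a token breakdown when one exists."""
    text_tokens = usage.get("text_tokens", 0)
    image_tokens = usage.get("image_tokens", 0)
    # The fix: a breakdown exists if *any* component is non-zero, not only
    # when text_tokens > 0.
    if text_tokens or image_tokens:
        return text_tokens * text_price + image_tokens * image_price
    # No breakdown reported -- fall back to billing everything as text.
    return usage.get("completion_tokens", 0) * text_price
```

With `text_tokens=0` and `image_tokens=1000`, the old logic took the fallback branch and billed 1000 tokens at the much cheaper text rate; the sketch above bills them at the image rate.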
2025-12-05 15:55:38 -08:00
Yuichiro Utsumi
d18e489872 fix(docs): remove source .env (#17466)
Remove `source .env` since `docker compose` automatically loads
the `.env` file.

Signed-off-by: utsumi.yuichiro <utsumi.yuichiro@fujitsu.com>
2025-12-05 15:53:05 -08:00
Ishaan Jaff
f02df3035a [Feat] Allow using dynamic rate limit/priority reservation on teams (#17061)
* use helper to get key/team priority

* test_team_metadata_priority

* docs team priority
2025-12-05 15:42:27 -08:00
Sameer Kankute
43914796d6 fix failing vertex tests 2025-12-06 00:04:04 +05:30
Krrish Dholakia
c272741d7f docs: fix strings 2025-12-05 09:37:22 -08:00
Krrish Dholakia
c1cbe6ed56 docs: document tool calls spec 2025-12-05 09:37:22 -08:00
Sameer Kankute
558c8f92d1 Merge pull request #17519 from BerriAI/litellm_cursor_integration
Add support for cursor BYOK with its own configuration
2025-12-05 22:23:45 +05:30
Alexsander Hamir
0c017f376c fix: code quality issues from ruff linter (#17536)
* fix: resolve code quality issues from ruff linter

- Fix duplicate imports in anthropic guardrail handler
  - Remove duplicate AllAnthropicToolsValues import
  - Remove duplicate ChatCompletionToolParam import

- Remove unused variable 'tools' in guardrail handler

- Replace print statement with proper logging in json_loader
  - Use verbose_logger.warning() instead of print()

- Remove unused imports
  - Remove _update_metadata_field from team_endpoints
  - Remove unused ChatCompletionToolCallChunk imports from transformation

- Refactor update_team function to reduce complexity (PLR0915)
  - Extract budget_duration handling into _set_budget_reset_at() helper
  - Minimal refactoring to reduce function from 51 to 50 statements

All ruff linter errors resolved. Fixes F811, F841, T201, F401, and PLR0915 errors.

* docs: add missing environment variables to documentation

Add 8 missing environment variables to the environment variables reference section:
- AIOHTTP_CONNECTOR_LIMIT_PER_HOST: Connection limit per host for aiohttp connector
- AUDIO_SPEECH_CHUNK_SIZE: Chunk size for audio speech processing
- CYBERARK_SSL_VERIFY: Flag to enable/disable SSL certificate verification for CyberArk
- LITELLM_DD_AGENT_HOST: Hostname or IP of DataDog agent for LiteLLM-specific logging
- LITELLM_DD_AGENT_PORT: Port of DataDog agent for LiteLLM-specific log intake
- WANDB_API_KEY: API key for Weights & Biases (W&B) logging integration
- WANDB_HOST: Host URL for Weights & Biases (W&B) service
- WANDB_PROJECT_ID: Project ID for Weights & Biases (W&B) logging integration

Fixes test_env_keys.py test that was failing due to undocumented environment variables.
2025-12-05 08:40:49 -08:00
Sameer Kankute
c8fbcc7f1c add tutorial as well 2025-12-05 12:32:23 +05:30
Sameer Kankute
acc0b5fe27 Merge pull request #17362 from BerriAI/litellm_vertex-bge-cherrypick
[Feat] VertexAI - Add BGE Embeddings support
2025-12-05 11:53:42 +05:30
Krish Dholakia
b3a3081e8e Guardrails API - new structured_messages param (#17518)
* fix(generic_guardrail_api.py): add 'structured_messages' support

allows guardrail provider to know if text is from system or user

* fix(generic_guardrail_api.md): document 'structured_messages' parameter

give api provider a way to distinguish between user and system messages

* feat(anthropic/): return openai chat completion format structured messages when calls made via `/v1/messages` on Anthropic

* feat(responses/guardrail_translation): support 'structured_messages' param for guardrails

structured openai chat completion spec messages, for guardrail checks when using /v1/responses api

allows guardrail checks to work consistently across APIs
2025-12-04 22:08:00 -08:00
Krish Dholakia
8776336c3c Enable detailed debugging for reference (#17508)
* Deprecate set_verbose in favor of LITELLM_LOG

Co-authored-by: krrishdholakia <krrishdholakia@gmail.com>

* Update debugging documentation links

Co-authored-by: krrishdholakia <krrishdholakia@gmail.com>

---------

Co-authored-by: Cursor Agent <cursoragent@cursor.com>
2025-12-04 21:51:56 -08:00
Sameer Kankute
392e5059b0 Add steps to add litellm proxy in cursor 2025-12-05 10:02:42 +05:30
Sameer Kankute
01ee46b493 Add steps to add litellm proxy in cursor 2025-12-05 10:01:48 +05:30
Sameer Kankute
4d83a48b59 Add steps to add litellm proxy in cursor 2025-12-05 09:39:58 +05:30
Sameer Kankute
a6006e698c Add support for cursor BYOK with its own configuration 2025-12-05 09:34:49 +05:30
Ishaan Jaffer
4f3b843efe docs openai 2025-12-04 18:32:23 -08:00
Ishaan Jaff
b2e8d3fd42 [Feat] Allow adding OpenAI compatible chat providers using .json + add public ai provider (#17448)
* feat: Add JSON config for OpenAI-compatible providers

Co-authored-by: ishaan <ishaan@berri.ai>

* feat: Add simple JSON config for OpenAI-compatible providers

Co-authored-by: ishaan <ishaan@berri.ai>

* feat: Implement JSON-based provider config and migrate PublicAI

Co-authored-by: ishaan <ishaan@berri.ai>
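As a rough illustration of the idea (every field name below is a guess for illustration, not LiteLLM's actual JSON schema), a provider definition might be loaded like this:

```python
import json

# Hypothetical JSON definition for an OpenAI-compatible provider;
# field names here are assumptions, not the real config format.
provider_json = """
{
  "provider": "publicai",
  "api_base": "https://api.publicai.example/v1",
  "api_key_env": "PUBLICAI_API_KEY",
  "models": ["publicai/chat-small"]
}
"""

config = json.loads(provider_json)
print(config["provider"], config["api_base"])
```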

* Checkpoint before follow-up message

Co-authored-by: ishaan <ishaan@berri.ai>

* docs fix

* undo change

---------

Co-authored-by: Cursor Agent <cursoragent@cursor.com>
Co-authored-by: ishaan <ishaan@berri.ai>
2025-12-04 17:59:25 -08:00
Ishaan Jaff
fadfbb13d3 [Docs] A2a - Permission management (#17515)
* docs add a2a gateway + mcp gateway

* docs a2a permissions

* docs a2a permission

* docs

* docs a2a

* docs a2a

* add new img

* docs agent permissions
2025-12-04 17:29:47 -08:00
Ishaan Jaff
575e769bff [Feat] UI - Agent Gateway - set allowed agents by key, team (#17511)
* init schema.prisma

* init LiteLLM_ObjectPermissionTable with agents and agent_access_groups

* TestAgentRequestHandler

* refactor agent list

* add AgentRequestHandler

* fix agent access controls by key/team

* feat - new migration for LiteLLM_AgentsTable

* fix add LiteLLM_ObjectPermissionBase with agent and agent groups

* add agent routes to llm api routes

* add agent routes as llm route

* add AgentPermissionsProps

* add agents on team/key create

* add agent selector on team/key

* add agent selector on key edit /info

* add AgentPermissions

* docs list + invoke agents
2025-12-04 16:31:17 -08:00
Raghav Jhavar
72eb4c3a1c 🆕 feat: support routing to only websearch supported deployments (#17500)
* support routing to only websearch supported deployments

* add docs
2025-12-04 14:18:20 -08:00
Krrish Dholakia
5aeba81538 docs(multi_tenant_architecture.md): add new architecture doc 2025-12-04 11:13:50 -08:00
Sameer Kankute
f2c0029939 Merge pull request #17470 from BerriAI/litellm_batches_bedrock_content
Add support for file content download for bedrock batches
2025-12-04 21:57:04 +05:30
Sameer Kankute
5b4542304d Merge pull request #17461 from BerriAI/litellm_qwen2_imported_model_support
Add support for bedrock qwen 2 imported model
2025-12-04 21:56:22 +05:30
Sameer Kankute
edd392b50d Add support for file content download for bedrock batches 2025-12-04 13:27:53 +05:30
Krish Dholakia
dc7c2b9b05 Update docs to link agent hub (#17462)
* Docs: Add AI Hub agent registry documentation

Co-authored-by: krrishdholakia <krrishdholakia@gmail.com>

* Fix: Update AI Hub link in A2A documentation

Co-authored-by: krrishdholakia <krrishdholakia@gmail.com>

---------

Co-authored-by: Cursor Agent <cursoragent@cursor.com>
2025-12-03 21:59:45 -08:00
Sameer Kankute
4710e772be Add support for bedrock qwen 2 imported model 2025-12-04 11:08:57 +05:30
codgician
adfbb1c308 docs: document responses and embedding api for github copilot (#17456) 2025-12-03 21:22:08 -08:00
Krish Dholakia
32013f63a0 Guardrail API - support tool call checks on OpenAI /chat/completions, OpenAI /responses, Anthropic /v1/messages (#17459)
* fix(unified_guardrail.py): correctly map a v1/messages call to the anthropic unified guardrail

* fix: add more rigorous call type checks

* fix(anthropic_endpoints/endpoints.py): initialize logging object at the beginning of endpoint

ensures call id + trace id are emitted to guardrail api

* feat(anthropic/chat/guardrail_translation): support streaming guardrails

sample on every 5 chunks
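The "sample on every 5 chunks" idea can be sketched like so; the function and constant names are illustrative, not LiteLLM's API:

```python
SAMPLE_EVERY_N_CHUNKS = 5


def chunks_to_check(chunks):
    """Yield accumulated stream text at every 5th chunk, and at stream end.

    Running the guardrail on these samples, rather than on every chunk,
    keeps per-chunk overhead low while still checking the full output.
    """
    buffer = []
    for i, chunk in enumerate(chunks, start=1):
        buffer.append(chunk)
        if i % SAMPLE_EVERY_N_CHUNKS == 0:
            yield "".join(buffer)
    # Flush the tail if the stream did not end on a sampling boundary.
    if len(buffer) % SAMPLE_EVERY_N_CHUNKS != 0:
        yield "".join(buffer)

print(list(chunks_to_check(["a", "b", "c", "d", "e", "f"])))
```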

* fix(openai/chat/guardrail_translation): support openai streaming guardrails

* fix: initial commit fixing output guardrails for responses api

* feat(openai/responses/guardrail_translation): handler.py - fix output checks on responses api

* fix(openai/responses/guardrail_translation/handler.py): ensure responses api guardrails work on streaming

* test: update tests

* test: update tests

* fix: support multiple kinds of input to the guardrail api

* feat(guardrail_translation/handler.py): support extracting tool calls from openai chat completions for guardrail api's

* feat(generic_guardrail_api.py): support extracting + returning modified tool calls on generic_guardrails_api

allows guardrail api to analyze tool call being sent to provider - to run any analysis on it

* fix(guardrails.py): support anthropic /v1/messages tool calls

* feat(responses_api/): extract tool calls for guardrail processing

* docs(generic_guardrail_api.md): document tools param support

* docs: generic_guardrail_api.md

improve documentation
2025-12-03 21:20:39 -08:00
Ishaan Jaff
e4f954b354 [Docs] Agent Gateway (#17454)
* init litellm A2a client

* simpler a2a client interface

* test a2a

* move a2a invoking tests

* test fix

* ensure a2a send message is tracked in logs

* rename tags

* add streaming handling

* add a2a invocation

* add a2a invocation in cost calc

* test_a2a_logging_payload

* update invoke_agent_a2a

* test_invoke_agent_a2a_adds_litellm_data

* add A2a agent

* fix endpoints on A2a

* UI allow testing a2a endpoints

* add agent imgs

* add a2a as an endpoint

* add a2a

* docs a2a invoke

* docs a2a

* docs A2a invoke
2025-12-03 18:57:41 -08:00
Ishaan Jaff
f035984dd7 fix: cyberark allow setting ssl verify to false (#17433) 2025-12-03 18:54:31 -08:00
yuneng-jiang
37c598441f Change is_sso_configured to auto_redirect_to_sso 2025-12-03 15:48:50 -08:00
Ishaan Jaffer
9b3d8302cf docs fix stable 2025-12-03 14:12:50 -08:00
Cesar Garcia
5e791464af docs: add Microsoft Agent Lightning to projects (#17422)
Add Agent Lightning, Microsoft's open-source framework for training
AI agents with RL, APO, and SFT. Uses LiteLLM Proxy for LLM routing
and trace collection.
2025-12-03 09:07:02 -08:00
Krrish Dholakia
be5dd234bf docs: fix list 2025-12-03 08:01:26 -08:00
Sameer Kankute
8eaabb4ad7 Add vector store support for ragflow 2025-12-03 15:29:47 +05:30
Sameer Kankute
52090c3f3e Merge pull request #17350 from BerriAI/litellm_rag_chat_completion_api
Add ragflow support for chat completions API
2025-12-03 13:29:32 +05:30
Cesar Garcia
86350fe6d7 docs: add Google ADK and Harbor to projects (#17352)
Both frameworks integrate with LiteLLM:
- Google ADK uses LiteLLM for model-agnostic agent building
- Harbor uses LiteLLM for agent evaluation across providers
2025-12-02 22:27:04 -08:00
Cesar Garcia
4c6604b0da Cleanup: Remove orphan docs pages and Docusaurus template files (#17356)
* docs: update getting started page

- Add Core Functions table with link to full list
- Add Responses API section
- Add Async section with acompletion() example
- Add "Switch Providers with One Line" example
- Clarify Basic Usage supports multiple endpoints
- Update models to current versions (openai/gpt-4o, anthropic/claude-sonnet-4)
- Use provider/model format throughout
- Fix deprecated import: from openai.error -> from openai
- Keep original structure: community key, More details links, observability env vars

* Cleanup: Remove orphan docs pages and Docusaurus template files

- Remove orphan getting_started.md (not linked in sidebar)
- Remove Docusaurus template intro.md
- Remove tutorial-basics/ directory (Docusaurus template)
- Remove tutorial-extras/ directory (Docusaurus template)
2025-12-02 22:25:26 -08:00
Ali Saleh
6b5ad5d5a6 docs: Update Instructions For Phoenix Integration (#17373) 2025-12-02 22:03:54 -08:00
Sameer Kankute
a0819d6df0 Merge branch 'main' into litellm_vertex-bge-cherrypick 2025-12-03 08:37:04 +05:30
Ishaan Jaff
427074ac6e Fix: Datadog callback regression when ddtrace is installed (#17393)
* fix DD agent host logging

* docs fix

* test_datadog_agent_configuration

* test_datadog_ignores_ddtrace_agent_host
2025-12-02 17:27:50 -08:00
Ishaan Jaff
6c188c5ae2 [Feat] New model/provider - Adds support for Google Cloud Chirp3 HD on /speech (#17391)
* docs vertex tts

* place vertex ai types in file

* use VertexAITextToSpeechConfig

* use vertex_voice_dict

* refactor docs

* docs vertex ai chirp

* TestVertexAITextToSpeechConfig

* new provider vertex ai chirp3

* test_litellm_speech_vertex_ai_chirp

* add vertex_ai/chirp cost tracking
2025-12-02 15:36:23 -08:00
Ishaan Jaff
db6c6eea89 [Docs] Add guide on how to debug gateway error vs provider error (#17387)
* add error diagnosis

* docs error diagnosis
2025-12-02 14:10:00 -08:00
Cesar Garcia
81f4d863ca docs: add Azure AI Foundry documentation for Claude models (#17104)
* docs: add Azure AI Foundry documentation for Claude models

Add documentation explaining how to use Claude models (Sonnet 4.5,
Haiku 4.5, Opus 4.1) deployed on Azure AI Foundry with LiteLLM.

Azure exposes Claude using Anthropic's native API, so users can use
the existing anthropic/ provider with their Azure endpoint.

Closes #17066

* docs: Add alternative method for Azure AI Foundry using anthropic/ provider

Document that users can use anthropic/ provider with Azure endpoint
as an alternative to the dedicated azure_ai/ provider.
2025-12-02 09:08:10 -08:00
Sameer Kankute
4ac9e4c81c Merge pull request #17345 from BerriAI/litellm_fix_jwt_auth_route_issue
Add other routes in jwt auth
2025-12-02 22:21:04 +05:30
Ishaan Jaffer
fcc108b554 docs fix 2025-12-02 21:59:02 +05:30
Ishaan Jaffer
a79002c1fe docs 2025-12-02 21:59:02 +05:30
Ishaan Jaffer
b7fe25c97d docs vertex BGE 2025-12-02 21:59:02 +05:30