diff --git a/docs/my-website/docs/completion/usage.md b/docs/my-website/docs/completion/usage.md index 2a9eab941e..c388e5bfee 100644 --- a/docs/my-website/docs/completion/usage.md +++ b/docs/my-website/docs/completion/usage.md @@ -26,6 +26,7 @@ response = completion( print(response.usage) ``` +> **Note:** LiteLLM supports endpoint bridging—if a model does not natively support a requested endpoint, LiteLLM will automatically route the call to the correct supported endpoint (such as bridging `/chat/completions` to `/responses` or vice versa) based on the model's `mode`set in `model_prices_and_context_window`. ## Streaming Usage diff --git a/docs/my-website/docs/enterprise.md b/docs/my-website/docs/enterprise.md index 9101d8e375..cc3466fc10 100644 --- a/docs/my-website/docs/enterprise.md +++ b/docs/my-website/docs/enterprise.md @@ -1,6 +1,11 @@ import Image from '@theme/IdealImage'; # Enterprise + +:::info +✨ SSO is free for up to 5 users. After that, an enterprise license is required. [Get Started with Enterprise here](https://www.litellm.ai/enterprise) +::: + For companies that need SSO, user management and professional support for LiteLLM Proxy :::info diff --git a/docs/my-website/docs/fine_tuning.md b/docs/my-website/docs/fine_tuning.md index f9a9297e06..f3f955cb01 100644 --- a/docs/my-website/docs/fine_tuning.md +++ b/docs/my-website/docs/fine_tuning.md @@ -13,6 +13,8 @@ This is an Enterprise only endpoint [Get Started with Enterprise here](https://c | Feature | Supported | Notes | |-------|-------|-------| | Supported Providers | OpenAI, Azure OpenAI, Vertex AI | - | + +#### ⚡️See an exhaustive list of supported models and providers at [models.litellm.ai](https://models.litellm.ai/) | Cost Tracking | 🟡 | [Let us know if you need this](https://github.com/BerriAI/litellm/issues) | | Logging | ✅ | Works across all logging integrations | diff --git a/docs/my-website/docs/getting_started.md b/docs/my-website/docs/getting_started.md index 15ee00a727..6b2c1fd531 100644 --- a/docs/my-website/docs/getting_started.md +++ b/docs/my-website/docs/getting_started.md @@ -32,7 +32,8 @@ Next Steps 👉 [Call all supported models - e.g. 
Claude-2, Llama2-70b, etc.](./ More details 👉 - [Completion() function details](./completion/) -- [All supported models / providers on LiteLLM](./providers/) +- [Overview of supported models / providers on LiteLLM](./providers/) +- [Search all models / providers](https://models.litellm.ai/) - [Build your own OpenAI proxy](https://github.com/BerriAI/liteLLM-proxy/tree/main) ## streaming diff --git a/docs/my-website/docs/image_edits.md b/docs/my-website/docs/image_edits.md index 246e1c70f0..84dddd5e4a 100644 --- a/docs/my-website/docs/image_edits.md +++ b/docs/my-website/docs/image_edits.md @@ -18,6 +18,9 @@ LiteLLM provides image editing functionality that maps to OpenAI's `/images/edit | Supported LiteLLM Proxy Versions | 1.71.1+ | | | Supported LLM providers | **OpenAI** | Currently only `openai` is supported | + #### ⚡️See all supported models and providers at [models.litellm.ai](https://models.litellm.ai/) + + ## Usage ### LiteLLM Python SDK diff --git a/docs/my-website/docs/image_generation.md b/docs/my-website/docs/image_generation.md index 7e7ff9922d..8cd5803aa6 100644 --- a/docs/my-website/docs/image_generation.md +++ b/docs/my-website/docs/image_generation.md @@ -279,6 +279,8 @@ print(f"response: {response}") ## Supported Providers +#### ⚡️See all supported models and providers at [models.litellm.ai](https://models.litellm.ai/) + | Provider | Documentation Link | |----------|-------------------| | OpenAI | [OpenAI Image Generation →](./providers/openai) | diff --git a/docs/my-website/docs/index.md b/docs/my-website/docs/index.md index 3f5e1b479c..11d2963b7a 100644 --- a/docs/my-website/docs/index.md +++ b/docs/my-website/docs/index.md @@ -524,6 +524,15 @@ try: except OpenAIError as e: print(e) ``` +### See How LiteLLM Transforms Your Requests + +Want to understand how LiteLLM parses and normalizes your LLM API requests? Use the `/utils/transform_request` endpoint to see exactly how your request is transformed internally. + +You can try it out now directly on our Demo App! +Go to the [LiteLLM API docs for transform_request](https://litellm-api.up.railway.app/#/llm%20utils/transform_request_utils_transform_request_post) + +LiteLLM will show you the normalized, provider-agnostic version of your request. This is useful for debugging, learning, and understanding how LiteLLM handles different providers and options. + ### Logging Observability - Log LLM Input/Output ([Docs](https://docs.litellm.ai/docs/observability/callbacks)) LiteLLM exposes pre defined callbacks to send data to Lunary, MLflow, Langfuse, Helicone, Promptlayer, Traceloop, Slack diff --git a/docs/my-website/docs/moderation.md b/docs/my-website/docs/moderation.md index 95fe8b2856..f9c2810bc8 100644 --- a/docs/my-website/docs/moderation.md +++ b/docs/my-website/docs/moderation.md @@ -130,6 +130,8 @@ Here's the exact json output and type you can expect from all moderation calls: ## **Supported Providers** +#### ⚡️See all supported models and providers at [models.litellm.ai](https://models.litellm.ai/) + | Provider | |-------------| | OpenAI | diff --git a/docs/my-website/docs/observability/callbacks.md b/docs/my-website/docs/observability/callbacks.md index 040d83697d..b752bdc276 100644 --- a/docs/my-website/docs/observability/callbacks.md +++ b/docs/my-website/docs/observability/callbacks.md @@ -5,13 +5,15 @@ liteLLM provides `input_callbacks`, `success_callbacks` and `failure_callbacks`, making it easy for you to send data to a particular provider depending on the status of your responses. 
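For example, a minimal SDK setup might look like the sketch below (the destinations shown are placeholders — swap in the integrations you actually use and set their API keys as environment variables; see the Quick Start below for a fuller example):

```python
import os
import litellm
from litellm import completion

# illustrative destinations - replace with the integrations you use
litellm.input_callback = ["sentry"]      # runs on input
litellm.success_callback = ["langfuse"]  # runs on successful responses
litellm.failure_callback = ["sentry"]    # runs on failed responses

os.environ["OPENAI_API_KEY"] = "sk-..."  # plus any keys your callback providers need

response = completion(
    model="gpt-3.5-turbo",
    messages=[{"role": "user", "content": "Hi 👋"}],
)
```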
:::tip -**New to LiteLLM Callbacks?** Check out our comprehensive [Callback Management Guide](./callback_management.md) to understand when to use different callback hooks like `async_log_success_event` vs `async_post_call_success_hook`. +**New to LiteLLM Callbacks?** + +- For proxy/server logging and observability, see the [Proxy Logging Guide](https://docs.litellm.ai/docs/proxy/logging). +- To write your own callback logic, see the [Custom Callbacks Guide](https://docs.litellm.ai/docs/observability/custom_callback). ::: -liteLLM supports: -- [Custom Callback Functions](https://docs.litellm.ai/docs/observability/custom_callback) -- [Callback Management Guide](./callback_management.md) - **Comprehensive guide for choosing the right hooks** +### Supported Callback Integrations + - [Lunary](https://lunary.ai/docs) - [Langfuse](https://langfuse.com/docs) - [LangSmith](https://www.langchain.com/langsmith) @@ -21,9 +23,20 @@ liteLLM supports: - [Sentry](https://docs.sentry.io/platforms/python/) - [PostHog](https://posthog.com/docs/libraries/python) - [Slack](https://slack.dev/bolt-python/concepts) +- [Arize](https://docs.arize.com/) +- [PromptLayer](https://docs.promptlayer.com/) This is **not** an extensive list. Please check the dropdown for all logging integrations. +### Related Cookbooks +Try out our cookbooks for code snippets and interactive demos: + +- [Langfuse Callback Example (Colab)](https://colab.research.google.com/github/BerriAI/litellm/blob/main/cookbook/logging_observability/LiteLLM_Langfuse.ipynb) +- [Lunary Callback Example (Colab)](https://colab.research.google.com/github/BerriAI/litellm/blob/main/cookbook/logging_observability/LiteLLM_Lunary.ipynb) +- [Arize Callback Example (Colab)](https://colab.research.google.com/github/BerriAI/litellm/blob/main/cookbook/logging_observability/LiteLLM_Arize.ipynb) +- [Proxy + Langfuse Callback Example (Colab)](https://colab.research.google.com/github/BerriAI/litellm/blob/main/cookbook/logging_observability/LiteLLM_Proxy_Langfuse.ipynb) +- [PromptLayer Callback Example (Colab)](https://colab.research.google.com/github/BerriAI/litellm/blob/main/cookbook/LiteLLM_PromptLayer.ipynb) + ### Quick Start ```python diff --git a/docs/my-website/docs/observability/custom_callback.md b/docs/my-website/docs/observability/custom_callback.md index c206c23d0f..cfe97ca42c 100644 --- a/docs/my-website/docs/observability/custom_callback.md +++ b/docs/my-website/docs/observability/custom_callback.md @@ -67,6 +67,23 @@ asyncio.run(completion()) - `async_post_call_success_hook` - Access user data + modify responses - `async_pre_call_hook` - Modify requests before sending +### Example: Modifying the Response in async_post_call_success_hook + +You can use `async_post_call_success_hook` to add custom headers or metadata to the response before it is returned to the client. For example: + +```python +async def async_post_call_success_hook(data, user_api_key_dict, response): + # Add a custom header to the response + additional_headers = getattr(response, "_hidden_params", {}).get("additional_headers", {}) or {} + additional_headers["x-litellm-custom-header"] = "my-value" + if not hasattr(response, "_hidden_params"): + response._hidden_params = {} + response._hidden_params["additional_headers"] = additional_headers + return response +``` + +This allows you to inject custom metadata or headers into the response for downstream consumers. You can use this pattern to pass information to clients, proxies, or observability tools. 
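When you run the LiteLLM proxy, this hook usually lives on a `CustomLogger` subclass that your proxy config points at. A minimal sketch — the file name `custom_callbacks.py` and the config snippet in the comment are illustrative, adjust them to your setup:

```python
# custom_callbacks.py
# Reference it from your proxy config, e.g.:
#   litellm_settings:
#     callbacks: custom_callbacks.proxy_handler_instance
from litellm.integrations.custom_logger import CustomLogger


class MyCustomHandler(CustomLogger):
    async def async_post_call_success_hook(self, data, user_api_key_dict, response):
        # same pattern as above: attach a custom header before the response is returned
        additional_headers = getattr(response, "_hidden_params", {}).get("additional_headers", {}) or {}
        additional_headers["x-litellm-custom-header"] = "my-value"
        if not hasattr(response, "_hidden_params"):
            response._hidden_params = {}
        response._hidden_params["additional_headers"] = additional_headers
        return response


proxy_handler_instance = MyCustomHandler()
```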
+ ## Callback Functions If you just want to log on a specific event (e.g. on input) - you can use callback functions. diff --git a/docs/my-website/docs/providers/bedrock.md b/docs/my-website/docs/providers/bedrock.md index 86e9ac5e3e..fe99609914 100644 --- a/docs/my-website/docs/providers/bedrock.md +++ b/docs/my-website/docs/providers/bedrock.md @@ -2340,6 +2340,39 @@ response = completion( Make the bedrock completion call +--- + +### Required AWS IAM Policy for AssumeRole + +To use `aws_role_name` (STS AssumeRole) with LiteLLM, your IAM user or role **must** have permission to call `sts:AssumeRole` on the target role. If you see an error like: + +``` +An error occurred (AccessDenied) when calling the AssumeRole operation: User: arn:aws:sts::...:assumed-role/litellm-ecs-task-role/... is not authorized to perform: sts:AssumeRole on resource: arn:aws:iam::...:role/Enterprise/BedrockCrossAccountConsumer +``` + +This means the IAM identity running LiteLLM does **not** have permission to assume the target role. You must update your IAM policy to allow this action. + +#### Example IAM Policy + +Replace `` with the ARN of the role you want to assume (e.g., `arn:aws:iam::123456789012:role/Enterprise/BedrockCrossAccountConsumer`). + +```json +{ + "Version": "2012-10-17", + "Statement": [ + { + "Effect": "Allow", + "Action": "sts:AssumeRole", + "Resource": "" + } + ] +} +``` + +**Note:** The target role itself must also trust the calling IAM identity (via its trust policy) for AssumeRole to succeed. See [AWS AssumeRole docs](https://docs.aws.amazon.com/IAM/latest/UserGuide/id_roles_use_switch-role-api.html) for more details. + +--- + diff --git a/docs/my-website/docs/providers/vertex.md b/docs/my-website/docs/providers/vertex.md index 3f4d106895..78b6abd84a 100644 --- a/docs/my-website/docs/providers/vertex.md +++ b/docs/my-website/docs/providers/vertex.md @@ -196,6 +196,19 @@ model_list: vertex_location: "us-central1" vertex_credentials: "/path/to/service_account.json" # [OPTIONAL] Do this OR `!gcloud auth application-default login` - run this to add vertex credentials to your env ``` +or +```yaml +model_list: + - model_name: gemini-pro + litellm_params: + model: vertex_ai/gemini-1.5-pro + litellm_credential_name: vertex-global + vertex_project: project-name-here + vertex_location: global + base_model: gemini + model_info: + provider: Vertex +``` 2. Start Proxy diff --git a/docs/my-website/docs/proxy/caching.md b/docs/my-website/docs/proxy/caching.md index 1fb7385f68..617609cf08 100644 --- a/docs/my-website/docs/proxy/caching.md +++ b/docs/my-website/docs/proxy/caching.md @@ -958,6 +958,19 @@ curl http://localhost:4000/v1/chat/completions \ + +## Redis max_connections + +You can set the `max_connections` parameter in your `cache_params` for Redis. This is passed directly to the Redis client and controls the maximum number of simultaneous connections in the pool. 
If you see errors like `No connection available`, try increasing this value: + +```yaml +litellm_settings: + cache: true + cache_params: + type: redis + max_connections: 100 +``` + ## Supported `cache_params` on proxy config.yaml ```yaml @@ -966,6 +979,7 @@ cache_params: ttl: Optional[float] default_in_memory_ttl: Optional[float] default_in_redis_ttl: Optional[float] + max_connections: Optional[Int] # Type of cache (options: "local", "redis", "s3") type: s3 diff --git a/docs/my-website/docs/proxy/config_settings.md b/docs/my-website/docs/proxy/config_settings.md index 974e95a07b..f70701886b 100644 --- a/docs/my-website/docs/proxy/config_settings.md +++ b/docs/my-website/docs/proxy/config_settings.md @@ -50,6 +50,7 @@ litellm_settings: port: 6379 # The port number for the Redis cache. Required if type is "redis". password: "your_password" # The password for the Redis cache. Required if type is "redis". namespace: "litellm.caching.caching" # namespace for redis cache + max_connections: 100 # [OPTIONAL] Set Maximum number of Redis connections. Passed directly to redis-py. # Optional - Redis Cluster Settings redis_startup_nodes: [{"host": "127.0.0.1", "port": "7001"}] diff --git a/docs/my-website/docs/proxy/custom_sso.md b/docs/my-website/docs/proxy/custom_sso.md index 8e869a1139..bbd7f41bee 100644 --- a/docs/my-website/docs/proxy/custom_sso.md +++ b/docs/my-website/docs/proxy/custom_sso.md @@ -1,9 +1,7 @@ # ✨ Event Hooks for SSO Login :::info - -✨ This is an Enterprise only feature [Get Started with Enterprise here](https://www.litellm.ai/enterprise) - +✨ SSO is free for up to 5 users. After that, an enterprise license is required. [Get Started with Enterprise here](https://www.litellm.ai/enterprise) ::: ## Overview diff --git a/docs/my-website/docs/proxy/db_deadlocks.md b/docs/my-website/docs/proxy/db_deadlocks.md index 0eee928fa6..ef9d31d623 100644 --- a/docs/my-website/docs/proxy/db_deadlocks.md +++ b/docs/my-website/docs/proxy/db_deadlocks.md @@ -84,3 +84,29 @@ LiteLLM emits the following prometheus metrics to monitor the health/status of t | `litellm_in_memory_spend_update_queue_size` | In-memory aggregate spend values for keys, users, teams, team members, etc.| In-Memory | | `litellm_redis_spend_update_queue_size` | Redis aggregate spend values for keys, users, teams, etc. | Redis | + +## Troubleshooting: Redis Connection Errors + +You may see errors like: + +``` +LiteLLM Redis Caching: async async_increment() - Got exception from REDIS No connection available., Writing value=21 +LiteLLM Redis Caching: async set_cache_pipeline() - Got exception from REDIS No connection available., Writing value=None +``` + +This means all available Redis connections are in use, and LiteLLM cannot obtain a new connection from the pool. This can happen under high load or with many concurrent proxy requests. + +**Solution:** + +- Increase the `max_connections` parameter in your Redis config section in `proxy_config.yaml` to allow more simultaneous connections. For example: + +```yaml +litellm_settings: + cache: True + cache_params: + type: redis + max_connections: 100 # Increase as needed for your traffic +``` + +Adjust this value based on your expected concurrency and Redis server capacity. 
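For context, `max_connections` is the standard redis-py connection-pool limit. A rough client-side sketch of what this setting controls (illustrative only — not the proxy's exact internal wiring):

```python
import redis

# The pool hands out at most `max_connections` connections; once it is exhausted,
# redis-py surfaces errors like the "No connection available" message shown above.
pool = redis.ConnectionPool(host="localhost", port=6379, max_connections=100)
client = redis.Redis(connection_pool=pool)
client.ping()  # each concurrent command needs a free connection from the pool
```

Raising `max_connections` trades more open sockets for fewer pool-exhaustion errors, so size it against your Redis server's `maxclients` limit as well.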
+ diff --git a/docs/my-website/docs/proxy/guardrails/bedrock.md b/docs/my-website/docs/proxy/guardrails/bedrock.md index 6725acf1f2..4a1a0a246f 100644 --- a/docs/my-website/docs/proxy/guardrails/bedrock.md +++ b/docs/my-website/docs/proxy/guardrails/bedrock.md @@ -4,6 +4,10 @@ import TabItem from '@theme/TabItem'; # Bedrock Guardrails +:::tip ⚡️ +If you haven't set up or authenticated your Bedrock provider yet, see the [Bedrock Provider Setup & Authentication Guide](../../providers/bedrock.md). +::: + LiteLLM supports Bedrock guardrails via the [Bedrock ApplyGuardrail API](https://docs.aws.amazon.com/bedrock/latest/APIReference/API_runtime_ApplyGuardrail.html). ## Quick Start diff --git a/docs/my-website/docs/proxy/load_balancing.md b/docs/my-website/docs/proxy/load_balancing.md index bcbc4e9365..54c917bbbc 100644 --- a/docs/my-website/docs/proxy/load_balancing.md +++ b/docs/my-website/docs/proxy/load_balancing.md @@ -172,6 +172,9 @@ router_settings: redis_host: redis_password: redis_port: 1992 + cache_params: + type: redis + max_connections: 100 # maximum Redis connections in the pool; tune based on expected concurrency/load ``` ## Router settings on config - routing_strategy, model_group_alias diff --git a/docs/my-website/docs/proxy/self_serve.md b/docs/my-website/docs/proxy/self_serve.md index dff55a8ac0..b54344c1d0 100644 --- a/docs/my-website/docs/proxy/self_serve.md +++ b/docs/my-website/docs/proxy/self_serve.md @@ -227,7 +227,7 @@ export PROXY_LOGOUT_URL="https://www.google.com" -### Set max budget for internal users +### Set default max budget for internal users Automatically apply budget per internal user when they sign up. By default the table will be checked every 10 minutes, for users to reset. To modify this, [see this](./users.md#reset-budgets) @@ -239,6 +239,10 @@ litellm_settings: This sets a max budget of $10 USD for internal users when they sign up. +You can also manage these settings visually in the UI: + + + This budget only applies to personal keys created by that user - seen under `Default Team` on the UI. diff --git a/docs/my-website/docs/proxy_api.md b/docs/my-website/docs/proxy_api.md index 89bfacbe19..7612645fb5 100644 --- a/docs/my-website/docs/proxy_api.md +++ b/docs/my-website/docs/proxy_api.md @@ -27,7 +27,7 @@ Email us @ krrish@berri.ai ## Supported Models for LiteLLM Key These are the models that currently work with the "sk-litellm-.." keys. 
-For a complete list of models/providers that you can call with LiteLLM, [check out our provider list](./providers/) +For a complete list of models/providers that you can call with LiteLLM, [check out our provider list](./providers/) or check out [models.litellm.ai](https://models.litellm.ai/) * OpenAI models - [OpenAI docs](./providers/openai.md) * gpt-4 diff --git a/docs/my-website/docs/rerank.md b/docs/my-website/docs/rerank.md index c57eacbb22..cad6471838 100644 --- a/docs/my-website/docs/rerank.md +++ b/docs/my-website/docs/rerank.md @@ -109,6 +109,8 @@ curl http://0.0.0.0:4000/rerank \ ## **Supported Providers** +#### ⚡️See all supported models and providers at [models.litellm.ai](https://models.litellm.ai/) + | Provider | Link to Usage | |-------------|--------------------| | Cohere (v1 + v2 clients) | [Usage](#quick-start) | diff --git a/docs/my-website/docs/response_api.md b/docs/my-website/docs/response_api.md index 3e9d6683c2..8bb10bbe36 100644 --- a/docs/my-website/docs/response_api.md +++ b/docs/my-website/docs/response_api.md @@ -3,8 +3,11 @@ import TabItem from '@theme/TabItem'; # /responses [Beta] + LiteLLM provides a BETA endpoint in the spec of [OpenAI's `/responses` API](https://platform.openai.com/docs/api-reference/responses) +Requests to /chat/completions may be bridged here automatically when the provider lacks support for that endpoint. The model’s default `mode` determines how bridging works.(see `model_prices_and_context_window`) + | Feature | Supported | Notes | |---------|-----------|--------| | Cost Tracking | ✅ | Works with all supported models | @@ -78,6 +81,43 @@ print(retrieved_response) # retrieved_response = await litellm.aget_responses(response_id=response_id) ``` +#### CANCEL a Response +You can cancel an in-progress response (if supported by the provider): + +```python showLineNumbers title="Cancel Response by ID" +import litellm + +# First, create a response +response = litellm.responses( + model="openai/o1-pro", + input="Tell me a three sentence bedtime story about a unicorn.", + max_output_tokens=100 +) + +# Get the response ID +response_id = response.id + +# Cancel the response by ID +cancel_response = litellm.cancel_responses( + response_id=response_id +) + +print(cancel_response) + +# For async usage +# cancel_response = await litellm.acancel_responses(response_id=response_id) +``` + + +**REST API:** +```bash +curl -X POST http://localhost:4000/v1/responses/response_id/cancel \ + -H "Authorization: Bearer sk-1234" +``` + +This will attempt to cancel the in-progress response with the given ID. +**Note:** Not all providers support response cancellation. If unsupported, an error will be raised. 
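If you want to degrade gracefully when a provider can't cancel, wrap the call. A small sketch — it assumes `response_id` from the example above, and catches broadly because the exact exception type depends on the provider:

```python
import litellm

try:
    cancel_response = litellm.cancel_responses(response_id=response_id)
    print(cancel_response)
except Exception as e:  # narrow to litellm's specific exception classes if you prefer
    print(f"Could not cancel response {response_id}: {e}")
```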
+ #### DELETE a Response ```python showLineNumbers title="Delete Response by ID" import litellm diff --git a/docs/my-website/img/default_user_settings_admin_ui.png b/docs/my-website/img/default_user_settings_admin_ui.png new file mode 100644 index 0000000000..5910154cd5 Binary files /dev/null and b/docs/my-website/img/default_user_settings_admin_ui.png differ diff --git a/docs/my-website/sidebars.js b/docs/my-website/sidebars.js index a131e5c34e..3750915b19 100644 --- a/docs/my-website/sidebars.js +++ b/docs/my-website/sidebars.js @@ -57,32 +57,31 @@ const sidebars = { type: "category", label: "Alerting & Monitoring", items: [ - "proxy/prometheus", "proxy/alerting", - "proxy/pagerduty" - ].sort() + "proxy/pagerduty", + "proxy/prometheus" + ] }, { type: "category", label: "[Beta] Prompt Management", items: [ - "proxy/prompt_management", + "proxy/custom_prompt_management", "proxy/native_litellm_prompt", - "proxy/custom_prompt_management" - ].sort() + "proxy/prompt_management" + ] }, { type: "category", label: "AI Tools (OpenWebUI, Claude Code, etc.)", items: [ - "integrations/letta", - "tutorials/openweb_ui", - "tutorials/openai_codex", - "tutorials/litellm_gemini_cli", - "tutorials/litellm_qwen_code_cli", - "tutorials/github_copilot_integration", "tutorials/claude_responses_api", "tutorials/cost_tracking_coding", + "tutorials/github_copilot_integration", + "tutorials/litellm_gemini_cli", + "tutorials/litellm_qwen_code_cli", + "tutorials/openai_codex", + "tutorials/openweb_ui" ] }, @@ -112,29 +111,115 @@ const sidebars = { label: "Setup & Deployment", items: [ "proxy/quick_start", - "proxy/user_onboarding", - "proxy/deploy", - "proxy/prod", "proxy/cli", - "proxy/release_cycle", - "proxy/model_management", - "proxy/health", "proxy/debugging", + "proxy/deploy", + "proxy/health", "proxy/master_key_rotations", + "proxy/model_management", + "proxy/prod", + "proxy/release_cycle", ], }, "proxy/demo", + { + type: "category", + label: "Admin UI", + items: [ + "proxy/admin_ui_sso", + "proxy/custom_root_ui", + "proxy/custom_sso", + "proxy/model_hub", + "proxy/public_teams", + "proxy/self_serve", + "proxy/ui", + "proxy/ui/bulk_edit_users", + "proxy/ui_credentials", + "tutorials/scim_litellm", + { + type: "category", + label: "UI Logs", + items: [ + "proxy/ui_logs", + "proxy/ui_logs_sessions" + ] + } + ], + }, { type: "category", label: "Architecture", - items: ["proxy/architecture", "proxy/control_plane_and_data_plane", "proxy/db_info", "proxy/db_deadlocks", "router_architecture", "proxy/user_management_heirarchy", "proxy/jwt_auth_arch", "proxy/image_handling", "proxy/spend_logs_deletion"], + items: [ + "proxy/architecture", + "proxy/control_plane_and_data_plane", + "proxy/db_deadlocks", + "proxy/db_info", + "proxy/image_handling", + "proxy/jwt_auth_arch", + "proxy/spend_logs_deletion", + "proxy/user_management_heirarchy", + "router_architecture" + ], }, { type: "link", label: "All Endpoints (Swagger)", href: "https://litellm-api.up.railway.app/", }, - "proxy/management_cli", + "proxy/enterprise", + "proxy/management_cli", + { + type: "category", + label: "Authentication", + items: [ + "proxy/virtual_keys", + "proxy/token_auth", + "proxy/service_accounts", + "proxy/access_control", + "proxy/cli_sso", + "proxy/custom_auth", + "proxy/ip_address", + "proxy/email", + "proxy/multiple_admins", + ], + }, + { + type: "category", + label: "Budgets + Rate Limits", + items: [ + "proxy/customers", + "proxy/dynamic_rate_limit", + "proxy/rate_limit_tiers", + "proxy/team_budgets", + "proxy/temporary_budget_increase", 
+ "proxy/users" + ], + }, + "proxy/caching", + { + type: "category", + label: "Create Custom Plugins", + description: "Modify requests, responses, and more", + items: [ + "proxy/call_hooks", + "proxy/rules", + ] + }, + { + type: "link", + label: "Load Balancing, Routing, Fallbacks", + href: "https://docs.litellm.ai/docs/routing-load-balancing", + }, + { + type: "category", + label: "Logging, Alerting, Metrics", + items: [ + "proxy/dynamic_logging", + "proxy/logging", + "proxy/logging_spec", + "proxy/team_logging" + ], + }, { type: "category", label: "Making LLM Requests", @@ -147,19 +232,6 @@ const sidebars = { "proxy/model_discovery", ], }, - { - type: "category", - label: "Authentication", - items: [ - "proxy/virtual_keys", - "proxy/token_auth", - "proxy/service_accounts", - "proxy/access_control", - "proxy/ip_address", - "proxy/email", - "proxy/custom_auth", - ], - }, { type: "category", label: "Model Access", @@ -168,73 +240,6 @@ const sidebars = { "proxy/team_model_add" ] }, - { - type: "category", - label: "Spend Tracking", - items: ["proxy/cost_tracking", "proxy/custom_pricing", "proxy/billing",], - }, - { - type: "category", - label: "Budgets + Rate Limits", - items: ["proxy/users", "proxy/temporary_budget_increase", "proxy/rate_limit_tiers", "proxy/team_budgets", "proxy/dynamic_rate_limit", "proxy/customers"], - }, - { - type: "category", - label: "Enterprise Features", - items: [ - "proxy/enterprise", - { - type: "category", - label: "Admin UI", - items: [ - "proxy/ui", - "proxy/admin_ui_sso", - "proxy/custom_root_ui", - "proxy/model_hub", - "proxy/self_serve", - "proxy/public_teams", - "proxy/ui_credentials", - "proxy/ui/bulk_edit_users", - { - type: "category", - label: "UI Logs", - items: [ - "proxy/ui_logs", - "proxy/ui_logs_sessions" - ] - } - ], - }, - { - type: "category", - label: "SSO & Identity Management", - items: [ - "proxy/cli_sso", - "proxy/admin_ui_sso", - "proxy/custom_sso", - "tutorials/scim_litellm", - "tutorials/msft_sso", - "proxy/multiple_admins", - ], - }, - ], - }, - { - type: "link", - label: "Load Balancing, Routing, Fallbacks", - href: "https://docs.litellm.ai/docs/routing-load-balancing", - }, - { - type: "category", - label: "Logging, Alerting, Metrics", - items: [ - "proxy/logging", - "proxy/logging_spec", - "proxy/team_logging", - "proxy/dynamic_logging" - ], - }, - { type: "category", label: "Secret Managers", @@ -245,14 +250,13 @@ const sidebars = { }, { type: "category", - label: "Create Custom Plugins", - description: "Modify requests, responses, and more", + label: "Spend Tracking", items: [ - "proxy/call_hooks", - "proxy/rules", - ] + "proxy/billing", + "proxy/cost_tracking", + "proxy/custom_pricing" + ], }, - "proxy/caching", ] }, { @@ -266,13 +270,11 @@ const sidebars = { slug: "/supported_endpoints", }, items: [ - "anthropic_unified", - "apply_guardrail", "assistants", { type: "category", label: "/audio", - "items": [ + items: [ "audio_transcription", "text_to_speech", ] @@ -301,6 +303,7 @@ const sidebars = { "completion/http_handler_config", ], }, + "text_completion", "embedding/supported_embedding", { type: "category", @@ -318,13 +321,14 @@ const sidebars = { "proxy/managed_finetuning", ] }, - "generateContent", + "generateContent", + "apply_guardrail", { type: "category", label: "/images", items: [ - "image_generation", "image_edits", + "image_generation", "image_variations", ] }, @@ -335,23 +339,23 @@ const sidebars = { label: "Pass-through Endpoints (Anthropic SDK, etc.)", items: [ "pass_through/intro", - "pass_through/vertex_ai", - 
"pass_through/google_ai_studio", + "pass_through/anthropic_completion", + "pass_through/assembly_ai", + "pass_through/bedrock", "pass_through/cohere", - "pass_through/vllm", + "pass_through/google_ai_studio", + "pass_through/langfuse", "pass_through/mistral", "pass_through/openai_passthrough", - "pass_through/anthropic_completion", - "pass_through/bedrock", - "pass_through/assembly_ai", - "pass_through/langfuse", - "proxy/pass_through", - ], + "pass_through/vertex_ai", + "pass_through/vllm", + "proxy/pass_through" + ] }, "realtime", "rerank", "response_api", - "text_completion", + "anthropic_unified", { type: "category", label: "/vector_stores", @@ -398,7 +402,6 @@ const sidebars = { items: [ "providers/azure_ai", "providers/azure_ai_img", - "providers/azure_ai_img_edit", ] }, { @@ -515,33 +518,32 @@ const sidebars = { type: "category", label: "Guides", items: [ - "exception_mapping", + "completion/audio", + "completion/batching", + "completion/computer_use", + "completion/document_understanding", + "completion/drop_params", + "completion/function_call", + "completion/image_generation_chat", + "completion/json_mode", + "completion/knowledgebase", + "completion/message_trimming", + "completion/model_alias", + "completion/mock_requests", + "completion/predict_outputs", + "completion/prefix", + "completion/prompt_caching", + "completion/prompt_formatting", + "completion/reliable_completions", + "completion/stream", "completion/provider_specific_params", + "completion/vision", + "completion/web_search", + "exception_mapping", "guides/finetuned_models", "guides/security_settings", - "completion/audio", - "completion/image_generation_chat", - "completion/web_search", - "completion/document_understanding", - "completion/vision", - "completion/json_mode", - "reasoning_content", - "completion/computer_use", - "completion/prompt_caching", - "completion/predict_outputs", - "completion/knowledgebase", - "completion/prefix", - "completion/drop_params", - "completion/prompt_formatting", - "completion/stream", - "completion/message_trimming", - "completion/function_call", - "completion/model_alias", - "completion/batching", - "completion/mock_requests", - "completion/reliable_completions", "proxy/veo_video_generation", - + "reasoning_content" ] }, @@ -554,26 +556,35 @@ const sidebars = { description: "Learn how to load balance, route, and set fallbacks for your LLM requests", slug: "/routing-load-balancing", }, - items: ["routing", "scheduler", "proxy/load_balancing", "proxy/reliability", "proxy/timeout", "proxy/auto_routing", "proxy/tag_routing", "proxy/provider_budget_routing", "wildcard_routing"], + items: [ + "routing", + "scheduler", + "proxy/auto_routing", + "proxy/load_balancing", + "proxy/provider_budget_routing", + "proxy/reliability", + "proxy/tag_routing", + "proxy/timeout", + "wildcard_routing" + ], }, { type: "category", label: "LiteLLM Python SDK", items: [ "set_keys", - "completion/token_usage", - "sdk/headers", - "sdk_custom_pricing", - "embedding/async_embedding", - "embedding/moderation", "budget_manager", "caching/all_caches", + "completion/token_usage", + "embedding/async_embedding", + "embedding/moderation", "migration", + "sdk_custom_pricing", { type: "category", label: "LangChain, LlamaIndex, Instructor Integration", items: ["langchain/langchain", "tutorials/instructor"], - }, + } ], }, diff --git a/security.md b/security.md index d126dabcc6..2da073661c 100644 --- a/security.md +++ b/security.md @@ -12,6 +12,11 @@ - For installation and configuration, see: [Self-hosting 
guide](https://docs.litellm.ai/docs/proxy/deploy)
- **Telemetry** We run no telemetry when you self host LiteLLM
+
+:::info
+✨ SSO is free for up to 5 users. After that, an enterprise license is required. [Get Started with Enterprise here](https://www.litellm.ai/enterprise)
+:::
+
### LiteLLM Cloud

- We encrypt all data stored using your `LITELLM_MASTER_KEY` and in transit using TLS.