Corrected docs updates Sept 2025 (#14916)

* docs: Corrected documentation updates from Sept 2025

This PR contains the actual intended documentation changes, properly synced with main:

 Real changes applied:
- Added AWS authentication link to bedrock guardrails documentation
- Updated Vertex AI with Gemini API alternative configuration
- Added async_post_call_success_hook code snippet to custom callback docs
- Added SSO free for up to 5 users information to enterprise and custom_sso docs
- Added SSO free information block to security.md
- Added cancel response API usage and curl example to response_api.md
- Added image for modifying default user budget via admin UI
- Re-ordered sidebars in documentation

 Sync issues resolved:
- Kept all upstream changes that were added to main after branch diverged
- Preserved Provider-Specific Metadata Parameters section that was added upstream
- Maintained proper curl parameter formatting (-d instead of -D)

This corrects the sync issues from the original PR #14769.

* docs: Restore missing files from original PR

Added back ~16 missing documentation files that were part of the original PR:

 Restored files:
- docs/my-website/docs/completion/usage.md
- docs/my-website/docs/fine_tuning.md
- docs/my-website/docs/getting_started.md
- docs/my-website/docs/image_edits.md
- docs/my-website/docs/image_generation.md
- docs/my-website/docs/index.md
- docs/my-website/docs/moderation.md
- docs/my-website/docs/observability/callbacks.md
- docs/my-website/docs/providers/bedrock.md
- docs/my-website/docs/proxy/caching.md
- docs/my-website/docs/proxy/config_settings.md
- docs/my-website/docs/proxy/db_deadlocks.md
- docs/my-website/docs/proxy/load_balancing.md
- docs/my-website/docs/proxy_api.md
- docs/my-website/docs/rerank.md

 Fixed context-caching issue:
- Restored provider_specific_params.md to main version (preserving Provider-Specific Metadata Parameters section)
- Your original PR didn't intend to modify this file - it was just a sync issue

Now includes all ~26 documentation files from the original PR #14769.

* docs: Remove files that were deleted in original PR

- Removed docs/my-website/docs/providers/azure_ai_img_edit.md (was deleted in original PR)
- sdk/headers.md was already not present

Now matches the complete intended changes from original PR #14769.

* docs: Restore azure_ai_img_edit.md from main

- Restored docs/my-website/docs/providers/azure_ai_img_edit.md from main branch
- This file should not have been deleted, as it was added in a newer commit
- SDK headers file doesn't exist in main (was reverted) and wasn't part of your original changes

Fixes the file restoration issues.

* docs: Fix vertex.md - preserve context caching from newer commit

- Restored vertex.md to main version to preserve context caching content (lines 817-887)
- Added back only your intended change: alternative gemini config example
- Context caching content from newer commit is now preserved

Fixes the vertex.md sync issue where newer content was incorrectly deleted.

* docs: Fix providers/bedrock.md - restore deleted content from newer commit

- Restored providers/bedrock.md to main version
- Preserves 'Usage - Request Metadata' section that was added in newer commit
- Your actual intended change was to proxy/guardrails/bedrock.md (authentication tip) which is preserved
- Now only has additions, no subtractions as intended

Fixes the bedrock.md sync issue.

* docs: Restore missing IAM policy section in bedrock.md

Added back your intended IAM policy documentation that was lost when restoring main version:

 Added IAM AssumeRole Policy section:
- Explains requirement for sts:AssumeRole permission
- Shows error message example when permission missing
- Provides complete IAM policy JSON example
- Links to AWS AssumeRole documentation
- Clarifies trust policy requirements

Now bedrock.md has both:
- All newer content preserved (Request Metadata section)
- Your intended IAM policy addition restored

---------

Co-authored-by: Cursor Agent <cursoragent@cursor.com>
Author: Teddy Amkie
Date: 2025-09-25 15:49:19 -07:00
Committed by: GitHub
Parent: 2dd38420a7
Commit: dcbccd1fea
25 changed files with 371 additions and 162 deletions

View File

@@ -26,6 +26,7 @@ response = completion(
print(response.usage)
```
> **Note:** LiteLLM supports endpoint bridging—if a model does not natively support a requested endpoint, LiteLLM will automatically route the call to the correct supported endpoint (such as bridging `/chat/completions` to `/responses` or vice versa) based on the model's `mode` set in `model_prices_and_context_window`.
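For example, you can inspect a model's `mode` to see which native endpoint it maps to; a minimal sketch, assuming `litellm.get_model_info()` surfaces the `mode` field from `model_prices_and_context_window`:
```python
import litellm

# Look up model metadata (max tokens, costs, mode, etc.)
info = litellm.get_model_info(model="openai/o1-pro")  # placeholder model id

# "mode" (e.g. "chat" or "responses") indicates the model's native endpoint,
# and therefore which direction LiteLLM bridges the request
print(info.get("mode"))
```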
## Streaming Usage

View File

@@ -1,6 +1,11 @@
import Image from '@theme/IdealImage';
# Enterprise
:::info
✨ SSO is free for up to 5 users. After that, an enterprise license is required. [Get Started with Enterprise here](https://www.litellm.ai/enterprise)
:::
For companies that need SSO, user management and professional support for LiteLLM Proxy
:::info

View File

@@ -13,6 +13,8 @@ This is an Enterprise only endpoint [Get Started with Enterprise here](https://c
| Feature | Supported | Notes |
|-------|-------|-------|
| Supported Providers | OpenAI, Azure OpenAI, Vertex AI | - |
#### ⚡See an exhaustive list of supported models and providers at [models.litellm.ai](https://models.litellm.ai/)
| Cost Tracking | 🟡 | [Let us know if you need this](https://github.com/BerriAI/litellm/issues) |
| Logging | ✅ | Works across all logging integrations |

View File

@@ -32,7 +32,8 @@ Next Steps 👉 [Call all supported models - e.g. Claude-2, Llama2-70b, etc.](./
More details 👉
- [Completion() function details](./completion/)
- [All supported models / providers on LiteLLM](./providers/)
- [Overview of supported models / providers on LiteLLM](./providers/)
- [Search all models / providers](https://models.litellm.ai/)
- [Build your own OpenAI proxy](https://github.com/BerriAI/liteLLM-proxy/tree/main)
## streaming

View File

@@ -18,6 +18,9 @@ LiteLLM provides image editing functionality that maps to OpenAI's `/images/edit
| Supported LiteLLM Proxy Versions | 1.71.1+ | |
| Supported LLM providers | **OpenAI** | Currently only `openai` is supported |
#### ⚡See all supported models and providers at [models.litellm.ai](https://models.litellm.ai/)
## Usage
### LiteLLM Python SDK

View File

@@ -279,6 +279,8 @@ print(f"response: {response}")
## Supported Providers
#### ⚡See all supported models and providers at [models.litellm.ai](https://models.litellm.ai/)
| Provider | Documentation Link |
|----------|-------------------|
| OpenAI | [OpenAI Image Generation →](./providers/openai) |

View File

@@ -524,6 +524,15 @@ try:
except OpenAIError as e:
print(e)
```
### See How LiteLLM Transforms Your Requests
Want to understand how LiteLLM parses and normalizes your LLM API requests? Use the `/utils/transform_request` endpoint to see exactly how your request is transformed internally.
You can try it out now directly on our Demo App!
Go to the [LiteLLM API docs for transform_request](https://litellm-api.up.railway.app/#/llm%20utils/transform_request_utils_transform_request_post)
LiteLLM will show you the normalized, provider-agnostic version of your request. This is useful for debugging, learning, and understanding how LiteLLM handles different providers and options.
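You can also call the endpoint programmatically; a minimal sketch against a local proxy, where the payload field names (`call_type`, `request_body`) and the `sk-1234` key are assumptions to verify against the linked API docs:
```python
import requests

# Hypothetical payload - confirm the exact schema in the transform_request API docs
payload = {
    "call_type": "completion",
    "request_body": {
        "model": "gpt-4o",
        "messages": [{"role": "user", "content": "Hello!"}],
    },
}

resp = requests.post(
    "http://0.0.0.0:4000/utils/transform_request",  # your LiteLLM proxy URL
    headers={"Authorization": "Bearer sk-1234"},  # placeholder key
    json=payload,
)
print(resp.json())  # the normalized, provider-agnostic request
```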
### Logging Observability - Log LLM Input/Output ([Docs](https://docs.litellm.ai/docs/observability/callbacks))
LiteLLM exposes pre defined callbacks to send data to Lunary, MLflow, Langfuse, Helicone, Promptlayer, Traceloop, Slack

View File

@@ -130,6 +130,8 @@ Here's the exact json output and type you can expect from all moderation calls:
## **Supported Providers**
#### ⚡See all supported models and providers at [models.litellm.ai](https://models.litellm.ai/)
| Provider |
|-------------|
| OpenAI |

View File

@@ -5,13 +5,15 @@
liteLLM provides `input_callbacks`, `success_callbacks` and `failure_callbacks`, making it easy for you to send data to a particular provider depending on the status of your responses.
:::tip
**New to LiteLLM Callbacks?** Check out our comprehensive [Callback Management Guide](./callback_management.md) to understand when to use different callback hooks like `async_log_success_event` vs `async_post_call_success_hook`.
**New to LiteLLM Callbacks?**
- For proxy/server logging and observability, see the [Proxy Logging Guide](https://docs.litellm.ai/docs/proxy/logging).
- To write your own callback logic, see the [Custom Callbacks Guide](https://docs.litellm.ai/docs/observability/custom_callback).
:::
liteLLM supports:
- [Custom Callback Functions](https://docs.litellm.ai/docs/observability/custom_callback)
- [Callback Management Guide](./callback_management.md) - **Comprehensive guide for choosing the right hooks**
### Supported Callback Integrations
- [Lunary](https://lunary.ai/docs)
- [Langfuse](https://langfuse.com/docs)
- [LangSmith](https://www.langchain.com/langsmith)
@@ -21,9 +23,20 @@ liteLLM supports:
- [Sentry](https://docs.sentry.io/platforms/python/)
- [PostHog](https://posthog.com/docs/libraries/python)
- [Slack](https://slack.dev/bolt-python/concepts)
- [Arize](https://docs.arize.com/)
- [PromptLayer](https://docs.promptlayer.com/)
This is **not** an exhaustive list. Please check the dropdown for all logging integrations.
### Related Cookbooks
Try out our cookbooks for code snippets and interactive demos:
- [Langfuse Callback Example (Colab)](https://colab.research.google.com/github/BerriAI/litellm/blob/main/cookbook/logging_observability/LiteLLM_Langfuse.ipynb)
- [Lunary Callback Example (Colab)](https://colab.research.google.com/github/BerriAI/litellm/blob/main/cookbook/logging_observability/LiteLLM_Lunary.ipynb)
- [Arize Callback Example (Colab)](https://colab.research.google.com/github/BerriAI/litellm/blob/main/cookbook/logging_observability/LiteLLM_Arize.ipynb)
- [Proxy + Langfuse Callback Example (Colab)](https://colab.research.google.com/github/BerriAI/litellm/blob/main/cookbook/logging_observability/LiteLLM_Proxy_Langfuse.ipynb)
- [PromptLayer Callback Example (Colab)](https://colab.research.google.com/github/BerriAI/litellm/blob/main/cookbook/LiteLLM_PromptLayer.ipynb)
### Quick Start
```python

View File

@@ -67,6 +67,23 @@ asyncio.run(completion())
- `async_post_call_success_hook` - Access user data + modify responses
- `async_pre_call_hook` - Modify requests before sending
### Example: Modifying the Response in async_post_call_success_hook
You can use `async_post_call_success_hook` to add custom headers or metadata to the response before it is returned to the client. For example:
```python
async def async_post_call_success_hook(data, user_api_key_dict, response):
    # Add a custom header to the response
    additional_headers = getattr(response, "_hidden_params", {}).get("additional_headers", {}) or {}
    additional_headers["x-litellm-custom-header"] = "my-value"
    if not hasattr(response, "_hidden_params"):
        response._hidden_params = {}
    response._hidden_params["additional_headers"] = additional_headers
    return response
```
This allows you to inject custom metadata or headers into the response for downstream consumers. You can use this pattern to pass information to clients, proxies, or observability tools.
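On the LiteLLM proxy this hook usually lives on a `CustomLogger` subclass that you register from your config; a minimal sketch, assuming a `custom_callbacks.py` module referenced via `callbacks: custom_callbacks.proxy_handler_instance` in `config.yaml`:
```python
# custom_callbacks.py (hypothetical module name)
from litellm.integrations.custom_logger import CustomLogger


class HeaderInjectionHandler(CustomLogger):
    async def async_post_call_success_hook(self, data, user_api_key_dict, response):
        # Same header-injection logic as above, now as a method on the handler
        additional_headers = getattr(response, "_hidden_params", {}).get("additional_headers", {}) or {}
        additional_headers["x-litellm-custom-header"] = "my-value"
        if not hasattr(response, "_hidden_params"):
            response._hidden_params = {}
        response._hidden_params["additional_headers"] = additional_headers
        return response


# Instance referenced from config.yaml (callbacks: custom_callbacks.proxy_handler_instance)
proxy_handler_instance = HeaderInjectionHandler()
```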
## Callback Functions
If you just want to log on a specific event (e.g. on input) - you can use callback functions.

View File

@@ -2340,6 +2340,39 @@ response = completion(
Make the bedrock completion call
---
### Required AWS IAM Policy for AssumeRole
To use `aws_role_name` (STS AssumeRole) with LiteLLM, your IAM user or role **must** have permission to call `sts:AssumeRole` on the target role. If you see an error like:
```
An error occurred (AccessDenied) when calling the AssumeRole operation: User: arn:aws:sts::...:assumed-role/litellm-ecs-task-role/... is not authorized to perform: sts:AssumeRole on resource: arn:aws:iam::...:role/Enterprise/BedrockCrossAccountConsumer
```
This means the IAM identity running LiteLLM does **not** have permission to assume the target role. You must update your IAM policy to allow this action.
#### Example IAM Policy
Replace `<TARGET_ROLE_ARN>` with the ARN of the role you want to assume (e.g., `arn:aws:iam::123456789012:role/Enterprise/BedrockCrossAccountConsumer`).
```json
{
  "Version": "2012-10-17",
  "Statement": [
    {
      "Effect": "Allow",
      "Action": "sts:AssumeRole",
      "Resource": "<TARGET_ROLE_ARN>"
    }
  ]
}
```
**Note:** The target role itself must also trust the calling IAM identity (via its trust policy) for AssumeRole to succeed. See [AWS AssumeRole docs](https://docs.aws.amazon.com/IAM/latest/UserGuide/id_roles_use_switch-role-api.html) for more details.
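With the policy in place, you pass the target role on the request itself; a minimal SDK sketch with placeholder values, assuming the `aws_role_name` / `aws_session_name` parameters described elsewhere on this page:
```python
import litellm

response = litellm.completion(
    model="bedrock/anthropic.claude-3-sonnet-20240229-v1:0",  # placeholder model id
    messages=[{"role": "user", "content": "Hello from a cross-account role"}],
    # <TARGET_ROLE_ARN> from the policy example above
    aws_role_name="arn:aws:iam::123456789012:role/Enterprise/BedrockCrossAccountConsumer",
    aws_session_name="litellm-bedrock-session",
)
print(response.choices[0].message.content)
```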
---
<Tabs>
<TabItem value="sdk" label="SDK">

View File

@@ -196,6 +196,19 @@ model_list:
vertex_location: "us-central1"
vertex_credentials: "/path/to/service_account.json" # [OPTIONAL] Do this OR `!gcloud auth application-default login` - run this to add vertex credentials to your env
```
or
```yaml
model_list:
  - model_name: gemini-pro
    litellm_params:
      model: vertex_ai/gemini-1.5-pro
      litellm_credential_name: vertex-global
      vertex_project: project-name-here
      vertex_location: global
      base_model: gemini
    model_info:
      provider: Vertex
```
2. Start Proxy

View File

@@ -958,6 +958,19 @@ curl http://localhost:4000/v1/chat/completions \
</Tabs>
## Redis max_connections
You can set the `max_connections` parameter in your `cache_params` for Redis. This is passed directly to the Redis client and controls the maximum number of simultaneous connections in the pool. If you see errors like `No connection available`, try increasing this value:
```yaml
litellm_settings:
  cache: true
  cache_params:
    type: redis
    max_connections: 100
```
## Supported `cache_params` on proxy config.yaml
```yaml
@@ -966,6 +979,7 @@ cache_params:
  ttl: Optional[float]
  default_in_memory_ttl: Optional[float]
  default_in_redis_ttl: Optional[float]
  max_connections: Optional[int]
  # Type of cache (options: "local", "redis", "s3")
  type: s3

View File

@@ -50,6 +50,7 @@ litellm_settings:
port: 6379 # The port number for the Redis cache. Required if type is "redis".
password: "your_password" # The password for the Redis cache. Required if type is "redis".
namespace: "litellm.caching.caching" # namespace for redis cache
max_connections: 100 # [OPTIONAL] Set Maximum number of Redis connections. Passed directly to redis-py.
# Optional - Redis Cluster Settings
redis_startup_nodes: [{"host": "127.0.0.1", "port": "7001"}]

View File

@@ -1,9 +1,7 @@
# ✨ Event Hooks for SSO Login
:::info
✨ This is an Enterprise only feature [Get Started with Enterprise here](https://www.litellm.ai/enterprise)
✨ SSO is free for up to 5 users. After that, an enterprise license is required. [Get Started with Enterprise here](https://www.litellm.ai/enterprise)
:::
## Overview

View File

@@ -84,3 +84,29 @@ LiteLLM emits the following prometheus metrics to monitor the health/status of t
| `litellm_in_memory_spend_update_queue_size` | In-memory aggregate spend values for keys, users, teams, team members, etc.| In-Memory |
| `litellm_redis_spend_update_queue_size` | Redis aggregate spend values for keys, users, teams, etc. | Redis |
## Troubleshooting: Redis Connection Errors
You may see errors like:
```
LiteLLM Redis Caching: async async_increment() - Got exception from REDIS No connection available., Writing value=21
LiteLLM Redis Caching: async set_cache_pipeline() - Got exception from REDIS No connection available., Writing value=None
```
This means all available Redis connections are in use, and LiteLLM cannot obtain a new connection from the pool. This can happen under high load or with many concurrent proxy requests.
**Solution:**
- Increase the `max_connections` parameter in your Redis config section in `proxy_config.yaml` to allow more simultaneous connections. For example:
```yaml
litellm_settings:
  cache: True
  cache_params:
    type: redis
    max_connections: 100 # Increase as needed for your traffic
```
Adjust this value based on your expected concurrency and Redis server capacity.

View File

@@ -4,6 +4,10 @@ import TabItem from '@theme/TabItem';
# Bedrock Guardrails
:::tip ⚡️
If you haven't set up or authenticated your Bedrock provider yet, see the [Bedrock Provider Setup & Authentication Guide](../../providers/bedrock.md).
:::
LiteLLM supports Bedrock guardrails via the [Bedrock ApplyGuardrail API](https://docs.aws.amazon.com/bedrock/latest/APIReference/API_runtime_ApplyGuardrail.html).
## Quick Start

View File

@@ -172,6 +172,9 @@ router_settings:
  redis_host: <your redis host>
  redis_password: <your redis password>
  redis_port: 1992
  cache_params:
    type: redis
    max_connections: 100 # maximum Redis connections in the pool; tune based on expected concurrency/load
```
## Router settings on config - routing_strategy, model_group_alias

View File

@@ -227,7 +227,7 @@ export PROXY_LOGOUT_URL="https://www.google.com"
<Image img={require('../../img/ui_logout.png')} style={{ width: '400px', height: 'auto' }} />
### Set max budget for internal users
### Set default max budget for internal users
Automatically apply budget per internal user when they sign up. By default the table will be checked every 10 minutes, for users to reset. To modify this, [see this](./users.md#reset-budgets)
@@ -239,6 +239,10 @@ litellm_settings:
This sets a max budget of $10 USD for internal users when they sign up.
You can also manage these settings visually in the UI:
<Image img={require('../../img/default_user_settings_admin_ui.png')} style={{ width: '700px', height: 'auto' }} />
This budget only applies to personal keys created by that user - seen under `Default Team` on the UI.
<Image img={require('../../img/max_budget_for_internal_users.png')} style={{ width: '500px', height: 'auto' }} />

View File

@@ -27,7 +27,7 @@ Email us @ krrish@berri.ai
## Supported Models for LiteLLM Key
These are the models that currently work with the "sk-litellm-.." keys.
For a complete list of models/providers that you can call with LiteLLM, [check out our provider list](./providers/)
For a complete list of models/providers that you can call with LiteLLM, [check out our provider list](./providers/) or check out [models.litellm.ai](https://models.litellm.ai/)
* OpenAI models - [OpenAI docs](./providers/openai.md)
* gpt-4

View File

@@ -109,6 +109,8 @@ curl http://0.0.0.0:4000/rerank \
## **Supported Providers**
#### ⚡See all supported models and providers at [models.litellm.ai](https://models.litellm.ai/)
| Provider | Link to Usage |
|-------------|--------------------|
| Cohere (v1 + v2 clients) | [Usage](#quick-start) |

View File

@@ -3,8 +3,11 @@ import TabItem from '@theme/TabItem';
# /responses [Beta]
LiteLLM provides a BETA endpoint in the spec of [OpenAI's `/responses` API](https://platform.openai.com/docs/api-reference/responses)
Requests to /chat/completions may be bridged here automatically when the provider lacks support for that endpoint. The model's default `mode` determines how bridging works (see `model_prices_and_context_window`).
| Feature | Supported | Notes |
|---------|-----------|--------|
| Cost Tracking | ✅ | Works with all supported models |
@@ -78,6 +81,43 @@ print(retrieved_response)
# retrieved_response = await litellm.aget_responses(response_id=response_id)
```
#### CANCEL a Response
You can cancel an in-progress response (if supported by the provider):
```python showLineNumbers title="Cancel Response by ID"
import litellm

# First, create a response
response = litellm.responses(
    model="openai/o1-pro",
    input="Tell me a three sentence bedtime story about a unicorn.",
    max_output_tokens=100
)

# Get the response ID
response_id = response.id

# Cancel the response by ID
cancel_response = litellm.cancel_responses(
    response_id=response_id
)

print(cancel_response)

# For async usage
# cancel_response = await litellm.acancel_responses(response_id=response_id)
```
**REST API:**
```bash
curl -X POST http://localhost:4000/v1/responses/response_id/cancel \
-H "Authorization: Bearer sk-1234"
```
This will attempt to cancel the in-progress response with the given ID.
**Note:** Not all providers support response cancellation. If unsupported, an error will be raised.
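If you want to handle the unsupported case gracefully, a minimal sketch that reuses `response_id` from the snippet above:
```python
try:
    cancel_response = litellm.cancel_responses(response_id=response_id)
    print(cancel_response)
except Exception as e:
    # Provider does not support cancellation, or the response already finished
    print(f"Could not cancel response {response_id}: {e}")
```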
#### DELETE a Response
```python showLineNumbers title="Delete Response by ID"
import litellm

Binary image file added (234 KiB); contents not shown.

View File

@@ -57,32 +57,31 @@ const sidebars = {
type: "category",
label: "Alerting & Monitoring",
items: [
"proxy/prometheus",
"proxy/alerting",
"proxy/pagerduty"
].sort()
"proxy/pagerduty",
"proxy/prometheus"
]
},
{
type: "category",
label: "[Beta] Prompt Management",
items: [
"proxy/prompt_management",
"proxy/custom_prompt_management",
"proxy/native_litellm_prompt",
"proxy/custom_prompt_management"
].sort()
"proxy/prompt_management"
]
},
{
type: "category",
label: "AI Tools (OpenWebUI, Claude Code, etc.)",
items: [
"integrations/letta",
"tutorials/openweb_ui",
"tutorials/openai_codex",
"tutorials/litellm_gemini_cli",
"tutorials/litellm_qwen_code_cli",
"tutorials/github_copilot_integration",
"tutorials/claude_responses_api",
"tutorials/cost_tracking_coding",
"tutorials/github_copilot_integration",
"tutorials/litellm_gemini_cli",
"tutorials/litellm_qwen_code_cli",
"tutorials/openai_codex",
"tutorials/openweb_ui"
]
},
@@ -112,29 +111,115 @@ const sidebars = {
label: "Setup & Deployment",
items: [
"proxy/quick_start",
"proxy/user_onboarding",
"proxy/deploy",
"proxy/prod",
"proxy/cli",
"proxy/release_cycle",
"proxy/model_management",
"proxy/health",
"proxy/debugging",
"proxy/deploy",
"proxy/health",
"proxy/master_key_rotations",
"proxy/model_management",
"proxy/prod",
"proxy/release_cycle",
],
},
"proxy/demo",
{
type: "category",
label: "Admin UI",
items: [
"proxy/admin_ui_sso",
"proxy/custom_root_ui",
"proxy/custom_sso",
"proxy/model_hub",
"proxy/public_teams",
"proxy/self_serve",
"proxy/ui",
"proxy/ui/bulk_edit_users",
"proxy/ui_credentials",
"tutorials/scim_litellm",
{
type: "category",
label: "UI Logs",
items: [
"proxy/ui_logs",
"proxy/ui_logs_sessions"
]
}
],
},
{
type: "category",
label: "Architecture",
items: ["proxy/architecture", "proxy/control_plane_and_data_plane", "proxy/db_info", "proxy/db_deadlocks", "router_architecture", "proxy/user_management_heirarchy", "proxy/jwt_auth_arch", "proxy/image_handling", "proxy/spend_logs_deletion"],
items: [
"proxy/architecture",
"proxy/control_plane_and_data_plane",
"proxy/db_deadlocks",
"proxy/db_info",
"proxy/image_handling",
"proxy/jwt_auth_arch",
"proxy/spend_logs_deletion",
"proxy/user_management_heirarchy",
"router_architecture"
],
},
{
type: "link",
label: "All Endpoints (Swagger)",
href: "https://litellm-api.up.railway.app/",
},
"proxy/management_cli",
"proxy/enterprise",
"proxy/management_cli",
{
type: "category",
label: "Authentication",
items: [
"proxy/virtual_keys",
"proxy/token_auth",
"proxy/service_accounts",
"proxy/access_control",
"proxy/cli_sso",
"proxy/custom_auth",
"proxy/ip_address",
"proxy/email",
"proxy/multiple_admins",
],
},
{
type: "category",
label: "Budgets + Rate Limits",
items: [
"proxy/customers",
"proxy/dynamic_rate_limit",
"proxy/rate_limit_tiers",
"proxy/team_budgets",
"proxy/temporary_budget_increase",
"proxy/users"
],
},
"proxy/caching",
{
type: "category",
label: "Create Custom Plugins",
description: "Modify requests, responses, and more",
items: [
"proxy/call_hooks",
"proxy/rules",
]
},
{
type: "link",
label: "Load Balancing, Routing, Fallbacks",
href: "https://docs.litellm.ai/docs/routing-load-balancing",
},
{
type: "category",
label: "Logging, Alerting, Metrics",
items: [
"proxy/dynamic_logging",
"proxy/logging",
"proxy/logging_spec",
"proxy/team_logging"
],
},
{
type: "category",
label: "Making LLM Requests",
@@ -147,19 +232,6 @@ const sidebars = {
"proxy/model_discovery",
],
},
{
type: "category",
label: "Authentication",
items: [
"proxy/virtual_keys",
"proxy/token_auth",
"proxy/service_accounts",
"proxy/access_control",
"proxy/ip_address",
"proxy/email",
"proxy/custom_auth",
],
},
{
type: "category",
label: "Model Access",
@@ -168,73 +240,6 @@ const sidebars = {
"proxy/team_model_add"
]
},
{
type: "category",
label: "Spend Tracking",
items: ["proxy/cost_tracking", "proxy/custom_pricing", "proxy/billing",],
},
{
type: "category",
label: "Budgets + Rate Limits",
items: ["proxy/users", "proxy/temporary_budget_increase", "proxy/rate_limit_tiers", "proxy/team_budgets", "proxy/dynamic_rate_limit", "proxy/customers"],
},
{
type: "category",
label: "Enterprise Features",
items: [
"proxy/enterprise",
{
type: "category",
label: "Admin UI",
items: [
"proxy/ui",
"proxy/admin_ui_sso",
"proxy/custom_root_ui",
"proxy/model_hub",
"proxy/self_serve",
"proxy/public_teams",
"proxy/ui_credentials",
"proxy/ui/bulk_edit_users",
{
type: "category",
label: "UI Logs",
items: [
"proxy/ui_logs",
"proxy/ui_logs_sessions"
]
}
],
},
{
type: "category",
label: "SSO & Identity Management",
items: [
"proxy/cli_sso",
"proxy/admin_ui_sso",
"proxy/custom_sso",
"tutorials/scim_litellm",
"tutorials/msft_sso",
"proxy/multiple_admins",
],
},
],
},
{
type: "link",
label: "Load Balancing, Routing, Fallbacks",
href: "https://docs.litellm.ai/docs/routing-load-balancing",
},
{
type: "category",
label: "Logging, Alerting, Metrics",
items: [
"proxy/logging",
"proxy/logging_spec",
"proxy/team_logging",
"proxy/dynamic_logging"
],
},
{
type: "category",
label: "Secret Managers",
@@ -245,14 +250,13 @@ const sidebars = {
},
{
type: "category",
label: "Create Custom Plugins",
description: "Modify requests, responses, and more",
label: "Spend Tracking",
items: [
"proxy/call_hooks",
"proxy/rules",
]
"proxy/billing",
"proxy/cost_tracking",
"proxy/custom_pricing"
],
},
"proxy/caching",
]
},
{
@@ -266,13 +270,11 @@ const sidebars = {
slug: "/supported_endpoints",
},
items: [
"anthropic_unified",
"apply_guardrail",
"assistants",
{
type: "category",
label: "/audio",
"items": [
items: [
"audio_transcription",
"text_to_speech",
]
@@ -301,6 +303,7 @@ const sidebars = {
"completion/http_handler_config",
],
},
"text_completion",
"embedding/supported_embedding",
{
type: "category",
@@ -318,13 +321,14 @@ const sidebars = {
"proxy/managed_finetuning",
]
},
"generateContent",
"generateContent",
"apply_guardrail",
{
type: "category",
label: "/images",
items: [
"image_generation",
"image_edits",
"image_generation",
"image_variations",
]
},
@@ -335,23 +339,23 @@ const sidebars = {
label: "Pass-through Endpoints (Anthropic SDK, etc.)",
items: [
"pass_through/intro",
"pass_through/vertex_ai",
"pass_through/google_ai_studio",
"pass_through/anthropic_completion",
"pass_through/assembly_ai",
"pass_through/bedrock",
"pass_through/cohere",
"pass_through/vllm",
"pass_through/google_ai_studio",
"pass_through/langfuse",
"pass_through/mistral",
"pass_through/openai_passthrough",
"pass_through/anthropic_completion",
"pass_through/bedrock",
"pass_through/assembly_ai",
"pass_through/langfuse",
"proxy/pass_through",
],
"pass_through/vertex_ai",
"pass_through/vllm",
"proxy/pass_through"
]
},
"realtime",
"rerank",
"response_api",
"text_completion",
"anthropic_unified",
{
type: "category",
label: "/vector_stores",
@@ -398,7 +402,6 @@ const sidebars = {
items: [
"providers/azure_ai",
"providers/azure_ai_img",
"providers/azure_ai_img_edit",
]
},
{
@@ -515,33 +518,32 @@ const sidebars = {
type: "category",
label: "Guides",
items: [
"exception_mapping",
"completion/audio",
"completion/batching",
"completion/computer_use",
"completion/document_understanding",
"completion/drop_params",
"completion/function_call",
"completion/image_generation_chat",
"completion/json_mode",
"completion/knowledgebase",
"completion/message_trimming",
"completion/model_alias",
"completion/mock_requests",
"completion/predict_outputs",
"completion/prefix",
"completion/prompt_caching",
"completion/prompt_formatting",
"completion/reliable_completions",
"completion/stream",
"completion/provider_specific_params",
"completion/vision",
"completion/web_search",
"exception_mapping",
"guides/finetuned_models",
"guides/security_settings",
"completion/audio",
"completion/image_generation_chat",
"completion/web_search",
"completion/document_understanding",
"completion/vision",
"completion/json_mode",
"reasoning_content",
"completion/computer_use",
"completion/prompt_caching",
"completion/predict_outputs",
"completion/knowledgebase",
"completion/prefix",
"completion/drop_params",
"completion/prompt_formatting",
"completion/stream",
"completion/message_trimming",
"completion/function_call",
"completion/model_alias",
"completion/batching",
"completion/mock_requests",
"completion/reliable_completions",
"proxy/veo_video_generation",
"reasoning_content"
]
},
@@ -554,26 +556,35 @@ const sidebars = {
description: "Learn how to load balance, route, and set fallbacks for your LLM requests",
slug: "/routing-load-balancing",
},
items: ["routing", "scheduler", "proxy/load_balancing", "proxy/reliability", "proxy/timeout", "proxy/auto_routing", "proxy/tag_routing", "proxy/provider_budget_routing", "wildcard_routing"],
items: [
"routing",
"scheduler",
"proxy/auto_routing",
"proxy/load_balancing",
"proxy/provider_budget_routing",
"proxy/reliability",
"proxy/tag_routing",
"proxy/timeout",
"wildcard_routing"
],
},
{
type: "category",
label: "LiteLLM Python SDK",
items: [
"set_keys",
"completion/token_usage",
"sdk/headers",
"sdk_custom_pricing",
"embedding/async_embedding",
"embedding/moderation",
"budget_manager",
"caching/all_caches",
"completion/token_usage",
"embedding/async_embedding",
"embedding/moderation",
"migration",
"sdk_custom_pricing",
{
type: "category",
label: "LangChain, LlamaIndex, Instructor Integration",
items: ["langchain/langchain", "tutorials/instructor"],
},
}
],
},

View File

@@ -12,6 +12,11 @@
- For installation and configuration, see: [Self-hosting guide](https://docs.litellm.ai/docs/proxy/deploy)
- **Telemetry** We run no telemetry when you self host LiteLLM
:::info
✨ SSO is free for up to 5 users. After that, an enterprise license is required. [Get Started with Enterprise here](https://www.litellm.ai/enterprise)
:::
### LiteLLM Cloud
- We encrypt all data stored using your `LITELLM_MASTER_KEY` and in transit using TLS.