Mirror of https://github.com/BerriAI/litellm.git (synced 2025-12-06 11:33:26 +08:00)
Corrected docs updates sept 2025 (#14916)
* docs: Corrected documentation updates from Sept 2025

  This PR contains the actual intended documentation changes, properly synced with main:

  ✅ Real changes applied:
  - Added AWS authentication link to bedrock guardrails documentation
  - Updated Vertex AI with Gemini API alternative configuration
  - Added async_post_call_success_hook code snippet to custom callback docs
  - Added SSO free for up to 5 users information to enterprise and custom_sso docs
  - Added SSO free information block to security.md
  - Added cancel response API usage and curl example to response_api.md
  - Added image for modifying default user budget via admin UI
  - Re-ordered sidebars in documentation

  ❌ Sync issues resolved:
  - Kept all upstream changes that were added to main after branch diverged
  - Preserved Provider-Specific Metadata Parameters section that was added upstream
  - Maintained proper curl parameter formatting (-d instead of -D)

  This corrects the sync issues from the original PR #14769.

* docs: Restore missing files from original PR

  Added back ~16 missing documentation files that were part of the original PR:

  ✅ Restored files:
  - docs/my-website/docs/completion/usage.md
  - docs/my-website/docs/fine_tuning.md
  - docs/my-website/docs/getting_started.md
  - docs/my-website/docs/image_edits.md
  - docs/my-website/docs/image_generation.md
  - docs/my-website/docs/index.md
  - docs/my-website/docs/moderation.md
  - docs/my-website/docs/observability/callbacks.md
  - docs/my-website/docs/providers/bedrock.md
  - docs/my-website/docs/proxy/caching.md
  - docs/my-website/docs/proxy/config_settings.md
  - docs/my-website/docs/proxy/db_deadlocks.md
  - docs/my-website/docs/proxy/load_balancing.md
  - docs/my-website/docs/proxy_api.md
  - docs/my-website/docs/rerank.md

  ✅ Fixed context-caching issue:
  - Restored provider_specific_params.md to main version (preserving Provider-Specific Metadata Parameters section)
  - Your original PR didn't intend to modify this file - it was just a sync issue

  Now includes all ~26 documentation files from the original PR #14769.

* docs: Remove files that were deleted in original PR

  - Removed docs/my-website/docs/providers/azure_ai_img_edit.md (was deleted in original PR)
  - sdk/headers.md was already not present

  Now matches the complete intended changes from original PR #14769.

* docs: Restore azure_ai_img_edit.md from main

  - Restored docs/my-website/docs/providers/azure_ai_img_edit.md from main branch
  - This file should not have been deleted as it was a newer commit
  - SDK headers file doesn't exist in main (was reverted) and wasn't part of your original changes

  Fixes the file restoration issues.

* docs: Fix vertex.md - preserve context caching from newer commit

  - Restored vertex.md to main version to preserve context caching content (lines 817-887)
  - Added back only your intended change: alternative gemini config example
  - Context caching content from newer commit is now preserved

  Fixes the vertex.md sync issue where newer content was incorrectly deleted.

* docs: Fix providers/bedrock.md - restore deleted content from newer commit

  - Restored providers/bedrock.md to main version
  - Preserves 'Usage - Request Metadata' section that was added in newer commit
  - Your actual intended change was to proxy/guardrails/bedrock.md (authentication tip) which is preserved
  - Now only has additions, no subtractions as intended

  Fixes the bedrock.md sync issue.

* docs: Restore missing IAM policy section in bedrock.md

  Added back your intended IAM policy documentation that was lost when restoring main version:

  ✅ Added IAM AssumeRole Policy section:
  - Explains requirement for sts:AssumeRole permission
  - Shows error message example when permission missing
  - Provides complete IAM policy JSON example
  - Links to AWS AssumeRole documentation
  - Clarifies trust policy requirements

  Now bedrock.md has both:
  - All newer content preserved (Request Metadata section)
  - Your intended IAM policy addition restored

  ---------

  Co-authored-by: Cursor Agent <cursoragent@cursor.com>
@@ -26,6 +26,7 @@ response = completion(
|
||||
|
||||
print(response.usage)
|
||||
```
|
||||
> **Note:** LiteLLM supports endpoint bridging: if a model does not natively support a requested endpoint, LiteLLM automatically routes the call to a supported endpoint (such as bridging `/chat/completions` to `/responses`, or vice versa) based on the model's `mode` set in `model_prices_and_context_window`.
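A quick way to see which endpoint a model is mapped to is to look up its `mode` via the SDK. A minimal sketch; the model name below is just an example:

```python
import litellm

# Look up a model's metadata from model_prices_and_context_window.
info = litellm.get_model_info("gpt-4o-mini")  # example model name
print(info["mode"])  # e.g. "chat" -> served natively via /chat/completions
```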
|
||||
|
||||
## Streaming Usage
|
||||
|
||||
|
||||
@@ -1,6 +1,11 @@
|
||||
import Image from '@theme/IdealImage';
|
||||
|
||||
# Enterprise
|
||||
|
||||
:::info
|
||||
✨ SSO is free for up to 5 users. After that, an enterprise license is required. [Get Started with Enterprise here](https://www.litellm.ai/enterprise)
|
||||
:::
|
||||
|
||||
For companies that need SSO, user management and professional support for LiteLLM Proxy
|
||||
|
||||
:::info
|
||||
|
||||
@@ -13,6 +13,8 @@ This is an Enterprise only endpoint [Get Started with Enterprise here](https://c
|
||||
| Feature | Supported | Notes |
|
||||
|-------|-------|-------|
|
||||
| Supported Providers | OpenAI, Azure OpenAI, Vertex AI | - |
|
||||
|
||||
#### ⚡️See an exhaustive list of supported models and providers at [models.litellm.ai](https://models.litellm.ai/)
|
||||
| Cost Tracking | 🟡 | [Let us know if you need this](https://github.com/BerriAI/litellm/issues) |
|
||||
| Logging | ✅ | Works across all logging integrations |
|
||||
|
||||
|
||||
@@ -32,7 +32,8 @@ Next Steps 👉 [Call all supported models - e.g. Claude-2, Llama2-70b, etc.](./
|
||||
More details 👉
|
||||
|
||||
- [Completion() function details](./completion/)
|
||||
- [All supported models / providers on LiteLLM](./providers/)
|
||||
- [Overview of supported models / providers on LiteLLM](./providers/)
|
||||
- [Search all models / providers](https://models.litellm.ai/)
|
||||
- [Build your own OpenAI proxy](https://github.com/BerriAI/liteLLM-proxy/tree/main)
|
||||
|
||||
## streaming
|
||||
|
||||
@@ -18,6 +18,9 @@ LiteLLM provides image editing functionality that maps to OpenAI's `/images/edit
|
||||
| Supported LiteLLM Proxy Versions | 1.71.1+ | |
|
||||
| Supported LLM providers | **OpenAI** | Currently only `openai` is supported |
|
||||
|
||||
#### ⚡️See all supported models and providers at [models.litellm.ai](https://models.litellm.ai/)
|
||||
|
||||
|
||||
## Usage
|
||||
|
||||
### LiteLLM Python SDK
|
||||
|
||||
@@ -279,6 +279,8 @@ print(f"response: {response}")
|
||||
|
||||
## Supported Providers
|
||||
|
||||
#### ⚡️See all supported models and providers at [models.litellm.ai](https://models.litellm.ai/)
|
||||
|
||||
| Provider | Documentation Link |
|
||||
|----------|-------------------|
|
||||
| OpenAI | [OpenAI Image Generation →](./providers/openai) |
|
||||
|
||||
@@ -524,6 +524,15 @@ try:
|
||||
except OpenAIError as e:
|
||||
print(e)
|
||||
```
|
||||
### See How LiteLLM Transforms Your Requests
|
||||
|
||||
Want to understand how LiteLLM parses and normalizes your LLM API requests? Use the `/utils/transform_request` endpoint to see exactly how your request is transformed internally.
|
||||
|
||||
You can try it out now directly on our Demo App!
|
||||
Go to the [LiteLLM API docs for transform_request](https://litellm-api.up.railway.app/#/llm%20utils/transform_request_utils_transform_request_post)
|
||||
|
||||
LiteLLM will show you the normalized, provider-agnostic version of your request. This is useful for debugging, learning, and understanding how LiteLLM handles different providers and options.
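For a locally running proxy, a minimal sketch of calling the endpoint looks like this. The payload field names (`call_type`, `request_body`) and the virtual key are assumptions; confirm the exact request schema in the Swagger docs linked above:

```python
import requests

# Assumes a proxy on localhost:4000 and a virtual key "sk-1234".
resp = requests.post(
    "http://localhost:4000/utils/transform_request",
    headers={"Authorization": "Bearer sk-1234"},
    json={
        "call_type": "completion",  # assumed field name - see Swagger docs
        "request_body": {
            "model": "gpt-4o-mini",
            "messages": [{"role": "user", "content": "Hello!"}],
        },
    },
)
print(resp.json())  # normalized, provider-agnostic view of the request
```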
|
||||
|
||||
|
||||
### Logging Observability - Log LLM Input/Output ([Docs](https://docs.litellm.ai/docs/observability/callbacks))
|
||||
LiteLLM exposes predefined callbacks to send data to Lunary, MLflow, Langfuse, Helicone, Promptlayer, Traceloop, and Slack
|
||||
|
||||
@@ -130,6 +130,8 @@ Here's the exact json output and type you can expect from all moderation calls:
|
||||
|
||||
## **Supported Providers**
|
||||
|
||||
#### ⚡️See all supported models and providers at [models.litellm.ai](https://models.litellm.ai/)
|
||||
|
||||
| Provider |
|
||||
|-------------|
|
||||
| OpenAI |
|
||||
|
||||
@@ -5,13 +5,15 @@
|
||||
liteLLM provides `input_callbacks`, `success_callbacks` and `failure_callbacks`, making it easy for you to send data to a particular provider depending on the status of your responses.
|
||||
|
||||
:::tip
|
||||
**New to LiteLLM Callbacks?** Check out our comprehensive [Callback Management Guide](./callback_management.md) to understand when to use different callback hooks like `async_log_success_event` vs `async_post_call_success_hook`.
|
||||
**New to LiteLLM Callbacks?**
|
||||
|
||||
- For proxy/server logging and observability, see the [Proxy Logging Guide](https://docs.litellm.ai/docs/proxy/logging).
|
||||
- To write your own callback logic, see the [Custom Callbacks Guide](https://docs.litellm.ai/docs/observability/custom_callback).
|
||||
:::
|
||||
|
||||
liteLLM supports:
|
||||
|
||||
- [Custom Callback Functions](https://docs.litellm.ai/docs/observability/custom_callback)
|
||||
- [Callback Management Guide](./callback_management.md) - **Comprehensive guide for choosing the right hooks**
|
||||
### Supported Callback Integrations
|
||||
|
||||
- [Lunary](https://lunary.ai/docs)
|
||||
- [Langfuse](https://langfuse.com/docs)
|
||||
- [LangSmith](https://www.langchain.com/langsmith)
|
||||
@@ -21,9 +23,20 @@ liteLLM supports:
|
||||
- [Sentry](https://docs.sentry.io/platforms/python/)
|
||||
- [PostHog](https://posthog.com/docs/libraries/python)
|
||||
- [Slack](https://slack.dev/bolt-python/concepts)
|
||||
- [Arize](https://docs.arize.com/)
|
||||
- [PromptLayer](https://docs.promptlayer.com/)
|
||||
|
||||
This is **not** an exhaustive list. Please check the dropdown for all logging integrations.
|
||||
|
||||
### Related Cookbooks
|
||||
Try out our cookbooks for code snippets and interactive demos:
|
||||
|
||||
- [Langfuse Callback Example (Colab)](https://colab.research.google.com/github/BerriAI/litellm/blob/main/cookbook/logging_observability/LiteLLM_Langfuse.ipynb)
|
||||
- [Lunary Callback Example (Colab)](https://colab.research.google.com/github/BerriAI/litellm/blob/main/cookbook/logging_observability/LiteLLM_Lunary.ipynb)
|
||||
- [Arize Callback Example (Colab)](https://colab.research.google.com/github/BerriAI/litellm/blob/main/cookbook/logging_observability/LiteLLM_Arize.ipynb)
|
||||
- [Proxy + Langfuse Callback Example (Colab)](https://colab.research.google.com/github/BerriAI/litellm/blob/main/cookbook/logging_observability/LiteLLM_Proxy_Langfuse.ipynb)
|
||||
- [PromptLayer Callback Example (Colab)](https://colab.research.google.com/github/BerriAI/litellm/blob/main/cookbook/LiteLLM_PromptLayer.ipynb)
|
||||
|
||||
### Quick Start
|
||||
|
||||
```python
|
||||
|
||||
@@ -67,6 +67,23 @@ asyncio.run(completion())
|
||||
- `async_post_call_success_hook` - Access user data + modify responses
|
||||
- `async_pre_call_hook` - Modify requests before sending
|
||||
|
||||
### Example: Modifying the Response in async_post_call_success_hook
|
||||
|
||||
You can use `async_post_call_success_hook` to add custom headers or metadata to the response before it is returned to the client. For example:
|
||||
|
||||
```python
|
||||
async def async_post_call_success_hook(data, user_api_key_dict, response):
    # Add a custom header to the response
    additional_headers = getattr(response, "_hidden_params", {}).get("additional_headers", {}) or {}
    additional_headers["x-litellm-custom-header"] = "my-value"
    if not hasattr(response, "_hidden_params"):
        response._hidden_params = {}
    response._hidden_params["additional_headers"] = additional_headers
    return response
```
|
||||
|
||||
This allows you to inject custom metadata or headers into the response for downstream consumers. You can use this pattern to pass information to clients, proxies, or observability tools.
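On the LiteLLM proxy, this hook typically lives on a `CustomLogger` subclass that your proxy config points at. A minimal sketch of wiring it up; the file, class, and instance names are placeholders:

```python
# custom_callbacks.py - referenced from the proxy config, e.g.
#   litellm_settings:
#     callbacks: custom_callbacks.proxy_handler_instance
from litellm.integrations.custom_logger import CustomLogger


class AddHeaderHandler(CustomLogger):
    async def async_post_call_success_hook(self, data, user_api_key_dict, response):
        # Same pattern as above: attach a custom response header
        headers = getattr(response, "_hidden_params", {}).get("additional_headers", {}) or {}
        headers["x-litellm-custom-header"] = "my-value"
        if not hasattr(response, "_hidden_params"):
            response._hidden_params = {}
        response._hidden_params["additional_headers"] = headers
        return response


proxy_handler_instance = AddHeaderHandler()
```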
|
||||
|
||||
## Callback Functions
|
||||
If you just want to log on a specific event (e.g. on input) - you can use callback functions.
|
||||
|
||||
|
||||
@@ -2340,6 +2340,39 @@ response = completion(
|
||||
|
||||
Make the bedrock completion call
|
||||
|
||||
---
|
||||
|
||||
### Required AWS IAM Policy for AssumeRole
|
||||
|
||||
To use `aws_role_name` (STS AssumeRole) with LiteLLM, your IAM user or role **must** have permission to call `sts:AssumeRole` on the target role. If you see an error like:
|
||||
|
||||
```
|
||||
An error occurred (AccessDenied) when calling the AssumeRole operation: User: arn:aws:sts::...:assumed-role/litellm-ecs-task-role/... is not authorized to perform: sts:AssumeRole on resource: arn:aws:iam::...:role/Enterprise/BedrockCrossAccountConsumer
|
||||
```
|
||||
|
||||
This means the IAM identity running LiteLLM does **not** have permission to assume the target role. You must update your IAM policy to allow this action.
|
||||
|
||||
#### Example IAM Policy
|
||||
|
||||
Replace `<TARGET_ROLE_ARN>` with the ARN of the role you want to assume (e.g., `arn:aws:iam::123456789012:role/Enterprise/BedrockCrossAccountConsumer`).
|
||||
|
||||
```json
|
||||
{
  "Version": "2012-10-17",
  "Statement": [
    {
      "Effect": "Allow",
      "Action": "sts:AssumeRole",
      "Resource": "<TARGET_ROLE_ARN>"
    }
  ]
}
```
|
||||
|
||||
**Note:** The target role itself must also trust the calling IAM identity (via its trust policy) for AssumeRole to succeed. See [AWS AssumeRole docs](https://docs.aws.amazon.com/IAM/latest/UserGuide/id_roles_use_switch-role-api.html) for more details.
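To sanity-check the policy outside of LiteLLM, you can attempt the same AssumeRole call directly. A sketch using boto3, assuming the calling identity's AWS credentials are already configured in the environment:

```python
import boto3

# If this succeeds, the calling identity is allowed to assume the target role,
# and LiteLLM's `aws_role_name` flow should work with the same ARN.
sts = boto3.client("sts")
resp = sts.assume_role(
    RoleArn="arn:aws:iam::123456789012:role/Enterprise/BedrockCrossAccountConsumer",  # <TARGET_ROLE_ARN>
    RoleSessionName="litellm-assume-role-check",
)
print(resp["Credentials"]["Expiration"])
```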
|
||||
|
||||
---
|
||||
|
||||
<Tabs>
|
||||
<TabItem value="sdk" label="SDK">
|
||||
|
||||
|
||||
@@ -196,6 +196,19 @@ model_list:
|
||||
vertex_location: "us-central1"
|
||||
vertex_credentials: "/path/to/service_account.json" # [OPTIONAL] Do this OR `!gcloud auth application-default login` - run this to add vertex credentials to your env
|
||||
```
|
||||
or
|
||||
```yaml
|
||||
model_list:
  - model_name: gemini-pro
    litellm_params:
      model: vertex_ai/gemini-1.5-pro
      litellm_credential_name: vertex-global
      vertex_project: project-name-here
      vertex_location: global
      base_model: gemini
    model_info:
      provider: Vertex
```
|
||||
|
||||
2. Start Proxy
|
||||
|
||||
|
||||
@@ -958,6 +958,19 @@ curl http://localhost:4000/v1/chat/completions \
|
||||
|
||||
</Tabs>
|
||||
|
||||
|
||||
## Redis max_connections
|
||||
|
||||
You can set the `max_connections` parameter in your `cache_params` for Redis. This is passed directly to the Redis client and controls the maximum number of simultaneous connections in the pool. If you see errors like `No connection available`, try increasing this value:
|
||||
|
||||
```yaml
|
||||
litellm_settings:
  cache: true
  cache_params:
    type: redis
    max_connections: 100
```
|
||||
|
||||
## Supported `cache_params` on proxy config.yaml
|
||||
|
||||
```yaml
|
||||
@@ -966,6 +979,7 @@ cache_params:
|
||||
ttl: Optional[float]
|
||||
default_in_memory_ttl: Optional[float]
|
||||
default_in_redis_ttl: Optional[float]
|
||||
max_connections: Optional[int]
|
||||
|
||||
# Type of cache (options: "local", "redis", "s3")
|
||||
type: s3
|
||||
|
||||
@@ -50,6 +50,7 @@ litellm_settings:
|
||||
port: 6379 # The port number for the Redis cache. Required if type is "redis".
|
||||
password: "your_password" # The password for the Redis cache. Required if type is "redis".
|
||||
namespace: "litellm.caching.caching" # namespace for redis cache
|
||||
max_connections: 100 # [OPTIONAL] Set the maximum number of Redis connections. Passed directly to redis-py.
|
||||
|
||||
# Optional - Redis Cluster Settings
|
||||
redis_startup_nodes: [{"host": "127.0.0.1", "port": "7001"}]
|
||||
|
||||
@@ -1,9 +1,7 @@
|
||||
# ✨ Event Hooks for SSO Login
|
||||
|
||||
:::info
|
||||
|
||||
✨ This is an Enterprise only feature [Get Started with Enterprise here](https://www.litellm.ai/enterprise)
|
||||
|
||||
✨ SSO is free for up to 5 users. After that, an enterprise license is required. [Get Started with Enterprise here](https://www.litellm.ai/enterprise)
|
||||
:::
|
||||
|
||||
## Overview
|
||||
|
||||
@@ -84,3 +84,29 @@ LiteLLM emits the following prometheus metrics to monitor the health/status of t
|
||||
| `litellm_in_memory_spend_update_queue_size` | In-memory aggregate spend values for keys, users, teams, team members, etc.| In-Memory |
|
||||
| `litellm_redis_spend_update_queue_size` | Redis aggregate spend values for keys, users, teams, etc. | Redis |
|
||||
|
||||
|
||||
## Troubleshooting: Redis Connection Errors
|
||||
|
||||
You may see errors like:
|
||||
|
||||
```
|
||||
LiteLLM Redis Caching: async async_increment() - Got exception from REDIS No connection available., Writing value=21
|
||||
LiteLLM Redis Caching: async set_cache_pipeline() - Got exception from REDIS No connection available., Writing value=None
|
||||
```
|
||||
|
||||
This means all available Redis connections are in use, and LiteLLM cannot obtain a new connection from the pool. This can happen under high load or with many concurrent proxy requests.
|
||||
|
||||
**Solution:**
|
||||
|
||||
- Increase the `max_connections` parameter in your Redis config section in `proxy_config.yaml` to allow more simultaneous connections. For example:
|
||||
|
||||
```yaml
|
||||
litellm_settings:
  cache: True
  cache_params:
    type: redis
    max_connections: 100 # Increase as needed for your traffic
```
|
||||
|
||||
Adjust this value based on your expected concurrency and Redis server capacity.
|
||||
|
||||
|
||||
@@ -4,6 +4,10 @@ import TabItem from '@theme/TabItem';
|
||||
|
||||
# Bedrock Guardrails
|
||||
|
||||
:::tip ⚡️
|
||||
If you haven't set up or authenticated your Bedrock provider yet, see the [Bedrock Provider Setup & Authentication Guide](../../providers/bedrock.md).
|
||||
:::
|
||||
|
||||
LiteLLM supports Bedrock guardrails via the [Bedrock ApplyGuardrail API](https://docs.aws.amazon.com/bedrock/latest/APIReference/API_runtime_ApplyGuardrail.html).
|
||||
|
||||
## Quick Start
|
||||
|
||||
@@ -172,6 +172,9 @@ router_settings:
|
||||
redis_host: <your redis host>
|
||||
redis_password: <your redis password>
|
||||
redis_port: 1992
|
||||
cache_params:
|
||||
type: redis
|
||||
max_connections: 100 # maximum Redis connections in the pool; tune based on expected concurrency/load
|
||||
```
|
||||
|
||||
## Router settings on config - routing_strategy, model_group_alias
|
||||
|
||||
@@ -227,7 +227,7 @@ export PROXY_LOGOUT_URL="https://www.google.com"
|
||||
<Image img={require('../../img/ui_logout.png')} style={{ width: '400px', height: 'auto' }} />
|
||||
|
||||
|
||||
### Set max budget for internal users
|
||||
### Set default max budget for internal users
|
||||
|
||||
Automatically apply a budget to each internal user when they sign up. By default, the table is checked every 10 minutes for user budget resets. To modify this, [see this](./users.md#reset-budgets)
|
||||
|
||||
@@ -239,6 +239,10 @@ litellm_settings:
|
||||
|
||||
This sets a max budget of $10 USD for internal users when they sign up.
|
||||
|
||||
You can also manage these settings visually in the UI:
|
||||
|
||||
<Image img={require('../../img/default_user_settings_admin_ui.png')} style={{ width: '700px', height: 'auto' }} />
|
||||
|
||||
This budget only applies to personal keys created by that user - seen under `Default Team` on the UI.
|
||||
|
||||
<Image img={require('../../img/max_budget_for_internal_users.png')} style={{ width: '500px', height: 'auto' }} />
|
||||
|
||||
@@ -27,7 +27,7 @@ Email us @ krrish@berri.ai
|
||||
## Supported Models for LiteLLM Key
|
||||
These are the models that currently work with the "sk-litellm-.." keys.
|
||||
|
||||
For a complete list of models/providers that you can call with LiteLLM, [check out our provider list](./providers/)
|
||||
For a complete list of models/providers that you can call with LiteLLM, [check out our provider list](./providers/) or check out [models.litellm.ai](https://models.litellm.ai/)
|
||||
|
||||
* OpenAI models - [OpenAI docs](./providers/openai.md)
|
||||
* gpt-4
|
||||
|
||||
@@ -109,6 +109,8 @@ curl http://0.0.0.0:4000/rerank \
|
||||
|
||||
## **Supported Providers**
|
||||
|
||||
#### ⚡️See all supported models and providers at [models.litellm.ai](https://models.litellm.ai/)
|
||||
|
||||
| Provider | Link to Usage |
|
||||
|-------------|--------------------|
|
||||
| Cohere (v1 + v2 clients) | [Usage](#quick-start) |
|
||||
|
||||
@@ -3,8 +3,11 @@ import TabItem from '@theme/TabItem';
|
||||
|
||||
# /responses [Beta]
|
||||
|
||||
|
||||
LiteLLM provides a BETA endpoint in the spec of [OpenAI's `/responses` API](https://platform.openai.com/docs/api-reference/responses)
|
||||
|
||||
Requests to `/chat/completions` may be bridged here automatically when the provider lacks support for that endpoint. The model's default `mode` determines how bridging works (see `model_prices_and_context_window`).
|
||||
|
||||
| Feature | Supported | Notes |
|
||||
|---------|-----------|--------|
|
||||
| Cost Tracking | ✅ | Works with all supported models |
|
||||
@@ -78,6 +81,43 @@ print(retrieved_response)
|
||||
# retrieved_response = await litellm.aget_responses(response_id=response_id)
|
||||
```
|
||||
|
||||
#### CANCEL a Response
|
||||
You can cancel an in-progress response (if supported by the provider):
|
||||
|
||||
```python showLineNumbers title="Cancel Response by ID"
|
||||
import litellm

# First, create a response
response = litellm.responses(
    model="openai/o1-pro",
    input="Tell me a three sentence bedtime story about a unicorn.",
    max_output_tokens=100
)

# Get the response ID
response_id = response.id

# Cancel the response by ID
cancel_response = litellm.cancel_responses(
    response_id=response_id
)

print(cancel_response)

# For async usage
# cancel_response = await litellm.acancel_responses(response_id=response_id)
```
|
||||
|
||||
|
||||
**REST API:**
|
||||
```bash
|
||||
curl -X POST http://localhost:4000/v1/responses/response_id/cancel \
|
||||
-H "Authorization: Bearer sk-1234"
|
||||
```
|
||||
|
||||
This will attempt to cancel the in-progress response with the given ID.
|
||||
**Note:** Not all providers support response cancellation. If unsupported, an error will be raised.
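If you need to handle providers that don't support cancellation, wrap the call. A sketch; the broad exception handling here is an assumption and can be narrowed to the specific LiteLLM exception types you care about:

```python
import litellm

try:
    cancel_response = litellm.cancel_responses(response_id=response_id)  # response_id from the example above
    print(cancel_response)
except Exception as e:  # provider does not support cancellation, or the call failed
    print(f"Could not cancel response: {e}")
```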
|
||||
|
||||
#### DELETE a Response
|
||||
```python showLineNumbers title="Delete Response by ID"
|
||||
import litellm
|
||||
|
||||
BIN docs/my-website/img/default_user_settings_admin_ui.png (new file, binary not shown; size: 234 KiB)
@@ -57,32 +57,31 @@ const sidebars = {
|
||||
type: "category",
|
||||
label: "Alerting & Monitoring",
|
||||
items: [
|
||||
"proxy/prometheus",
|
||||
"proxy/alerting",
|
||||
"proxy/pagerduty"
|
||||
].sort()
|
||||
"proxy/pagerduty",
|
||||
"proxy/prometheus"
|
||||
]
|
||||
},
|
||||
{
|
||||
type: "category",
|
||||
label: "[Beta] Prompt Management",
|
||||
items: [
|
||||
"proxy/prompt_management",
|
||||
"proxy/custom_prompt_management",
|
||||
"proxy/native_litellm_prompt",
|
||||
"proxy/custom_prompt_management"
|
||||
].sort()
|
||||
"proxy/prompt_management"
|
||||
]
|
||||
},
|
||||
{
|
||||
type: "category",
|
||||
label: "AI Tools (OpenWebUI, Claude Code, etc.)",
|
||||
items: [
|
||||
"integrations/letta",
|
||||
"tutorials/openweb_ui",
|
||||
"tutorials/openai_codex",
|
||||
"tutorials/litellm_gemini_cli",
|
||||
"tutorials/litellm_qwen_code_cli",
|
||||
"tutorials/github_copilot_integration",
|
||||
"tutorials/claude_responses_api",
|
||||
"tutorials/cost_tracking_coding",
|
||||
"tutorials/github_copilot_integration",
|
||||
"tutorials/litellm_gemini_cli",
|
||||
"tutorials/litellm_qwen_code_cli",
|
||||
"tutorials/openai_codex",
|
||||
"tutorials/openweb_ui"
|
||||
]
|
||||
},
|
||||
|
||||
@@ -112,29 +111,115 @@ const sidebars = {
|
||||
label: "Setup & Deployment",
|
||||
items: [
|
||||
"proxy/quick_start",
|
||||
"proxy/user_onboarding",
|
||||
"proxy/deploy",
|
||||
"proxy/prod",
|
||||
"proxy/cli",
|
||||
"proxy/release_cycle",
|
||||
"proxy/model_management",
|
||||
"proxy/health",
|
||||
"proxy/debugging",
|
||||
"proxy/deploy",
|
||||
"proxy/health",
|
||||
"proxy/master_key_rotations",
|
||||
"proxy/model_management",
|
||||
"proxy/prod",
|
||||
"proxy/release_cycle",
|
||||
],
|
||||
},
|
||||
"proxy/demo",
|
||||
{
|
||||
type: "category",
|
||||
label: "Admin UI",
|
||||
items: [
|
||||
"proxy/admin_ui_sso",
|
||||
"proxy/custom_root_ui",
|
||||
"proxy/custom_sso",
|
||||
"proxy/model_hub",
|
||||
"proxy/public_teams",
|
||||
"proxy/self_serve",
|
||||
"proxy/ui",
|
||||
"proxy/ui/bulk_edit_users",
|
||||
"proxy/ui_credentials",
|
||||
"tutorials/scim_litellm",
|
||||
{
|
||||
type: "category",
|
||||
label: "UI Logs",
|
||||
items: [
|
||||
"proxy/ui_logs",
|
||||
"proxy/ui_logs_sessions"
|
||||
]
|
||||
}
|
||||
],
|
||||
},
|
||||
{
|
||||
type: "category",
|
||||
label: "Architecture",
|
||||
items: ["proxy/architecture", "proxy/control_plane_and_data_plane", "proxy/db_info", "proxy/db_deadlocks", "router_architecture", "proxy/user_management_heirarchy", "proxy/jwt_auth_arch", "proxy/image_handling", "proxy/spend_logs_deletion"],
|
||||
items: [
|
||||
"proxy/architecture",
|
||||
"proxy/control_plane_and_data_plane",
|
||||
"proxy/db_deadlocks",
|
||||
"proxy/db_info",
|
||||
"proxy/image_handling",
|
||||
"proxy/jwt_auth_arch",
|
||||
"proxy/spend_logs_deletion",
|
||||
"proxy/user_management_heirarchy",
|
||||
"router_architecture"
|
||||
],
|
||||
},
|
||||
{
|
||||
type: "link",
|
||||
label: "All Endpoints (Swagger)",
|
||||
href: "https://litellm-api.up.railway.app/",
|
||||
},
|
||||
"proxy/management_cli",
|
||||
"proxy/enterprise",
|
||||
"proxy/management_cli",
|
||||
{
|
||||
type: "category",
|
||||
label: "Authentication",
|
||||
items: [
|
||||
"proxy/virtual_keys",
|
||||
"proxy/token_auth",
|
||||
"proxy/service_accounts",
|
||||
"proxy/access_control",
|
||||
"proxy/cli_sso",
|
||||
"proxy/custom_auth",
|
||||
"proxy/ip_address",
|
||||
"proxy/email",
|
||||
"proxy/multiple_admins",
|
||||
],
|
||||
},
|
||||
{
|
||||
type: "category",
|
||||
label: "Budgets + Rate Limits",
|
||||
items: [
|
||||
"proxy/customers",
|
||||
"proxy/dynamic_rate_limit",
|
||||
"proxy/rate_limit_tiers",
|
||||
"proxy/team_budgets",
|
||||
"proxy/temporary_budget_increase",
|
||||
"proxy/users"
|
||||
],
|
||||
},
|
||||
"proxy/caching",
|
||||
{
|
||||
type: "category",
|
||||
label: "Create Custom Plugins",
|
||||
description: "Modify requests, responses, and more",
|
||||
items: [
|
||||
"proxy/call_hooks",
|
||||
"proxy/rules",
|
||||
]
|
||||
},
|
||||
{
|
||||
type: "link",
|
||||
label: "Load Balancing, Routing, Fallbacks",
|
||||
href: "https://docs.litellm.ai/docs/routing-load-balancing",
|
||||
},
|
||||
{
|
||||
type: "category",
|
||||
label: "Logging, Alerting, Metrics",
|
||||
items: [
|
||||
"proxy/dynamic_logging",
|
||||
"proxy/logging",
|
||||
"proxy/logging_spec",
|
||||
"proxy/team_logging"
|
||||
],
|
||||
},
|
||||
{
|
||||
type: "category",
|
||||
label: "Making LLM Requests",
|
||||
@@ -147,19 +232,6 @@ const sidebars = {
|
||||
"proxy/model_discovery",
|
||||
],
|
||||
},
|
||||
{
|
||||
type: "category",
|
||||
label: "Authentication",
|
||||
items: [
|
||||
"proxy/virtual_keys",
|
||||
"proxy/token_auth",
|
||||
"proxy/service_accounts",
|
||||
"proxy/access_control",
|
||||
"proxy/ip_address",
|
||||
"proxy/email",
|
||||
"proxy/custom_auth",
|
||||
],
|
||||
},
|
||||
{
|
||||
type: "category",
|
||||
label: "Model Access",
|
||||
@@ -168,73 +240,6 @@ const sidebars = {
|
||||
"proxy/team_model_add"
|
||||
]
|
||||
},
|
||||
{
|
||||
type: "category",
|
||||
label: "Spend Tracking",
|
||||
items: ["proxy/cost_tracking", "proxy/custom_pricing", "proxy/billing",],
|
||||
},
|
||||
{
|
||||
type: "category",
|
||||
label: "Budgets + Rate Limits",
|
||||
items: ["proxy/users", "proxy/temporary_budget_increase", "proxy/rate_limit_tiers", "proxy/team_budgets", "proxy/dynamic_rate_limit", "proxy/customers"],
|
||||
},
|
||||
{
|
||||
type: "category",
|
||||
label: "Enterprise Features",
|
||||
items: [
|
||||
"proxy/enterprise",
|
||||
{
|
||||
type: "category",
|
||||
label: "Admin UI",
|
||||
items: [
|
||||
"proxy/ui",
|
||||
"proxy/admin_ui_sso",
|
||||
"proxy/custom_root_ui",
|
||||
"proxy/model_hub",
|
||||
"proxy/self_serve",
|
||||
"proxy/public_teams",
|
||||
"proxy/ui_credentials",
|
||||
"proxy/ui/bulk_edit_users",
|
||||
{
|
||||
type: "category",
|
||||
label: "UI Logs",
|
||||
items: [
|
||||
"proxy/ui_logs",
|
||||
"proxy/ui_logs_sessions"
|
||||
]
|
||||
}
|
||||
],
|
||||
},
|
||||
{
|
||||
type: "category",
|
||||
label: "SSO & Identity Management",
|
||||
items: [
|
||||
"proxy/cli_sso",
|
||||
"proxy/admin_ui_sso",
|
||||
"proxy/custom_sso",
|
||||
"tutorials/scim_litellm",
|
||||
"tutorials/msft_sso",
|
||||
"proxy/multiple_admins",
|
||||
],
|
||||
},
|
||||
],
|
||||
},
|
||||
{
|
||||
type: "link",
|
||||
label: "Load Balancing, Routing, Fallbacks",
|
||||
href: "https://docs.litellm.ai/docs/routing-load-balancing",
|
||||
},
|
||||
{
|
||||
type: "category",
|
||||
label: "Logging, Alerting, Metrics",
|
||||
items: [
|
||||
"proxy/logging",
|
||||
"proxy/logging_spec",
|
||||
"proxy/team_logging",
|
||||
"proxy/dynamic_logging"
|
||||
],
|
||||
},
|
||||
|
||||
{
|
||||
type: "category",
|
||||
label: "Secret Managers",
|
||||
@@ -245,14 +250,13 @@ const sidebars = {
|
||||
},
|
||||
{
|
||||
type: "category",
|
||||
label: "Create Custom Plugins",
|
||||
description: "Modify requests, responses, and more",
|
||||
label: "Spend Tracking",
|
||||
items: [
|
||||
"proxy/call_hooks",
|
||||
"proxy/rules",
|
||||
]
|
||||
"proxy/billing",
|
||||
"proxy/cost_tracking",
|
||||
"proxy/custom_pricing"
|
||||
],
|
||||
},
|
||||
"proxy/caching",
|
||||
]
|
||||
},
|
||||
{
|
||||
@@ -266,13 +270,11 @@ const sidebars = {
|
||||
slug: "/supported_endpoints",
|
||||
},
|
||||
items: [
|
||||
"anthropic_unified",
|
||||
"apply_guardrail",
|
||||
"assistants",
|
||||
{
|
||||
type: "category",
|
||||
label: "/audio",
|
||||
"items": [
|
||||
items: [
|
||||
"audio_transcription",
|
||||
"text_to_speech",
|
||||
]
|
||||
@@ -301,6 +303,7 @@ const sidebars = {
|
||||
"completion/http_handler_config",
|
||||
],
|
||||
},
|
||||
"text_completion",
|
||||
"embedding/supported_embedding",
|
||||
{
|
||||
type: "category",
|
||||
@@ -318,13 +321,14 @@ const sidebars = {
|
||||
"proxy/managed_finetuning",
|
||||
]
|
||||
},
|
||||
"generateContent",
|
||||
"generateContent",
|
||||
"apply_guardrail",
|
||||
{
|
||||
type: "category",
|
||||
label: "/images",
|
||||
items: [
|
||||
"image_generation",
|
||||
"image_edits",
|
||||
"image_generation",
|
||||
"image_variations",
|
||||
]
|
||||
},
|
||||
@@ -335,23 +339,23 @@ const sidebars = {
|
||||
label: "Pass-through Endpoints (Anthropic SDK, etc.)",
|
||||
items: [
|
||||
"pass_through/intro",
|
||||
"pass_through/vertex_ai",
|
||||
"pass_through/google_ai_studio",
|
||||
"pass_through/anthropic_completion",
|
||||
"pass_through/assembly_ai",
|
||||
"pass_through/bedrock",
|
||||
"pass_through/cohere",
|
||||
"pass_through/vllm",
|
||||
"pass_through/google_ai_studio",
|
||||
"pass_through/langfuse",
|
||||
"pass_through/mistral",
|
||||
"pass_through/openai_passthrough",
|
||||
"pass_through/anthropic_completion",
|
||||
"pass_through/bedrock",
|
||||
"pass_through/assembly_ai",
|
||||
"pass_through/langfuse",
|
||||
"proxy/pass_through",
|
||||
],
|
||||
"pass_through/vertex_ai",
|
||||
"pass_through/vllm",
|
||||
"proxy/pass_through"
|
||||
]
|
||||
},
|
||||
"realtime",
|
||||
"rerank",
|
||||
"response_api",
|
||||
"text_completion",
|
||||
"anthropic_unified",
|
||||
{
|
||||
type: "category",
|
||||
label: "/vector_stores",
|
||||
@@ -398,7 +402,6 @@ const sidebars = {
|
||||
items: [
|
||||
"providers/azure_ai",
|
||||
"providers/azure_ai_img",
|
||||
"providers/azure_ai_img_edit",
|
||||
]
|
||||
},
|
||||
{
|
||||
@@ -515,33 +518,32 @@ const sidebars = {
|
||||
type: "category",
|
||||
label: "Guides",
|
||||
items: [
|
||||
"exception_mapping",
|
||||
"completion/audio",
|
||||
"completion/batching",
|
||||
"completion/computer_use",
|
||||
"completion/document_understanding",
|
||||
"completion/drop_params",
|
||||
"completion/function_call",
|
||||
"completion/image_generation_chat",
|
||||
"completion/json_mode",
|
||||
"completion/knowledgebase",
|
||||
"completion/message_trimming",
|
||||
"completion/model_alias",
|
||||
"completion/mock_requests",
|
||||
"completion/predict_outputs",
|
||||
"completion/prefix",
|
||||
"completion/prompt_caching",
|
||||
"completion/prompt_formatting",
|
||||
"completion/reliable_completions",
|
||||
"completion/stream",
|
||||
"completion/provider_specific_params",
|
||||
"completion/vision",
|
||||
"completion/web_search",
|
||||
"exception_mapping",
|
||||
"guides/finetuned_models",
|
||||
"guides/security_settings",
|
||||
"completion/audio",
|
||||
"completion/image_generation_chat",
|
||||
"completion/web_search",
|
||||
"completion/document_understanding",
|
||||
"completion/vision",
|
||||
"completion/json_mode",
|
||||
"reasoning_content",
|
||||
"completion/computer_use",
|
||||
"completion/prompt_caching",
|
||||
"completion/predict_outputs",
|
||||
"completion/knowledgebase",
|
||||
"completion/prefix",
|
||||
"completion/drop_params",
|
||||
"completion/prompt_formatting",
|
||||
"completion/stream",
|
||||
"completion/message_trimming",
|
||||
"completion/function_call",
|
||||
"completion/model_alias",
|
||||
"completion/batching",
|
||||
"completion/mock_requests",
|
||||
"completion/reliable_completions",
|
||||
"proxy/veo_video_generation",
|
||||
|
||||
"reasoning_content"
|
||||
]
|
||||
},
|
||||
|
||||
@@ -554,26 +556,35 @@ const sidebars = {
|
||||
description: "Learn how to load balance, route, and set fallbacks for your LLM requests",
|
||||
slug: "/routing-load-balancing",
|
||||
},
|
||||
items: ["routing", "scheduler", "proxy/load_balancing", "proxy/reliability", "proxy/timeout", "proxy/auto_routing", "proxy/tag_routing", "proxy/provider_budget_routing", "wildcard_routing"],
|
||||
items: [
|
||||
"routing",
|
||||
"scheduler",
|
||||
"proxy/auto_routing",
|
||||
"proxy/load_balancing",
|
||||
"proxy/provider_budget_routing",
|
||||
"proxy/reliability",
|
||||
"proxy/tag_routing",
|
||||
"proxy/timeout",
|
||||
"wildcard_routing"
|
||||
],
|
||||
},
|
||||
{
|
||||
type: "category",
|
||||
label: "LiteLLM Python SDK",
|
||||
items: [
|
||||
"set_keys",
|
||||
"completion/token_usage",
|
||||
"sdk/headers",
|
||||
"sdk_custom_pricing",
|
||||
"embedding/async_embedding",
|
||||
"embedding/moderation",
|
||||
"budget_manager",
|
||||
"caching/all_caches",
|
||||
"completion/token_usage",
|
||||
"embedding/async_embedding",
|
||||
"embedding/moderation",
|
||||
"migration",
|
||||
"sdk_custom_pricing",
|
||||
{
|
||||
type: "category",
|
||||
label: "LangChain, LlamaIndex, Instructor Integration",
|
||||
items: ["langchain/langchain", "tutorials/instructor"],
|
||||
},
|
||||
}
|
||||
],
|
||||
},
|
||||
|
||||
|
||||
@@ -12,6 +12,11 @@
|
||||
- For installation and configuration, see: [Self-hosting guide](https://docs.litellm.ai/docs/proxy/deploy)
|
||||
- **Telemetry**: We run no telemetry when you self-host LiteLLM
|
||||
|
||||
|
||||
:::info
|
||||
✨ SSO is free for up to 5 users. After that, an enterprise license is required. [Get Started with Enterprise here](https://www.litellm.ai/enterprise)
|
||||
:::
|
||||
|
||||
### LiteLLM Cloud
|
||||
|
||||
- We encrypt all data stored using your `LITELLM_MASTER_KEY` and in transit using TLS.
|
||||
|
||||