[Feat] Add cost tracking for Search API requests - Google PSE, Tavily, Parallel AI, Exa AI (#15821)

* add search cost tracking

* add cost tracking for tavily tiers

* add search to call types

* add search_provider_cost_per_query

* add cost tracking for search APIs

* add cost tracking search APIs

* docs cost tracking search

* docs search

* fix linting
This commit is contained in:
Ishaan Jaff
2025-10-22 17:29:09 -07:00
committed by GitHub
parent 143e314dda
commit 3e4b5ef3a5
20 changed files with 972 additions and 764 deletions


@@ -1,753 +0,0 @@
# /search
| Feature | Supported |
|---------|-----------|
| Supported Providers | `perplexity`, `tavily`, `parallel_ai`, `exa_ai`, `google_pse`, `dataforseo` |
| Cost Tracking | ❌ |
| Logging | ✅ |
| Load Balancing | ❌ |
:::tip
LiteLLM follows the [Perplexity API request/response for the Search API](https://docs.perplexity.ai/api-reference/search-post)
:::
:::info
Supported from LiteLLM v1.78.7+
:::
## **LiteLLM Python SDK Usage**
### Quick Start
```python showLineNumbers title="Basic Search"
from litellm import search
import os
os.environ["PERPLEXITYAI_API_KEY"] = "pplx-..."
response = search(
query="latest AI developments in 2024",
search_provider="perplexity",
max_results=5
)
# Access search results
for result in response.results:
print(f"{result.title}: {result.url}")
print(f"Snippet: {result.snippet}\n")
```
### Async Usage
```python showLineNumbers title="Async Search"
from litellm import asearch
import os, asyncio
os.environ["PERPLEXITYAI_API_KEY"] = "pplx-..."
async def search_async():
response = await asearch(
query="machine learning research papers",
search_provider="perplexity",
max_results=10,
search_domain_filter=["arxiv.org", "nature.com"]
)
# Access search results
for result in response.results:
print(f"{result.title}: {result.url}")
print(f"Snippet: {result.snippet}")
asyncio.run(search_async())
```
### Optional Parameters
```python showLineNumbers title="Search with Options"
response = search(
query="AI developments",
search_provider="perplexity",
# Unified parameters (work across all providers)
max_results=10, # Maximum number of results (1-20)
search_domain_filter=["arxiv.org"], # Filter to specific domains
country="US", # Country code filter
max_tokens_per_page=1024 # Max tokens per page
)
```
## **LiteLLM AI Gateway Usage**
LiteLLM provides a Perplexity-compatible `/search` endpoint for search calls.
**Setup**
Add this to your LiteLLM proxy `config.yaml`:
```yaml showLineNumbers title="config.yaml"
model_list:
- model_name: gpt-4
litellm_params:
model: gpt-4
api_key: os.environ/OPENAI_API_KEY
search_tools:
- search_tool_name: perplexity-search
litellm_params:
search_provider: perplexity
api_key: os.environ/PERPLEXITYAI_API_KEY
- search_tool_name: tavily-search
litellm_params:
search_provider: tavily
api_key: os.environ/TAVILY_API_KEY
```
Start the LiteLLM proxy:
```bash
litellm --config /path/to/config.yaml
# RUNNING on http://0.0.0.0:4000
```
### Test Request
**Option 1: Search tool name in URL (Recommended - keeps body Perplexity-compatible)**
```bash showLineNumbers title="cURL Request"
curl http://0.0.0.0:4000/v1/search/perplexity-search \
-H "Authorization: Bearer sk-1234" \
-H "Content-Type: application/json" \
-d '{
"query": "latest AI developments 2024",
"max_results": 5,
"search_domain_filter": ["arxiv.org", "nature.com"],
"country": "US"
}'
```
**Option 2: Search tool name in body**
```bash showLineNumbers title="cURL Request with search_tool_name in body"
curl http://0.0.0.0:4000/v1/search \
-H "Authorization: Bearer sk-1234" \
-H "Content-Type: application/json" \
-d '{
"search_tool_name": "perplexity-search",
"query": "latest AI developments 2024",
"max_results": 5
}'
```
### Load Balancing
Configure multiple search providers for automatic load balancing and fallbacks:
```yaml showLineNumbers title="config.yaml with load balancing"
search_tools:
- search_tool_name: my-search
litellm_params:
search_provider: perplexity
api_key: os.environ/PERPLEXITYAI_API_KEY
- search_tool_name: my-search
litellm_params:
search_provider: tavily
api_key: os.environ/TAVILY_API_KEY
- search_tool_name: my-search
litellm_params:
search_provider: exa_ai
api_key: os.environ/EXA_API_KEY
router_settings:
routing_strategy: simple-shuffle # or 'least-busy', 'latency-based-routing'
```
Test with load balancing:
```bash
curl http://0.0.0.0:4000/v1/search/my-search \
-H "Authorization: Bearer sk-1234" \
-H "Content-Type: application/json" \
-d '{
"query": "AI developments",
"max_results": 10
}'
```
## **Request/Response Format**
:::info
LiteLLM follows the **Perplexity Search API specification**.
See the [official Perplexity Search documentation](https://docs.perplexity.ai/api-reference/search-post) for complete details.
:::
### Example Request
```json showLineNumbers title="Search Request"
{
"query": "latest AI developments 2024",
"max_results": 10,
"search_domain_filter": ["arxiv.org", "nature.com"],
"country": "US",
"max_tokens_per_page": 1024
}
```
### Request Parameters
| Parameter | Type | Required | Description |
|-----------|------|----------|-------------|
| `query` | string or array | Yes | Search query. Can be a single string or array of strings |
| `search_provider` | string | Yes (SDK) | The search provider to use: `"perplexity"`, `"tavily"`, `"parallel_ai"`, `"exa_ai"`, `"google_pse"`, or `"dataforseo"` |
| `search_tool_name` | string | Yes (Proxy) | Name of the search tool configured in `config.yaml` |
| `max_results` | integer | No | Maximum number of results to return (1-20). Default: 10 |
| `search_domain_filter` | array | No | List of domains to filter results (max 20 domains) |
| `max_tokens_per_page` | integer | No | Maximum tokens per page to process. Default: 1024 |
| `country` | string | No | Country code filter (e.g., `"US"`, `"GB"`, `"DE"`) |
**Query Format Examples:**
```python
# Single query
query = "AI developments"
# Multiple queries
query = ["AI developments", "machine learning trends"]
```
### Response Format
The response follows Perplexity's search format with the following structure:
```json showLineNumbers title="Search Response"
{
"object": "search",
"results": [
{
"title": "Latest Advances in Artificial Intelligence",
"url": "https://arxiv.org/paper/example",
"snippet": "This paper discusses recent developments in AI...",
"date": "2024-01-15"
},
{
"title": "Machine Learning Breakthroughs",
"url": "https://nature.com/articles/ml-breakthrough",
"snippet": "Researchers have achieved new milestones...",
"date": "2024-01-10"
}
]
}
```
#### Response Fields
| Field | Type | Description |
|-------|------|-------------|
| `object` | string | Always `"search"` for search responses |
| `results` | array | List of search results |
| `results[].title` | string | Title of the search result |
| `results[].url` | string | URL of the search result |
| `results[].snippet` | string | Text snippet from the result |
| `results[].date` | string | Optional publication or last updated date |
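Because `results[].date` is optional, consumers should not assume it is present. A minimal sketch of reading these fields from the raw JSON payload (plain Python over the example response above, not the SDK's typed response object) might look like:

```python
# Sample payload mirroring the "Search Response" example above.
# In the second result, "date" is deliberately absent to show the optional field.
response = {
    "object": "search",
    "results": [
        {
            "title": "Latest Advances in Artificial Intelligence",
            "url": "https://arxiv.org/paper/example",
            "snippet": "This paper discusses recent developments in AI...",
            "date": "2024-01-15",
        },
        {
            "title": "Machine Learning Breakthroughs",
            "url": "https://nature.com/articles/ml-breakthrough",
            "snippet": "Researchers have achieved new milestones...",
        },
    ],
}

def format_results(payload: dict) -> list[str]:
    """Render each result as 'title (url) [date]'; falls back when date is missing."""
    lines = []
    for r in payload.get("results", []):
        date = r.get("date", "n.d.")  # "date" is optional per the field table
        lines.append(f"{r['title']} ({r['url']}) [{date}]")
    return lines

for line in format_results(response):
    print(line)
```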
## **Supported Providers**
| Provider | Environment Variable | `search_provider` Value |
|----------|---------------------|------------------------|
| Perplexity AI | `PERPLEXITYAI_API_KEY` | `perplexity` |
| Tavily | `TAVILY_API_KEY` | `tavily` |
| Exa AI | `EXA_API_KEY` | `exa_ai` |
| Parallel AI | `PARALLEL_AI_API_KEY` | `parallel_ai` |
| Google PSE | `GOOGLE_PSE_API_KEY`, `GOOGLE_PSE_ENGINE_ID` | `google_pse` |
| DataForSEO | `DATAFORSEO_LOGIN`, `DATAFORSEO_PASSWORD` | `dataforseo` |
### Perplexity AI
**Get API Key:** [https://www.perplexity.ai/settings/api](https://www.perplexity.ai/settings/api)
#### LiteLLM Python SDK
```python showLineNumbers title="Perplexity Search"
import os
from litellm import search
os.environ["PERPLEXITYAI_API_KEY"] = "pplx-..."
response = search(
query="latest AI developments",
search_provider="perplexity",
max_results=5
)
```
#### LiteLLM AI Gateway
**1. Setup config.yaml**
```yaml showLineNumbers title="config.yaml"
model_list:
- model_name: gpt-4
litellm_params:
model: gpt-4
api_key: os.environ/OPENAI_API_KEY
search_tools:
- search_tool_name: perplexity-search
litellm_params:
search_provider: perplexity
api_key: os.environ/PERPLEXITYAI_API_KEY
```
**2. Start the proxy**
```bash
litellm --config /path/to/config.yaml
# RUNNING on http://0.0.0.0:4000
```
**3. Test the search endpoint**
```bash showLineNumbers title="Test Request"
curl http://0.0.0.0:4000/v1/search/perplexity-search \
-H "Authorization: Bearer sk-1234" \
-H "Content-Type: application/json" \
-d '{
"query": "latest AI developments",
"max_results": 5
}'
```
### Tavily
**Get API Key:** [https://tavily.com](https://tavily.com)
#### LiteLLM Python SDK
```python showLineNumbers title="Tavily Search"
import os
from litellm import search
os.environ["TAVILY_API_KEY"] = "tvly-..."
response = search(
query="latest AI developments",
search_provider="tavily",
max_results=5
)
```
#### LiteLLM AI Gateway
**1. Setup config.yaml**
```yaml showLineNumbers title="config.yaml"
model_list:
- model_name: gpt-4
litellm_params:
model: gpt-4
api_key: os.environ/OPENAI_API_KEY
search_tools:
- search_tool_name: tavily-search
litellm_params:
search_provider: tavily
api_key: os.environ/TAVILY_API_KEY
```
**2. Start the proxy**
```bash
litellm --config /path/to/config.yaml
# RUNNING on http://0.0.0.0:4000
```
**3. Test the search endpoint**
```bash showLineNumbers title="Test Request"
curl http://0.0.0.0:4000/v1/search/tavily-search \
-H "Authorization: Bearer sk-1234" \
-H "Content-Type: application/json" \
-d '{
"query": "latest AI developments",
"max_results": 5
}'
```
### Exa AI
**Get API Key:** [https://exa.ai](https://exa.ai)
#### LiteLLM Python SDK
```python showLineNumbers title="Exa AI Search"
import os
from litellm import search
os.environ["EXA_API_KEY"] = "exa-..."
response = search(
query="latest AI developments",
search_provider="exa_ai",
max_results=5
)
```
#### LiteLLM AI Gateway
**1. Setup config.yaml**
```yaml showLineNumbers title="config.yaml"
model_list:
- model_name: gpt-4
litellm_params:
model: gpt-4
api_key: os.environ/OPENAI_API_KEY
search_tools:
- search_tool_name: exa-search
litellm_params:
search_provider: exa_ai
api_key: os.environ/EXA_API_KEY
```
**2. Start the proxy**
```bash
litellm --config /path/to/config.yaml
# RUNNING on http://0.0.0.0:4000
```
**3. Test the search endpoint**
```bash showLineNumbers title="Test Request"
curl http://0.0.0.0:4000/v1/search/exa-search \
-H "Authorization: Bearer sk-1234" \
-H "Content-Type: application/json" \
-d '{
"query": "latest AI developments",
"max_results": 5
}'
```
### Parallel AI
**Get API Key:** [https://www.parallel.ai](https://www.parallel.ai)
#### LiteLLM Python SDK
```python showLineNumbers title="Parallel AI Search"
import os
from litellm import search
os.environ["PARALLEL_AI_API_KEY"] = "..."
response = search(
query="latest AI developments",
search_provider="parallel_ai",
max_results=5
)
```
#### LiteLLM AI Gateway
**1. Setup config.yaml**
```yaml showLineNumbers title="config.yaml"
model_list:
- model_name: gpt-4
litellm_params:
model: gpt-4
api_key: os.environ/OPENAI_API_KEY
search_tools:
- search_tool_name: parallel-search
litellm_params:
search_provider: parallel_ai
api_key: os.environ/PARALLEL_AI_API_KEY
```
**2. Start the proxy**
```bash
litellm --config /path/to/config.yaml
# RUNNING on http://0.0.0.0:4000
```
**3. Test the search endpoint**
```bash showLineNumbers title="Test Request"
curl http://0.0.0.0:4000/v1/search/parallel-search \
-H "Authorization: Bearer sk-1234" \
-H "Content-Type: application/json" \
-d '{
"query": "latest AI developments",
"max_results": 5
}'
```
### Google Programmable Search Engine (PSE)
**Get API Key:** [Google Cloud Console](https://console.cloud.google.com/apis/credentials)
**Create Search Engine:** [Programmable Search Engine](https://programmablesearchengine.google.com/)
#### Setup
1. Go to [Google Developers Programmable Search Engine](https://programmablesearchengine.google.com/) and log in or create an account
2. Click the **Add** button in the control panel
3. Enter a search engine name and configure properties:
- Choose which sites to search (entire web or specific sites)
- Set language and other preferences
- Verify you're not a robot
4. Click **Create** button
5. Once created, you'll see:
- **Search engine ID (cx)** - Copy this for `GOOGLE_PSE_ENGINE_ID`
- Instructions to get your API key
6. Generate API key:
- Go to [Google Cloud Console - Credentials](https://console.cloud.google.com/apis/credentials)
 - Create a new API key or use an existing one
- Enable **Custom Search API** for your project
- Copy the API key for `GOOGLE_PSE_API_KEY`
#### LiteLLM Python SDK
```python showLineNumbers title="Google PSE Search"
import os
from litellm import search
os.environ["GOOGLE_PSE_API_KEY"] = "AIza..."
os.environ["GOOGLE_PSE_ENGINE_ID"] = "your-search-engine-id"
response = search(
query="latest AI developments",
search_provider="google_pse",
max_results=10
)
```
#### LiteLLM AI Gateway
**1. Setup config.yaml**
```yaml showLineNumbers title="config.yaml"
model_list:
- model_name: gpt-4
litellm_params:
model: gpt-4
api_key: os.environ/OPENAI_API_KEY
search_tools:
- search_tool_name: google-search
litellm_params:
search_provider: google_pse
api_key: os.environ/GOOGLE_PSE_API_KEY
search_engine_id: os.environ/GOOGLE_PSE_ENGINE_ID
```
**2. Start the proxy**
```bash
litellm --config /path/to/config.yaml
# RUNNING on http://0.0.0.0:4000
```
**3. Test the search endpoint**
```bash showLineNumbers title="Test Request"
curl http://0.0.0.0:4000/v1/search/google-search \
-H "Authorization: Bearer sk-1234" \
-H "Content-Type: application/json" \
-d '{
"query": "latest AI developments",
"max_results": 10
}'
```
### DataForSEO
**Get API Access:** [DataForSEO](https://dataforseo.com/)
#### Setup
1. Go to [DataForSEO](https://dataforseo.com/) and create an account
2. Navigate to your account dashboard
3. Generate API credentials:
- You'll receive a **login** (username)
- You'll receive a **password**
4. Set up your environment variables:
- `DATAFORSEO_LOGIN` - Your DataForSEO login/username
- `DATAFORSEO_PASSWORD` - Your DataForSEO password
#### LiteLLM Python SDK
```python showLineNumbers title="DataForSEO Search"
import os
from litellm import search
os.environ["DATAFORSEO_LOGIN"] = "your-login"
os.environ["DATAFORSEO_PASSWORD"] = "your-password"
response = search(
query="latest AI developments",
search_provider="dataforseo",
max_results=10
)
```
#### LiteLLM AI Gateway
**1. Setup config.yaml**
```yaml showLineNumbers title="config.yaml"
model_list:
- model_name: gpt-4
litellm_params:
model: gpt-4
api_key: os.environ/OPENAI_API_KEY
search_tools:
- search_tool_name: dataforseo-search
litellm_params:
search_provider: dataforseo
api_key: "os.environ/DATAFORSEO_LOGIN:os.environ/DATAFORSEO_PASSWORD"
```
**2. Start the proxy**
```bash
litellm --config /path/to/config.yaml
# RUNNING on http://0.0.0.0:4000
```
**3. Test the search endpoint**
```bash showLineNumbers title="Test Request"
curl http://0.0.0.0:4000/v1/search/dataforseo-search \
-H "Authorization: Bearer sk-1234" \
-H "Content-Type: application/json" \
-d '{
"query": "latest AI developments",
"max_results": 10
}'
```
## Provider-specific Parameters
All providers support provider-specific parameters; pass them in the request body alongside the unified parameters and they are forwarded to the provider.
#### Tavily Search
```python showLineNumbers title="Tavily Search"
import os
from litellm import search
os.environ["TAVILY_API_KEY"] = "tvly-..."
response = search(
query="latest tech news",
search_provider="tavily",
max_results=5,
# Tavily-specific parameters
topic="news", # 'general', 'news', 'finance'
search_depth="advanced", # 'basic', 'advanced'
include_answer=True, # Include AI-generated answer
include_raw_content=True # Include raw HTML content
)
```
#### Exa AI Search
```python showLineNumbers title="Exa AI Search"
import os
from litellm import search
os.environ["EXA_API_KEY"] = "exa-..."
response = search(
query="AI research papers",
search_provider="exa_ai",
max_results=10,
search_domain_filter=["arxiv.org"],
# Exa-specific parameters
type="neural", # 'neural', 'keyword', or 'auto'
contents={"text": True}, # Request text content
use_autoprompt=True # Enable Exa's autoprompt
)
```
#### Parallel AI Search
```python showLineNumbers title="Parallel AI Search"
import os
from litellm import search
os.environ["PARALLEL_AI_API_KEY"] = "..."
response = search(
query="latest developments in quantum computing",
search_provider="parallel_ai",
max_results=5,
# Parallel AI-specific parameters
processor="pro", # 'base' or 'pro'
max_chars_per_result=500 # Max characters per result
)
```
#### Google PSE Search
```python showLineNumbers title="Google PSE Search"
import os
from litellm import search
os.environ["GOOGLE_PSE_API_KEY"] = "AIza..."
os.environ["GOOGLE_PSE_ENGINE_ID"] = "your-search-engine-id"
response = search(
query="latest AI research papers",
search_provider="google_pse",
max_results=10,
search_domain_filter=["arxiv.org"],
# Google PSE-specific parameters (use actual Google PSE API parameter names)
dateRestrict="m6", # 'm6' = last 6 months, 'd7' = last 7 days
lr="lang_en", # Language restriction (e.g., 'lang_en', 'lang_es')
safe="active", # Search safety level ('active' or 'off')
exactTerms="machine learning", # Phrase that all documents must contain
fileType="pdf" # File type to restrict results to
)
```
#### DataForSEO Search
```python showLineNumbers title="DataForSEO Search"
import os
from litellm import search
os.environ["DATAFORSEO_LOGIN"] = "your-login"
os.environ["DATAFORSEO_PASSWORD"] = "your-password"
response = search(
query="AI developments",
search_provider="dataforseo",
max_results=10,
# DataForSEO-specific parameters
country="United States", # Country name for location_name
language_code="en", # Language code
depth=20, # Number of results (max 700)
device="desktop", # Device type ('desktop', 'mobile', 'tablet')
os="windows" # Operating system
)
```


@@ -0,0 +1,91 @@
# DataForSEO Search
**Get API Access:** [DataForSEO](https://dataforseo.com/)
## Setup
1. Go to [DataForSEO](https://dataforseo.com/) and create an account
2. Navigate to your account dashboard
3. Generate API credentials:
- You'll receive a **login** (username)
- You'll receive a **password**
4. Set up your environment variables:
- `DATAFORSEO_LOGIN` - Your DataForSEO login/username
- `DATAFORSEO_PASSWORD` - Your DataForSEO password
## LiteLLM Python SDK
```python showLineNumbers title="DataForSEO Search"
import os
from litellm import search
os.environ["DATAFORSEO_LOGIN"] = "your-login"
os.environ["DATAFORSEO_PASSWORD"] = "your-password"
response = search(
query="latest AI developments",
search_provider="dataforseo",
max_results=10
)
```
## LiteLLM AI Gateway
### 1. Setup config.yaml
```yaml showLineNumbers title="config.yaml"
model_list:
- model_name: gpt-4
litellm_params:
model: gpt-4
api_key: os.environ/OPENAI_API_KEY
search_tools:
- search_tool_name: dataforseo-search
litellm_params:
search_provider: dataforseo
api_key: "os.environ/DATAFORSEO_LOGIN:os.environ/DATAFORSEO_PASSWORD"
```
### 2. Start the proxy
```bash
litellm --config /path/to/config.yaml
# RUNNING on http://0.0.0.0:4000
```
### 3. Test the search endpoint
```bash showLineNumbers title="Test Request"
curl http://0.0.0.0:4000/v1/search/dataforseo-search \
-H "Authorization: Bearer sk-1234" \
-H "Content-Type: application/json" \
-d '{
"query": "latest AI developments",
"max_results": 10
}'
```
## Provider-specific Parameters
```python showLineNumbers title="DataForSEO Search with Provider-specific Parameters"
import os
from litellm import search
os.environ["DATAFORSEO_LOGIN"] = "your-login"
os.environ["DATAFORSEO_PASSWORD"] = "your-password"
response = search(
query="AI developments",
search_provider="dataforseo",
max_results=10,
# DataForSEO-specific parameters
country="United States", # Country name for location_name
language_code="en", # Language code
depth=20, # Number of results (max 700)
device="desktop", # Device type ('desktop', 'mobile', 'tablet')
os="windows" # Operating system
)
```


@@ -0,0 +1,77 @@
# Exa AI Search
**Get API Key:** [https://exa.ai](https://exa.ai)
## LiteLLM Python SDK
```python showLineNumbers title="Exa AI Search"
import os
from litellm import search
os.environ["EXA_API_KEY"] = "exa-..."
response = search(
query="latest AI developments",
search_provider="exa_ai",
max_results=5
)
```
## LiteLLM AI Gateway
### 1. Setup config.yaml
```yaml showLineNumbers title="config.yaml"
model_list:
- model_name: gpt-4
litellm_params:
model: gpt-4
api_key: os.environ/OPENAI_API_KEY
search_tools:
- search_tool_name: exa-search
litellm_params:
search_provider: exa_ai
api_key: os.environ/EXA_API_KEY
```
### 2. Start the proxy
```bash
litellm --config /path/to/config.yaml
# RUNNING on http://0.0.0.0:4000
```
### 3. Test the search endpoint
```bash showLineNumbers title="Test Request"
curl http://0.0.0.0:4000/v1/search/exa-search \
-H "Authorization: Bearer sk-1234" \
-H "Content-Type: application/json" \
-d '{
"query": "latest AI developments",
"max_results": 5
}'
```
## Provider-specific Parameters
```python showLineNumbers title="Exa AI Search with Provider-specific Parameters"
import os
from litellm import search
os.environ["EXA_API_KEY"] = "exa-..."
response = search(
query="AI research papers",
search_provider="exa_ai",
max_results=10,
search_domain_filter=["arxiv.org"],
# Exa-specific parameters
type="neural", # 'neural', 'keyword', or 'auto'
contents={"text": True}, # Request text content
use_autoprompt=True # Enable Exa's autoprompt
)
```


@@ -0,0 +1,101 @@
# Google Programmable Search Engine (PSE)
**Get API Key:** [Google Cloud Console](https://console.cloud.google.com/apis/credentials)
**Create Search Engine:** [Programmable Search Engine](https://programmablesearchengine.google.com/)
## Setup
1. Go to [Google Developers Programmable Search Engine](https://programmablesearchengine.google.com/) and log in or create an account
2. Click the **Add** button in the control panel
3. Enter a search engine name and configure properties:
- Choose which sites to search (entire web or specific sites)
- Set language and other preferences
- Verify you're not a robot
4. Click **Create** button
5. Once created, you'll see:
- **Search engine ID (cx)** - Copy this for `GOOGLE_PSE_ENGINE_ID`
- Instructions to get your API key
6. Generate API key:
- Go to [Google Cloud Console - Credentials](https://console.cloud.google.com/apis/credentials)
 - Create a new API key or use an existing one
- Enable **Custom Search API** for your project
- Copy the API key for `GOOGLE_PSE_API_KEY`
## LiteLLM Python SDK
```python showLineNumbers title="Google PSE Search"
import os
from litellm import search
os.environ["GOOGLE_PSE_API_KEY"] = "AIza..."
os.environ["GOOGLE_PSE_ENGINE_ID"] = "your-search-engine-id"
response = search(
query="latest AI developments",
search_provider="google_pse",
max_results=10
)
```
## LiteLLM AI Gateway
### 1. Setup config.yaml
```yaml showLineNumbers title="config.yaml"
model_list:
- model_name: gpt-4
litellm_params:
model: gpt-4
api_key: os.environ/OPENAI_API_KEY
search_tools:
- search_tool_name: google-search
litellm_params:
search_provider: google_pse
api_key: os.environ/GOOGLE_PSE_API_KEY
search_engine_id: os.environ/GOOGLE_PSE_ENGINE_ID
```
### 2. Start the proxy
```bash
litellm --config /path/to/config.yaml
# RUNNING on http://0.0.0.0:4000
```
### 3. Test the search endpoint
```bash showLineNumbers title="Test Request"
curl http://0.0.0.0:4000/v1/search/google-search \
-H "Authorization: Bearer sk-1234" \
-H "Content-Type: application/json" \
-d '{
"query": "latest AI developments",
"max_results": 10
}'
```
## Provider-specific Parameters
```python showLineNumbers title="Google PSE Search with Provider-specific Parameters"
import os
from litellm import search
os.environ["GOOGLE_PSE_API_KEY"] = "AIza..."
os.environ["GOOGLE_PSE_ENGINE_ID"] = "your-search-engine-id"
response = search(
query="latest AI research papers",
search_provider="google_pse",
max_results=10,
search_domain_filter=["arxiv.org"],
# Google PSE-specific parameters (use actual Google PSE API parameter names)
dateRestrict="m6", # 'm6' = last 6 months, 'd7' = last 7 days
lr="lang_en", # Language restriction (e.g., 'lang_en', 'lang_es')
safe="active", # Search safety level ('active' or 'off')
exactTerms="machine learning", # Phrase that all documents must contain
fileType="pdf" # File type to restrict results to
)
```


@@ -0,0 +1,272 @@
# Overview
| Feature | Supported |
|---------|-----------|
| Supported Providers | `perplexity`, `tavily`, `parallel_ai`, `exa_ai`, `google_pse`, `dataforseo` |
| Cost Tracking | ✅ |
| Logging | ✅ |
| Load Balancing | ❌ |
:::tip
LiteLLM follows the [Perplexity API request/response for the Search API](https://docs.perplexity.ai/api-reference/search-post)
:::
:::info
Supported from LiteLLM v1.78.7+
:::
## **LiteLLM Python SDK Usage**
### Quick Start
```python showLineNumbers title="Basic Search"
from litellm import search
import os
os.environ["PERPLEXITYAI_API_KEY"] = "pplx-..."
response = search(
query="latest AI developments in 2024",
search_provider="perplexity",
max_results=5
)
# Access search results
for result in response.results:
print(f"{result.title}: {result.url}")
print(f"Snippet: {result.snippet}\n")
```
### Async Usage
```python showLineNumbers title="Async Search"
from litellm import asearch
import os, asyncio
os.environ["PERPLEXITYAI_API_KEY"] = "pplx-..."
async def search_async():
response = await asearch(
query="machine learning research papers",
search_provider="perplexity",
max_results=10,
search_domain_filter=["arxiv.org", "nature.com"]
)
# Access search results
for result in response.results:
print(f"{result.title}: {result.url}")
print(f"Snippet: {result.snippet}")
asyncio.run(search_async())
```
### Optional Parameters
```python showLineNumbers title="Search with Options"
response = search(
query="AI developments",
search_provider="perplexity",
# Unified parameters (work across all providers)
max_results=10, # Maximum number of results (1-20)
search_domain_filter=["arxiv.org"], # Filter to specific domains
country="US", # Country code filter
max_tokens_per_page=1024 # Max tokens per page
)
```
## **LiteLLM AI Gateway Usage**
LiteLLM provides a Perplexity-compatible `/search` endpoint for search calls.
**Setup**
Add this to your LiteLLM proxy `config.yaml`:
```yaml showLineNumbers title="config.yaml"
model_list:
- model_name: gpt-4
litellm_params:
model: gpt-4
api_key: os.environ/OPENAI_API_KEY
search_tools:
- search_tool_name: perplexity-search
litellm_params:
search_provider: perplexity
api_key: os.environ/PERPLEXITYAI_API_KEY
- search_tool_name: tavily-search
litellm_params:
search_provider: tavily
api_key: os.environ/TAVILY_API_KEY
```
Start the LiteLLM proxy:
```bash
litellm --config /path/to/config.yaml
# RUNNING on http://0.0.0.0:4000
```
### Test Request
**Option 1: Search tool name in URL (Recommended - keeps body Perplexity-compatible)**
```bash showLineNumbers title="cURL Request"
curl http://0.0.0.0:4000/v1/search/perplexity-search \
-H "Authorization: Bearer sk-1234" \
-H "Content-Type: application/json" \
-d '{
"query": "latest AI developments 2024",
"max_results": 5,
"search_domain_filter": ["arxiv.org", "nature.com"],
"country": "US"
}'
```
**Option 2: Search tool name in body**
```bash showLineNumbers title="cURL Request with search_tool_name in body"
curl http://0.0.0.0:4000/v1/search \
-H "Authorization: Bearer sk-1234" \
-H "Content-Type: application/json" \
-d '{
"search_tool_name": "perplexity-search",
"query": "latest AI developments 2024",
"max_results": 5
}'
```
### Load Balancing
Configure multiple search providers for automatic load balancing and fallbacks:
```yaml showLineNumbers title="config.yaml with load balancing"
search_tools:
- search_tool_name: my-search
litellm_params:
search_provider: perplexity
api_key: os.environ/PERPLEXITYAI_API_KEY
- search_tool_name: my-search
litellm_params:
search_provider: tavily
api_key: os.environ/TAVILY_API_KEY
- search_tool_name: my-search
litellm_params:
search_provider: exa_ai
api_key: os.environ/EXA_API_KEY
router_settings:
routing_strategy: simple-shuffle # or 'least-busy', 'latency-based-routing'
```
Test with load balancing:
```bash
curl http://0.0.0.0:4000/v1/search/my-search \
-H "Authorization: Bearer sk-1234" \
-H "Content-Type: application/json" \
-d '{
"query": "AI developments",
"max_results": 10
}'
```
## **Request/Response Format**
:::info
LiteLLM follows the **Perplexity Search API specification**.
See the [official Perplexity Search documentation](https://docs.perplexity.ai/api-reference/search-post) for complete details.
:::
### Example Request
```json showLineNumbers title="Search Request"
{
"query": "latest AI developments 2024",
"max_results": 10,
"search_domain_filter": ["arxiv.org", "nature.com"],
"country": "US",
"max_tokens_per_page": 1024
}
```
### Request Parameters
| Parameter | Type | Required | Description |
|-----------|------|----------|-------------|
| `query` | string or array | Yes | Search query. Can be a single string or array of strings |
| `search_provider` | string | Yes (SDK) | The search provider to use: `"perplexity"`, `"tavily"`, `"parallel_ai"`, `"exa_ai"`, `"google_pse"`, or `"dataforseo"` |
| `search_tool_name` | string | Yes (Proxy) | Name of the search tool configured in `config.yaml` |
| `max_results` | integer | No | Maximum number of results to return (1-20). Default: 10 |
| `search_domain_filter` | array | No | List of domains to filter results (max 20 domains) |
| `max_tokens_per_page` | integer | No | Maximum tokens per page to process. Default: 1024 |
| `country` | string | No | Country code filter (e.g., `"US"`, `"GB"`, `"DE"`) |
**Query Format Examples:**
```python
# Single query
query = "AI developments"
# Multiple queries
query = ["AI developments", "machine learning trends"]
```
### Response Format
The response follows Perplexity's search format with the following structure:
```json showLineNumbers title="Search Response"
{
"object": "search",
"results": [
{
"title": "Latest Advances in Artificial Intelligence",
"url": "https://arxiv.org/paper/example",
"snippet": "This paper discusses recent developments in AI...",
"date": "2024-01-15"
},
{
"title": "Machine Learning Breakthroughs",
"url": "https://nature.com/articles/ml-breakthrough",
"snippet": "Researchers have achieved new milestones...",
"date": "2024-01-10"
}
]
}
```
#### Response Fields
| Field | Type | Description |
|-------|------|-------------|
| `object` | string | Always `"search"` for search responses |
| `results` | array | List of search results |
| `results[].title` | string | Title of the search result |
| `results[].url` | string | URL of the search result |
| `results[].snippet` | string | Text snippet from the result |
| `results[].date` | string | Optional publication or last updated date |
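Because `results[].date` is optional, consumers should not assume it is present. A minimal sketch of reading these fields from the raw JSON payload (plain Python over the example response above, not the SDK's typed response object) might look like:

```python
# Sample payload mirroring the "Search Response" example above.
# In the second result, "date" is deliberately absent to show the optional field.
response = {
    "object": "search",
    "results": [
        {
            "title": "Latest Advances in Artificial Intelligence",
            "url": "https://arxiv.org/paper/example",
            "snippet": "This paper discusses recent developments in AI...",
            "date": "2024-01-15",
        },
        {
            "title": "Machine Learning Breakthroughs",
            "url": "https://nature.com/articles/ml-breakthrough",
            "snippet": "Researchers have achieved new milestones...",
        },
    ],
}

def format_results(payload: dict) -> list[str]:
    """Render each result as 'title (url) [date]'; falls back when date is missing."""
    lines = []
    for r in payload.get("results", []):
        date = r.get("date", "n.d.")  # "date" is optional per the field table
        lines.append(f"{r['title']} ({r['url']}) [{date}]")
    return lines

for line in format_results(response):
    print(line)
```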
## **Supported Providers**
| Provider | Environment Variable | `search_provider` Value |
|----------|---------------------|------------------------|
| Perplexity AI | `PERPLEXITYAI_API_KEY` | `perplexity` |
| Tavily | `TAVILY_API_KEY` | `tavily` |
| Exa AI | `EXA_API_KEY` | `exa_ai` |
| Parallel AI | `PARALLEL_AI_API_KEY` | `parallel_ai` |
| Google PSE | `GOOGLE_PSE_API_KEY`, `GOOGLE_PSE_ENGINE_ID` | `google_pse` |
| DataForSEO | `DATAFORSEO_LOGIN`, `DATAFORSEO_PASSWORD` | `dataforseo` |
See the individual provider documentation for detailed setup instructions and provider-specific parameters.


@@ -0,0 +1,75 @@
# Parallel AI Search
**Get API Key:** [https://www.parallel.ai](https://www.parallel.ai)
## LiteLLM Python SDK
```python showLineNumbers title="Parallel AI Search"
import os
from litellm import search
os.environ["PARALLEL_AI_API_KEY"] = "..."
response = search(
query="latest AI developments",
search_provider="parallel_ai",
max_results=5
)
```
## LiteLLM AI Gateway
### 1. Setup config.yaml
```yaml showLineNumbers title="config.yaml"
model_list:
- model_name: gpt-4
litellm_params:
model: gpt-4
api_key: os.environ/OPENAI_API_KEY
search_tools:
- search_tool_name: parallel-search
litellm_params:
search_provider: parallel_ai
api_key: os.environ/PARALLEL_AI_API_KEY
```
### 2. Start the proxy
```bash
litellm --config /path/to/config.yaml
# RUNNING on http://0.0.0.0:4000
```
### 3. Test the search endpoint
```bash showLineNumbers title="Test Request"
curl http://0.0.0.0:4000/v1/search/parallel-search \
-H "Authorization: Bearer sk-1234" \
-H "Content-Type: application/json" \
-d '{
"query": "latest AI developments",
"max_results": 5
}'
```
## Provider-specific Parameters
```python showLineNumbers title="Parallel AI Search with Provider-specific Parameters"
import os
from litellm import search
os.environ["PARALLEL_AI_API_KEY"] = "..."
response = search(
query="latest developments in quantum computing",
search_provider="parallel_ai",
max_results=5,
# Parallel AI-specific parameters
processor="pro", # 'base' or 'pro'
max_chars_per_result=500 # Max characters per result
)
```
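The `processor` choice also affects cost: this PR's cost map adds separate `parallel_ai/search` and `parallel_ai/search-pro` pricing entries ($0.004 vs $0.009 per query). The sketch below shows the assumed mapping from `processor` to pricing key; it is an illustration, not LiteLLM's actual routing code:

```python
# Assumed mapping from the `processor` parameter to the cost-map keys added in this PR.
PARALLEL_PRICES = {"parallel_ai/search": 0.004, "parallel_ai/search-pro": 0.009}


def parallel_pricing_key(processor: str = "base") -> str:
    """Pick the pricing entry for a Parallel AI search request (assumed mapping)."""
    return "parallel_ai/search-pro" if processor == "pro" else "parallel_ai/search"
```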

View File

@@ -0,0 +1,57 @@
# Perplexity AI Search
**Get API Key:** [https://www.perplexity.ai/settings/api](https://www.perplexity.ai/settings/api)
## LiteLLM Python SDK
```python showLineNumbers title="Perplexity Search"
import os
from litellm import search
os.environ["PERPLEXITYAI_API_KEY"] = "pplx-..."
response = search(
query="latest AI developments",
search_provider="perplexity",
max_results=5
)
```
## LiteLLM AI Gateway
### 1. Setup config.yaml
```yaml showLineNumbers title="config.yaml"
model_list:
- model_name: gpt-4
litellm_params:
model: gpt-4
api_key: os.environ/OPENAI_API_KEY
search_tools:
- search_tool_name: perplexity-search
litellm_params:
search_provider: perplexity
api_key: os.environ/PERPLEXITYAI_API_KEY
```
### 2. Start the proxy
```bash
litellm --config /path/to/config.yaml
# RUNNING on http://0.0.0.0:4000
```
### 3. Test the search endpoint
```bash showLineNumbers title="Test Request"
curl http://0.0.0.0:4000/v1/search/perplexity-search \
-H "Authorization: Bearer sk-1234" \
-H "Content-Type: application/json" \
-d '{
"query": "latest AI developments",
"max_results": 5
}'
```

View File

@@ -0,0 +1,77 @@
# Tavily Search
**Get API Key:** [https://tavily.com](https://tavily.com)
## LiteLLM Python SDK
```python showLineNumbers title="Tavily Search"
import os
from litellm import search
os.environ["TAVILY_API_KEY"] = "tvly-..."
response = search(
query="latest AI developments",
search_provider="tavily",
max_results=5
)
```
## LiteLLM AI Gateway
### 1. Setup config.yaml
```yaml showLineNumbers title="config.yaml"
model_list:
- model_name: gpt-4
litellm_params:
model: gpt-4
api_key: os.environ/OPENAI_API_KEY
search_tools:
- search_tool_name: tavily-search
litellm_params:
search_provider: tavily
api_key: os.environ/TAVILY_API_KEY
```
### 2. Start the proxy
```bash
litellm --config /path/to/config.yaml
# RUNNING on http://0.0.0.0:4000
```
### 3. Test the search endpoint
```bash showLineNumbers title="Test Request"
curl http://0.0.0.0:4000/v1/search/tavily-search \
-H "Authorization: Bearer sk-1234" \
-H "Content-Type: application/json" \
-d '{
"query": "latest AI developments",
"max_results": 5
}'
```
## Provider-specific Parameters
```python showLineNumbers title="Tavily Search with Provider-specific Parameters"
import os
from litellm import search
os.environ["TAVILY_API_KEY"] = "tvly-..."
response = search(
query="latest tech news",
search_provider="tavily",
max_results=5,
# Tavily-specific parameters
topic="news", # 'general', 'news', 'finance'
search_depth="advanced", # 'basic', 'advanced'
include_answer=True, # Include AI-generated answer
include_raw_content=True # Include raw HTML content
)
```
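`search_depth` is also cost-relevant: this PR's cost map adds `tavily/search` and `tavily/search-advanced` entries ($0.008 vs $0.016 per query, matching Tavily's basic vs advanced credit pricing). A sketch of the assumed key selection, not LiteLLM's actual routing code:

```python
# Assumed mapping from `search_depth` to the cost-map keys added in this PR.
TAVILY_PRICES = {"tavily/search": 0.008, "tavily/search-advanced": 0.016}


def tavily_pricing_key(search_depth: str = "basic") -> str:
    """Pick the pricing entry for a Tavily search request (assumed mapping)."""
    return "tavily/search-advanced" if search_depth == "advanced" else "tavily/search"
```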

View File

@@ -381,7 +381,19 @@ const sidebars = {
"realtime",
"rerank",
"response_api",
"search",
{
type: "category",
label: "/search",
items: [
"search/index",
"search/perplexity",
"search/tavily",
"search/exa_ai",
"search/parallel_ai",
"search/google_pse",
"search/dataforseo",
]
},
{
type: "category",
label: "/vector_stores",

View File

@@ -29,6 +29,7 @@ from litellm.llms.anthropic.cost_calculation import (
from litellm.llms.azure.cost_calculation import (
cost_per_token as azure_openai_cost_per_token,
)
from litellm.llms.base_llm.search.transformation import SearchResponse
from litellm.llms.bedrock.cost_calculation import (
cost_per_token as bedrock_cost_per_token,
)
@@ -315,6 +316,16 @@ def cost_per_token( # noqa: PLR0915
custom_llm_provider=custom_llm_provider,
duration=audio_transcription_file_duration,
)
elif call_type == "search" or call_type == "asearch":
# Search providers use per-query pricing
from litellm.search import search_provider_cost_per_query
return search_provider_cost_per_query(
model=model,
custom_llm_provider=custom_llm_provider,
number_of_queries=number_of_queries or 1,
optional_params=response._hidden_params if response and hasattr(response, "_hidden_params") else None
)
elif custom_llm_provider == "vertex_ai":
cost_router = google_cost_router(
model=model_without_prefix,
@@ -1094,6 +1105,7 @@ def response_cost_calculator(
LiteLLMRealtimeStreamLoggingObject,
OpenAIModerationResponse,
Response,
SearchResponse,
],
model: str,
custom_llm_provider: Optional[str],
@@ -1114,6 +1126,8 @@ def response_cost_calculator(
"speech",
"rerank",
"arerank",
"search",
"asearch",
],
optional_params: dict,
cache_hit: Optional[bool] = None,

View File

@@ -67,7 +67,7 @@ class ExaAISearchConfig(BaseSearchConfig):
self,
api_base: Optional[str],
optional_params: dict,
data: Optional[dict] = None,
data: Optional[Union[Dict, List[Dict]]] = None,
**kwargs,
) -> str:
"""

View File

@@ -90,7 +90,7 @@ class GooglePSESearchConfig(BaseSearchConfig):
self,
api_base: Optional[str],
optional_params: dict,
data: Optional[dict] = None,
data: Optional[Union[Dict, List[Dict]]] = None,
**kwargs,
) -> str:
"""
@@ -104,7 +104,7 @@ class GooglePSESearchConfig(BaseSearchConfig):
api_base = api_base or get_secret_str("GOOGLE_PSE_API_BASE") or self.GOOGLE_PSE_API_BASE
# Build query parameters from the transformed request body
if data and "_google_pse_params" in data:
if data and isinstance(data, dict) and "_google_pse_params" in data:
params = data["_google_pse_params"]
query_string = urlencode(params)
return f"{api_base}?{query_string}"

View File

@@ -66,7 +66,7 @@ class TavilySearchConfig(BaseSearchConfig):
self,
api_base: Optional[str],
optional_params: dict,
data: Optional[dict] = None,
data: Optional[Union[Dict, List[Dict]]] = None,
**kwargs,
) -> str:
"""

View File

@@ -12,7 +12,7 @@
"max_input_tokens": "max input tokens, if the provider specifies it. if not default to max_tokens",
"max_output_tokens": "max output tokens, if the provider specifies it. if not default to max_tokens",
"max_tokens": "LEGACY parameter. set to max_output_tokens if provider specifies it. IF not set to max_input_tokens, if provider specifies it.",
"mode": "one of: chat, embedding, completion, image_generation, audio_transcription, audio_speech, image_generation, moderation, rerank",
"mode": "one of: chat, embedding, completion, image_generation, audio_transcription, audio_speech, image_generation, moderation, rerank, search",
"output_cost_per_reasoning_token": 0.0,
"output_cost_per_token": 0.0,
"search_context_cost_per_query": {
@@ -6460,6 +6460,11 @@
"source": "https://www.databricks.com/product/pricing/foundation-model-serving",
"supports_tool_choice": true
},
"dataforseo/search": {
"input_cost_per_query": 0.003,
"litellm_provider": "dataforseo",
"mode": "search"
},
"davinci-002": {
"input_cost_per_token": 2e-06,
"litellm_provider": "text-completion-openai",
@@ -7800,6 +7805,31 @@
"output_cost_per_token": 0.0,
"output_vector_size": 2560
},
"exa_ai/search": {
"litellm_provider": "exa_ai",
"mode": "search",
"tiered_pricing": [
{
"input_cost_per_query": 5e-03,
"max_results_range": [
0,
25
]
},
{
"input_cost_per_query": 25e-03,
"max_results_range": [
26,
100
]
}
]
},
"perplexity/search": {
"input_cost_per_query": 5e-03,
"litellm_provider": "perplexity",
"mode": "search"
},
"elevenlabs/scribe_v1": {
"input_cost_per_second": 6.11e-05,
"litellm_provider": "elevenlabs",
@@ -12211,6 +12241,11 @@
"video"
]
},
"google_pse/search": {
"input_cost_per_query": 0.005,
"litellm_provider": "google_pse",
"mode": "search"
},
"global.anthropic.claude-sonnet-4-5-20250929-v1:0": {
"cache_creation_input_token_cost": 3.75e-06,
"cache_read_input_token_cost": 3e-07,
@@ -18802,6 +18837,16 @@
"output_cost_per_token": 1.25e-07,
"source": "https://cloud.google.com/vertex-ai/generative-ai/docs/learn/models#foundation_models"
},
"parallel_ai/search": {
"input_cost_per_query": 0.004,
"litellm_provider": "parallel_ai",
"mode": "search"
},
"parallel_ai/search-pro": {
"input_cost_per_query": 0.009,
"litellm_provider": "parallel_ai",
"mode": "search"
},
"perplexity/codellama-34b-instruct": {
"input_cost_per_token": 3.5e-07,
"litellm_provider": "perplexity",
@@ -19812,6 +19857,16 @@
"mode": "image_generation",
"output_cost_per_pixel": 0.0
},
"tavily/search": {
"input_cost_per_query": 0.008,
"litellm_provider": "tavily",
"mode": "search"
},
"tavily/search-advanced": {
"input_cost_per_query": 0.016,
"litellm_provider": "tavily",
"mode": "search"
},
"text-bison": {
"input_cost_per_character": 2.5e-07,
"litellm_provider": "vertex_ai-text-models",

View File

@@ -1,7 +1,8 @@
"""
LiteLLM Search API module.
"""
from litellm.search.cost_calculator import search_provider_cost_per_query
from litellm.search.main import asearch, search
__all__ = ["search", "asearch"]
__all__ = ["search", "asearch", "search_provider_cost_per_query"]

View File

@@ -0,0 +1,52 @@
"""
Cost calculation for search providers.
"""
from typing import Optional, Tuple

from litellm.utils import get_model_info


def search_provider_cost_per_query(
    model: str,
    custom_llm_provider: Optional[str] = None,
    number_of_queries: int = 1,
    optional_params: Optional[dict] = None,
) -> Tuple[float, float]:
    """
    Calculate cost for search-only providers.

    Returns (input_cost, output_cost) where input_cost = queries * cost_per_query

    Supports tiered pricing based on max_results parameter.

    Args:
        model: Model name (e.g., "exa_ai/search", "tavily/search")
        custom_llm_provider: Provider name (e.g., "exa_ai", "tavily")
        number_of_queries: Number of search queries performed (default: 1)
        optional_params: Optional parameters including max_results for tiered pricing

    Returns:
        Tuple of (input_cost, output_cost) where output_cost is always 0.0
    """
    model_info = get_model_info(model=model, custom_llm_provider=custom_llm_provider)

    # Check for tiered pricing (e.g., Exa AI based on max_results)
    tiered_pricing = model_info.get("tiered_pricing")
    if tiered_pricing and isinstance(tiered_pricing, list):
        max_results = (optional_params or {}).get("max_results", 10)  # default 10 results
        cost_per_query = 0.0
        for tier in tiered_pricing:
            range_min, range_max = tier["max_results_range"]
            if range_min <= max_results <= range_max:
                cost_per_query = tier["input_cost_per_query"]
                break
        else:
            # Fallback to highest tier if out of range
            cost_per_query = tiered_pricing[-1]["input_cost_per_query"]
    else:
        # Simple flat rate
        cost_per_query = float(model_info.get("input_cost_per_query") or 0.0)

    total_cost = number_of_queries * cost_per_query
    return (total_cost, 0.0)  # (input_cost, output_cost)
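The tiered lookup can be exercised standalone. The sketch below reimplements the same selection logic against a dict shaped like the `exa_ai/search` entry in the cost map, rather than calling `get_model_info`:

```python
# Model-info dict shaped like the "exa_ai/search" cost-map entry added in this PR.
EXA_MODEL_INFO = {
    "tiered_pricing": [
        {"input_cost_per_query": 0.005, "max_results_range": [0, 25]},
        {"input_cost_per_query": 0.025, "max_results_range": [26, 100]},
    ]
}


def cost_per_query(model_info: dict, max_results: int = 10) -> float:
    """Resolve the per-query cost: tiered if tiers exist, else flat rate."""
    tiers = model_info.get("tiered_pricing")
    if not tiers:
        return float(model_info.get("input_cost_per_query") or 0.0)
    for tier in tiers:
        lo, hi = tier["max_results_range"]
        if lo <= max_results <= hi:
            return tier["input_cost_per_query"]
    return tiers[-1]["input_cost_per_query"]  # out of range: fall back to highest tier
```

So an Exa search with `max_results=5` resolves to the first tier, while `max_results=50` resolves to the second; a request outside all tiers falls back to the highest tier, mirroring the function above.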

View File

@@ -149,8 +149,9 @@ async def asearch(
return response
except Exception as e:
model_name = f"{search_provider}/search"
raise litellm.exception_type(
model="",
model=model_name,
custom_llm_provider=search_provider,
original_exception=e,
completion_kwargs=local_vars,
@@ -287,8 +288,9 @@ def search(
)
# Pre Call logging
model_name = f"{search_provider}/search"
litellm_logging_obj.update_environment_variables(
model="",
model=model_name,
optional_params=optional_params,
litellm_params={
"litellm_call_id": litellm_call_id,
@@ -313,8 +315,9 @@ def search(
return response
except Exception as e:
model_name = f"{search_provider}/search"
raise litellm.exception_type(
model="",
model=model_name,
custom_llm_provider=search_provider,
original_exception=e,
completion_kwargs=local_vars,

View File

@@ -239,6 +239,8 @@ class CallTypes(str, Enum):
speech = "speech"
rerank = "rerank"
arerank = "arerank"
search = "search"
asearch = "asearch"
arealtime = "_arealtime"
create_batch = "create_batch"
acreate_batch = "acreate_batch"
@@ -321,6 +323,8 @@ CallTypesLiteral = Literal[
"speech",
"rerank",
"arerank",
"search",
"asearch",
"_arealtime",
"create_batch",
"acreate_batch",

View File

@@ -12,7 +12,7 @@
"max_input_tokens": "max input tokens, if the provider specifies it. if not default to max_tokens",
"max_output_tokens": "max output tokens, if the provider specifies it. if not default to max_tokens",
"max_tokens": "LEGACY parameter. set to max_output_tokens if provider specifies it. IF not set to max_input_tokens, if provider specifies it.",
"mode": "one of: chat, embedding, completion, image_generation, audio_transcription, audio_speech, image_generation, moderation, rerank",
"mode": "one of: chat, embedding, completion, image_generation, audio_transcription, audio_speech, image_generation, moderation, rerank, search",
"output_cost_per_reasoning_token": 0.0,
"output_cost_per_token": 0.0,
"search_context_cost_per_query": {
@@ -6460,6 +6460,11 @@
"source": "https://www.databricks.com/product/pricing/foundation-model-serving",
"supports_tool_choice": true
},
"dataforseo/search": {
"input_cost_per_query": 0.003,
"litellm_provider": "dataforseo",
"mode": "search"
},
"davinci-002": {
"input_cost_per_token": 2e-06,
"litellm_provider": "text-completion-openai",
@@ -7800,6 +7805,31 @@
"output_cost_per_token": 0.0,
"output_vector_size": 2560
},
"exa_ai/search": {
"litellm_provider": "exa_ai",
"mode": "search",
"tiered_pricing": [
{
"input_cost_per_query": 5e-03,
"max_results_range": [
0,
25
]
},
{
"input_cost_per_query": 25e-03,
"max_results_range": [
26,
100
]
}
]
},
"perplexity/search": {
"input_cost_per_query": 5e-03,
"litellm_provider": "perplexity",
"mode": "search"
},
"elevenlabs/scribe_v1": {
"input_cost_per_second": 6.11e-05,
"litellm_provider": "elevenlabs",
@@ -12211,6 +12241,11 @@
"video"
]
},
"google_pse/search": {
"input_cost_per_query": 0.005,
"litellm_provider": "google_pse",
"mode": "search"
},
"global.anthropic.claude-sonnet-4-5-20250929-v1:0": {
"cache_creation_input_token_cost": 3.75e-06,
"cache_read_input_token_cost": 3e-07,
@@ -18802,6 +18837,16 @@
"output_cost_per_token": 1.25e-07,
"source": "https://cloud.google.com/vertex-ai/generative-ai/docs/learn/models#foundation_models"
},
"parallel_ai/search": {
"input_cost_per_query": 0.004,
"litellm_provider": "parallel_ai",
"mode": "search"
},
"parallel_ai/search-pro": {
"input_cost_per_query": 0.009,
"litellm_provider": "parallel_ai",
"mode": "search"
},
"perplexity/codellama-34b-instruct": {
"input_cost_per_token": 3.5e-07,
"litellm_provider": "perplexity",
@@ -19812,6 +19857,16 @@
"mode": "image_generation",
"output_cost_per_pixel": 0.0
},
"tavily/search": {
"input_cost_per_query": 0.008,
"litellm_provider": "tavily",
"mode": "search"
},
"tavily/search-advanced": {
"input_cost_per_query": 0.016,
"litellm_provider": "tavily",
"mode": "search"
},
"text-bison": {
"input_cost_per_character": 2.5e-07,
"litellm_provider": "vertex_ai-text-models",

View File

@@ -6,6 +6,7 @@ This follows the same pattern as BaseOCRTest in tests/ocr_tests/base_ocr_unit_te
import pytest
import litellm
from abc import ABC, abstractmethod
import os
import json
@@ -37,6 +38,8 @@ class BaseSearchTest(ABC):
"""
Test basic search functionality with a simple query.
"""
os.environ["LITELLM_LOCAL_MODEL_COST_MAP"] = "True"
litellm.model_cost = litellm.get_model_cost_map(url="")
litellm._turn_on_debug()
search_provider = self.get_search_provider()
print("Search Provider=", search_provider)
@@ -77,6 +80,18 @@ class BaseSearchTest(ABC):
assert len(first_result.url) > 0, "URL should not be empty"
assert len(first_result.snippet) > 0, "Snippet should not be empty"
# Validate cost tracking in _hidden_params
assert hasattr(response, "_hidden_params"), "Response should have '_hidden_params' attribute"
hidden_params = response._hidden_params
assert "response_cost" in hidden_params, "_hidden_params should contain 'response_cost'"
response_cost = hidden_params["response_cost"]
assert response_cost is not None, "response_cost should not be None"
assert isinstance(response_cost, (int, float)), "response_cost should be a number"
assert response_cost >= 0, "response_cost should be non-negative"
print(f"Cost tracking: ${response_cost:.6f}")
except Exception as e:
pytest.fail(f"Search call failed: {str(e)}")