[Feat] Add cost tracking for Search API requests - Google PSE, Tavily, Parallel AI, Exa AI (#15821)

* add search cost tracking

* add cost tracking for tavily tiers

* add search to call types

* add search_provider_cost_per_query

* add cost tracking for search APIs

* add cost tracking search APIs

* docs cost tracking search

* docs search

* fix linting
This commit is contained in:
Ishaan Jaff
2025-10-22 17:29:09 -07:00
committed by GitHub
parent 143e314dda
commit 3e4b5ef3a5
20 changed files with 972 additions and 764 deletions


@@ -1,753 +0,0 @@
# /search
| Feature | Supported |
|---------|-----------|
| Supported Providers | `perplexity`, `tavily`, `parallel_ai`, `exa_ai`, `google_pse`, `dataforseo` |
| Cost Tracking | ❌ |
| Logging | ✅ |
| Load Balancing | ❌ |
:::tip
LiteLLM follows the [Perplexity API request/response for the Search API](https://docs.perplexity.ai/api-reference/search-post)
:::
:::info
Supported from LiteLLM v1.78.7+
:::
## **LiteLLM Python SDK Usage**
### Quick Start
```python showLineNumbers title="Basic Search"
from litellm import search
import os
os.environ["PERPLEXITYAI_API_KEY"] = "pplx-..."
response = search(
query="latest AI developments in 2024",
search_provider="perplexity",
max_results=5
)
# Access search results
for result in response.results:
print(f"{result.title}: {result.url}")
print(f"Snippet: {result.snippet}\n")
```
### Async Usage
```python showLineNumbers title="Async Search"
from litellm import asearch
import os, asyncio
os.environ["PERPLEXITYAI_API_KEY"] = "pplx-..."
async def search_async():
response = await asearch(
query="machine learning research papers",
search_provider="perplexity",
max_results=10,
search_domain_filter=["arxiv.org", "nature.com"]
)
# Access search results
for result in response.results:
print(f"{result.title}: {result.url}")
print(f"Snippet: {result.snippet}")
asyncio.run(search_async())
```
### Optional Parameters
```python showLineNumbers title="Search with Options"
response = search(
query="AI developments",
search_provider="perplexity",
# Unified parameters (work across all providers)
max_results=10, # Maximum number of results (1-20)
search_domain_filter=["arxiv.org"], # Filter to specific domains
country="US", # Country code filter
max_tokens_per_page=1024 # Max tokens per page
)
```
## **LiteLLM AI Gateway Usage**
LiteLLM provides a Perplexity-compatible `/search` endpoint for search calls.
**Setup**
Add this to your LiteLLM proxy `config.yaml`:
```yaml showLineNumbers title="config.yaml"
model_list:
- model_name: gpt-4
litellm_params:
model: gpt-4
api_key: os.environ/OPENAI_API_KEY
search_tools:
- search_tool_name: perplexity-search
litellm_params:
search_provider: perplexity
api_key: os.environ/PERPLEXITYAI_API_KEY
- search_tool_name: tavily-search
litellm_params:
search_provider: tavily
api_key: os.environ/TAVILY_API_KEY
```
Start the LiteLLM proxy:
```bash
litellm --config /path/to/config.yaml
# RUNNING on http://0.0.0.0:4000
```
### Test Request
**Option 1: Search tool name in URL (Recommended - keeps body Perplexity-compatible)**
```bash showLineNumbers title="cURL Request"
curl http://0.0.0.0:4000/v1/search/perplexity-search \
-H "Authorization: Bearer sk-1234" \
-H "Content-Type: application/json" \
-d '{
"query": "latest AI developments 2024",
"max_results": 5,
"search_domain_filter": ["arxiv.org", "nature.com"],
"country": "US"
}'
```
**Option 2: Search tool name in body**
```bash showLineNumbers title="cURL Request with search_tool_name in body"
curl http://0.0.0.0:4000/v1/search \
-H "Authorization: Bearer sk-1234" \
-H "Content-Type: application/json" \
-d '{
"search_tool_name": "perplexity-search",
"query": "latest AI developments 2024",
"max_results": 5
}'
```
### Load Balancing
Configure multiple search providers for automatic load balancing and fallbacks:
```yaml showLineNumbers title="config.yaml with load balancing"
search_tools:
- search_tool_name: my-search
litellm_params:
search_provider: perplexity
api_key: os.environ/PERPLEXITYAI_API_KEY
- search_tool_name: my-search
litellm_params:
search_provider: tavily
api_key: os.environ/TAVILY_API_KEY
- search_tool_name: my-search
litellm_params:
search_provider: exa_ai
api_key: os.environ/EXA_API_KEY
router_settings:
routing_strategy: simple-shuffle # or 'least-busy', 'latency-based-routing'
```
Test with load balancing:
```bash
curl http://0.0.0.0:4000/v1/search/my-search \
-H "Authorization: Bearer sk-1234" \
-H "Content-Type: application/json" \
-d '{
"query": "AI developments",
"max_results": 10
}'
```
## **Request/Response Format**
:::info
LiteLLM follows the **Perplexity Search API specification**.
See the [official Perplexity Search documentation](https://docs.perplexity.ai/api-reference/search-post) for complete details.
:::
### Example Request
```json showLineNumbers title="Search Request"
{
"query": "latest AI developments 2024",
"max_results": 10,
"search_domain_filter": ["arxiv.org", "nature.com"],
"country": "US",
"max_tokens_per_page": 1024
}
```
### Request Parameters
| Parameter | Type | Required | Description |
|-----------|------|----------|-------------|
| `query` | string or array | Yes | Search query. Can be a single string or array of strings |
| `search_provider` | string | Yes (SDK) | The search provider to use: `"perplexity"`, `"tavily"`, `"parallel_ai"`, `"exa_ai"`, `"google_pse"`, or `"dataforseo"` |
| `search_tool_name` | string | Yes (Proxy) | Name of the search tool configured in `config.yaml` |
| `max_results` | integer | No | Maximum number of results to return (1-20). Default: 10 |
| `search_domain_filter` | array | No | List of domains to filter results (max 20 domains) |
| `max_tokens_per_page` | integer | No | Maximum tokens per page to process. Default: 1024 |
| `country` | string | No | Country code filter (e.g., `"US"`, `"GB"`, `"DE"`) |
**Query Format Examples:**
```python
# Single query
query = "AI developments"
# Multiple queries
query = ["AI developments", "machine learning trends"]
```
### Response Format
The response follows Perplexity's search format with the following structure:
```json showLineNumbers title="Search Response"
{
"object": "search",
"results": [
{
"title": "Latest Advances in Artificial Intelligence",
"url": "https://arxiv.org/paper/example",
"snippet": "This paper discusses recent developments in AI...",
"date": "2024-01-15"
},
{
"title": "Machine Learning Breakthroughs",
"url": "https://nature.com/articles/ml-breakthrough",
"snippet": "Researchers have achieved new milestones...",
"date": "2024-01-10"
}
]
}
```
#### Response Fields
| Field | Type | Description |
|-------|------|-------------|
| `object` | string | Always `"search"` for search responses |
| `results` | array | List of search results |
| `results[].title` | string | Title of the search result |
| `results[].url` | string | URL of the search result |
| `results[].snippet` | string | Text snippet from the result |
| `results[].date` | string | Optional publication or last updated date |
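Because `results[].date` is optional, consumers should not assume it is present. A minimal sketch of reading these fields from the raw JSON payload (plain Python over the example response above, not the SDK's typed response object) might look like:

```python
# Sample payload mirroring the "Search Response" example above.
# In the second result, "date" is deliberately absent to show the optional field.
response = {
    "object": "search",
    "results": [
        {
            "title": "Latest Advances in Artificial Intelligence",
            "url": "https://arxiv.org/paper/example",
            "snippet": "This paper discusses recent developments in AI...",
            "date": "2024-01-15",
        },
        {
            "title": "Machine Learning Breakthroughs",
            "url": "https://nature.com/articles/ml-breakthrough",
            "snippet": "Researchers have achieved new milestones...",
        },
    ],
}

def format_results(payload: dict) -> list[str]:
    """Render each result as 'title (url) [date]'; falls back when date is missing."""
    lines = []
    for r in payload.get("results", []):
        date = r.get("date", "n.d.")  # "date" is optional per the field table
        lines.append(f"{r['title']} ({r['url']}) [{date}]")
    return lines

for line in format_results(response):
    print(line)
```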
## **Supported Providers**
| Provider | Environment Variable | `search_provider` Value |
|----------|---------------------|------------------------|
| Perplexity AI | `PERPLEXITYAI_API_KEY` | `perplexity` |
| Tavily | `TAVILY_API_KEY` | `tavily` |
| Exa AI | `EXA_API_KEY` | `exa_ai` |
| Parallel AI | `PARALLEL_AI_API_KEY` | `parallel_ai` |
| Google PSE | `GOOGLE_PSE_API_KEY`, `GOOGLE_PSE_ENGINE_ID` | `google_pse` |
| DataForSEO | `DATAFORSEO_LOGIN`, `DATAFORSEO_PASSWORD` | `dataforseo` |
### Perplexity AI
**Get API Key:** [https://www.perplexity.ai/settings/api](https://www.perplexity.ai/settings/api)
#### LiteLLM Python SDK
```python showLineNumbers title="Perplexity Search"
import os
from litellm import search
os.environ["PERPLEXITYAI_API_KEY"] = "pplx-..."
response = search(
query="latest AI developments",
search_provider="perplexity",
max_results=5
)
```
#### LiteLLM AI Gateway
**1. Setup config.yaml**
```yaml showLineNumbers title="config.yaml"
model_list:
- model_name: gpt-4
litellm_params:
model: gpt-4
api_key: os.environ/OPENAI_API_KEY
search_tools:
- search_tool_name: perplexity-search
litellm_params:
search_provider: perplexity
api_key: os.environ/PERPLEXITYAI_API_KEY
```
**2. Start the proxy**
```bash
litellm --config /path/to/config.yaml
# RUNNING on http://0.0.0.0:4000
```
**3. Test the search endpoint**
```bash showLineNumbers title="Test Request"
curl http://0.0.0.0:4000/v1/search/perplexity-search \
-H "Authorization: Bearer sk-1234" \
-H "Content-Type: application/json" \
-d '{
"query": "latest AI developments",
"max_results": 5
}'
```
### Tavily
**Get API Key:** [https://tavily.com](https://tavily.com)
#### LiteLLM Python SDK
```python showLineNumbers title="Tavily Search"
import os
from litellm import search
os.environ["TAVILY_API_KEY"] = "tvly-..."
response = search(
query="latest AI developments",
search_provider="tavily",
max_results=5
)
```
#### LiteLLM AI Gateway
**1. Setup config.yaml**
```yaml showLineNumbers title="config.yaml"
model_list:
- model_name: gpt-4
litellm_params:
model: gpt-4
api_key: os.environ/OPENAI_API_KEY
search_tools:
- search_tool_name: tavily-search
litellm_params:
search_provider: tavily
api_key: os.environ/TAVILY_API_KEY
```
**2. Start the proxy**
```bash
litellm --config /path/to/config.yaml
# RUNNING on http://0.0.0.0:4000
```
**3. Test the search endpoint**
```bash showLineNumbers title="Test Request"
curl http://0.0.0.0:4000/v1/search/tavily-search \
-H "Authorization: Bearer sk-1234" \
-H "Content-Type: application/json" \
-d '{
"query": "latest AI developments",
"max_results": 5
}'
```
### Exa AI
**Get API Key:** [https://exa.ai](https://exa.ai)
#### LiteLLM Python SDK
```python showLineNumbers title="Exa AI Search"
import os
from litellm import search
os.environ["EXA_API_KEY"] = "exa-..."
response = search(
query="latest AI developments",
search_provider="exa_ai",
max_results=5
)
```
#### LiteLLM AI Gateway
**1. Setup config.yaml**
```yaml showLineNumbers title="config.yaml"
model_list:
- model_name: gpt-4
litellm_params:
model: gpt-4
api_key: os.environ/OPENAI_API_KEY
search_tools:
- search_tool_name: exa-search
litellm_params:
search_provider: exa_ai
api_key: os.environ/EXA_API_KEY
```
**2. Start the proxy**
```bash
litellm --config /path/to/config.yaml
# RUNNING on http://0.0.0.0:4000
```
**3. Test the search endpoint**
```bash showLineNumbers title="Test Request"
curl http://0.0.0.0:4000/v1/search/exa-search \
-H "Authorization: Bearer sk-1234" \
-H "Content-Type: application/json" \
-d '{
"query": "latest AI developments",
"max_results": 5
}'
```
### Parallel AI
**Get API Key:** [https://www.parallel.ai](https://www.parallel.ai)
#### LiteLLM Python SDK
```python showLineNumbers title="Parallel AI Search"
import os
from litellm import search
os.environ["PARALLEL_AI_API_KEY"] = "..."
response = search(
query="latest AI developments",
search_provider="parallel_ai",
max_results=5
)
```
#### LiteLLM AI Gateway
**1. Setup config.yaml**
```yaml showLineNumbers title="config.yaml"
model_list:
- model_name: gpt-4
litellm_params:
model: gpt-4
api_key: os.environ/OPENAI_API_KEY
search_tools:
- search_tool_name: parallel-search
litellm_params:
search_provider: parallel_ai
api_key: os.environ/PARALLEL_AI_API_KEY
```
**2. Start the proxy**
```bash
litellm --config /path/to/config.yaml
# RUNNING on http://0.0.0.0:4000
```
**3. Test the search endpoint**
```bash showLineNumbers title="Test Request"
curl http://0.0.0.0:4000/v1/search/parallel-search \
-H "Authorization: Bearer sk-1234" \
-H "Content-Type: application/json" \
-d '{
"query": "latest AI developments",
"max_results": 5
}'
```
### Google Programmable Search Engine (PSE)
**Get API Key:** [Google Cloud Console](https://console.cloud.google.com/apis/credentials)
**Create Search Engine:** [Programmable Search Engine](https://programmablesearchengine.google.com/)
#### Setup
1. Go to [Google Developers Programmable Search Engine](https://programmablesearchengine.google.com/) and log in or create an account
2. Click the **Add** button in the control panel
3. Enter a search engine name and configure properties:
- Choose which sites to search (entire web or specific sites)
- Set language and other preferences
- Verify you're not a robot
4. Click **Create** button
5. Once created, you'll see:
- **Search engine ID (cx)** - Copy this for `GOOGLE_PSE_ENGINE_ID`
- Instructions to get your API key
6. Generate API key:
- Go to [Google Cloud Console - Credentials](https://console.cloud.google.com/apis/credentials)
 - Create a new API key or use an existing one
- Enable **Custom Search API** for your project
- Copy the API key for `GOOGLE_PSE_API_KEY`
#### LiteLLM Python SDK
```python showLineNumbers title="Google PSE Search"
import os
from litellm import search
os.environ["GOOGLE_PSE_API_KEY"] = "AIza..."
os.environ["GOOGLE_PSE_ENGINE_ID"] = "your-search-engine-id"
response = search(
query="latest AI developments",
search_provider="google_pse",
max_results=10
)
```
#### LiteLLM AI Gateway
**1. Setup config.yaml**
```yaml showLineNumbers title="config.yaml"
model_list:
- model_name: gpt-4
litellm_params:
model: gpt-4
api_key: os.environ/OPENAI_API_KEY
search_tools:
- search_tool_name: google-search
litellm_params:
search_provider: google_pse
api_key: os.environ/GOOGLE_PSE_API_KEY
search_engine_id: os.environ/GOOGLE_PSE_ENGINE_ID
```
**2. Start the proxy**
```bash
litellm --config /path/to/config.yaml
# RUNNING on http://0.0.0.0:4000
```
**3. Test the search endpoint**
```bash showLineNumbers title="Test Request"
curl http://0.0.0.0:4000/v1/search/google-search \
-H "Authorization: Bearer sk-1234" \
-H "Content-Type: application/json" \
-d '{
"query": "latest AI developments",
"max_results": 10
}'
```
### DataForSEO
**Get API Access:** [DataForSEO](https://dataforseo.com/)
#### Setup
1. Go to [DataForSEO](https://dataforseo.com/) and create an account
2. Navigate to your account dashboard
3. Generate API credentials:
- You'll receive a **login** (username)
- You'll receive a **password**
4. Set up your environment variables:
- `DATAFORSEO_LOGIN` - Your DataForSEO login/username
- `DATAFORSEO_PASSWORD` - Your DataForSEO password
#### LiteLLM Python SDK
```python showLineNumbers title="DataForSEO Search"
import os
from litellm import search
os.environ["DATAFORSEO_LOGIN"] = "your-login"
os.environ["DATAFORSEO_PASSWORD"] = "your-password"
response = search(
query="latest AI developments",
search_provider="dataforseo",
max_results=10
)
```
#### LiteLLM AI Gateway
**1. Setup config.yaml**
```yaml showLineNumbers title="config.yaml"
model_list:
- model_name: gpt-4
litellm_params:
model: gpt-4
api_key: os.environ/OPENAI_API_KEY
search_tools:
- search_tool_name: dataforseo-search
litellm_params:
search_provider: dataforseo
api_key: "os.environ/DATAFORSEO_LOGIN:os.environ/DATAFORSEO_PASSWORD"
```
**2. Start the proxy**
```bash
litellm --config /path/to/config.yaml
# RUNNING on http://0.0.0.0:4000
```
**3. Test the search endpoint**
```bash showLineNumbers title="Test Request"
curl http://0.0.0.0:4000/v1/search/dataforseo-search \
-H "Authorization: Bearer sk-1234" \
-H "Content-Type: application/json" \
-d '{
"query": "latest AI developments",
"max_results": 10
}'
```
## Provider-specific Parameters
All providers support provider-specific parameters; pass them in the request body alongside the unified parameters and they are forwarded to the provider.
#### Tavily Search
```python showLineNumbers title="Tavily Search"
import os
from litellm import search
os.environ["TAVILY_API_KEY"] = "tvly-..."
response = search(
query="latest tech news",
search_provider="tavily",
max_results=5,
# Tavily-specific parameters
topic="news", # 'general', 'news', 'finance'
search_depth="advanced", # 'basic', 'advanced'
include_answer=True, # Include AI-generated answer
include_raw_content=True # Include raw HTML content
)
```
#### Exa AI Search
```python showLineNumbers title="Exa AI Search"
import os
from litellm import search
os.environ["EXA_API_KEY"] = "exa-..."
response = search(
query="AI research papers",
search_provider="exa_ai",
max_results=10,
search_domain_filter=["arxiv.org"],
# Exa-specific parameters
type="neural", # 'neural', 'keyword', or 'auto'
contents={"text": True}, # Request text content
use_autoprompt=True # Enable Exa's autoprompt
)
```
#### Parallel AI Search
```python showLineNumbers title="Parallel AI Search"
import os
from litellm import search
os.environ["PARALLEL_AI_API_KEY"] = "..."
response = search(
query="latest developments in quantum computing",
search_provider="parallel_ai",
max_results=5,
# Parallel AI-specific parameters
processor="pro", # 'base' or 'pro'
max_chars_per_result=500 # Max characters per result
)
```
#### Google PSE Search
```python showLineNumbers title="Google PSE Search"
import os
from litellm import search
os.environ["GOOGLE_PSE_API_KEY"] = "AIza..."
os.environ["GOOGLE_PSE_ENGINE_ID"] = "your-search-engine-id"
response = search(
query="latest AI research papers",
search_provider="google_pse",
max_results=10,
search_domain_filter=["arxiv.org"],
# Google PSE-specific parameters (use actual Google PSE API parameter names)
dateRestrict="m6", # 'm6' = last 6 months, 'd7' = last 7 days
lr="lang_en", # Language restriction (e.g., 'lang_en', 'lang_es')
safe="active", # Search safety level ('active' or 'off')
exactTerms="machine learning", # Phrase that all documents must contain
fileType="pdf" # File type to restrict results to
)
```
#### DataForSEO Search
```python showLineNumbers title="DataForSEO Search"
import os
from litellm import search
os.environ["DATAFORSEO_LOGIN"] = "your-login"
os.environ["DATAFORSEO_PASSWORD"] = "your-password"
response = search(
query="AI developments",
search_provider="dataforseo",
max_results=10,
# DataForSEO-specific parameters
country="United States", # Country name for location_name
language_code="en", # Language code
depth=20, # Number of results (max 700)
device="desktop", # Device type ('desktop', 'mobile', 'tablet')
os="windows" # Operating system
)
```


@@ -0,0 +1,91 @@
# DataForSEO Search
**Get API Access:** [DataForSEO](https://dataforseo.com/)
## Setup
1. Go to [DataForSEO](https://dataforseo.com/) and create an account
2. Navigate to your account dashboard
3. Generate API credentials:
- You'll receive a **login** (username)
- You'll receive a **password**
4. Set up your environment variables:
- `DATAFORSEO_LOGIN` - Your DataForSEO login/username
- `DATAFORSEO_PASSWORD` - Your DataForSEO password
## LiteLLM Python SDK
```python showLineNumbers title="DataForSEO Search"
import os
from litellm import search
os.environ["DATAFORSEO_LOGIN"] = "your-login"
os.environ["DATAFORSEO_PASSWORD"] = "your-password"
response = search(
query="latest AI developments",
search_provider="dataforseo",
max_results=10
)
```
## LiteLLM AI Gateway
### 1. Setup config.yaml
```yaml showLineNumbers title="config.yaml"
model_list:
- model_name: gpt-4
litellm_params:
model: gpt-4
api_key: os.environ/OPENAI_API_KEY
search_tools:
- search_tool_name: dataforseo-search
litellm_params:
search_provider: dataforseo
api_key: "os.environ/DATAFORSEO_LOGIN:os.environ/DATAFORSEO_PASSWORD"
```
### 2. Start the proxy
```bash
litellm --config /path/to/config.yaml
# RUNNING on http://0.0.0.0:4000
```
### 3. Test the search endpoint
```bash showLineNumbers title="Test Request"
curl http://0.0.0.0:4000/v1/search/dataforseo-search \
-H "Authorization: Bearer sk-1234" \
-H "Content-Type: application/json" \
-d '{
"query": "latest AI developments",
"max_results": 10
}'
```
## Provider-specific Parameters
```python showLineNumbers title="DataForSEO Search with Provider-specific Parameters"
import os
from litellm import search
os.environ["DATAFORSEO_LOGIN"] = "your-login"
os.environ["DATAFORSEO_PASSWORD"] = "your-password"
response = search(
query="AI developments",
search_provider="dataforseo",
max_results=10,
# DataForSEO-specific parameters
country="United States", # Country name for location_name
language_code="en", # Language code
depth=20, # Number of results (max 700)
device="desktop", # Device type ('desktop', 'mobile', 'tablet')
os="windows" # Operating system
)
```


@@ -0,0 +1,77 @@
# Exa AI Search
**Get API Key:** [https://exa.ai](https://exa.ai)
## LiteLLM Python SDK
```python showLineNumbers title="Exa AI Search"
import os
from litellm import search
os.environ["EXA_API_KEY"] = "exa-..."
response = search(
query="latest AI developments",
search_provider="exa_ai",
max_results=5
)
```
## LiteLLM AI Gateway
### 1. Setup config.yaml
```yaml showLineNumbers title="config.yaml"
model_list:
- model_name: gpt-4
litellm_params:
model: gpt-4
api_key: os.environ/OPENAI_API_KEY
search_tools:
- search_tool_name: exa-search
litellm_params:
search_provider: exa_ai
api_key: os.environ/EXA_API_KEY
```
### 2. Start the proxy
```bash
litellm --config /path/to/config.yaml
# RUNNING on http://0.0.0.0:4000
```
### 3. Test the search endpoint
```bash showLineNumbers title="Test Request"
curl http://0.0.0.0:4000/v1/search/exa-search \
-H "Authorization: Bearer sk-1234" \
-H "Content-Type: application/json" \
-d '{
"query": "latest AI developments",
"max_results": 5
}'
```
## Provider-specific Parameters
```python showLineNumbers title="Exa AI Search with Provider-specific Parameters"
import os
from litellm import search
os.environ["EXA_API_KEY"] = "exa-..."
response = search(
query="AI research papers",
search_provider="exa_ai",
max_results=10,
search_domain_filter=["arxiv.org"],
# Exa-specific parameters
type="neural", # 'neural', 'keyword', or 'auto'
contents={"text": True}, # Request text content
use_autoprompt=True # Enable Exa's autoprompt
)
```


@@ -0,0 +1,101 @@
# Google Programmable Search Engine (PSE)
**Get API Key:** [Google Cloud Console](https://console.cloud.google.com/apis/credentials)
**Create Search Engine:** [Programmable Search Engine](https://programmablesearchengine.google.com/)
## Setup
1. Go to [Google Developers Programmable Search Engine](https://programmablesearchengine.google.com/) and log in or create an account
2. Click the **Add** button in the control panel
3. Enter a search engine name and configure properties:
- Choose which sites to search (entire web or specific sites)
- Set language and other preferences
- Verify you're not a robot
4. Click **Create** button
5. Once created, you'll see:
- **Search engine ID (cx)** - Copy this for `GOOGLE_PSE_ENGINE_ID`
- Instructions to get your API key
6. Generate API key:
- Go to [Google Cloud Console - Credentials](https://console.cloud.google.com/apis/credentials)
 - Create a new API key or use an existing one
- Enable **Custom Search API** for your project
- Copy the API key for `GOOGLE_PSE_API_KEY`
## LiteLLM Python SDK
```python showLineNumbers title="Google PSE Search"
import os
from litellm import search
os.environ["GOOGLE_PSE_API_KEY"] = "AIza..."
os.environ["GOOGLE_PSE_ENGINE_ID"] = "your-search-engine-id"
response = search(
query="latest AI developments",
search_provider="google_pse",
max_results=10
)
```
## LiteLLM AI Gateway
### 1. Setup config.yaml
```yaml showLineNumbers title="config.yaml"
model_list:
- model_name: gpt-4
litellm_params:
model: gpt-4
api_key: os.environ/OPENAI_API_KEY
search_tools:
- search_tool_name: google-search
litellm_params:
search_provider: google_pse
api_key: os.environ/GOOGLE_PSE_API_KEY
search_engine_id: os.environ/GOOGLE_PSE_ENGINE_ID
```
### 2. Start the proxy
```bash
litellm --config /path/to/config.yaml
# RUNNING on http://0.0.0.0:4000
```
### 3. Test the search endpoint
```bash showLineNumbers title="Test Request"
curl http://0.0.0.0:4000/v1/search/google-search \
-H "Authorization: Bearer sk-1234" \
-H "Content-Type: application/json" \
-d '{
"query": "latest AI developments",
"max_results": 10
}'
```
## Provider-specific Parameters
```python showLineNumbers title="Google PSE Search with Provider-specific Parameters"
import os
from litellm import search
os.environ["GOOGLE_PSE_API_KEY"] = "AIza..."
os.environ["GOOGLE_PSE_ENGINE_ID"] = "your-search-engine-id"
response = search(
query="latest AI research papers",
search_provider="google_pse",
max_results=10,
search_domain_filter=["arxiv.org"],
# Google PSE-specific parameters (use actual Google PSE API parameter names)
dateRestrict="m6", # 'm6' = last 6 months, 'd7' = last 7 days
lr="lang_en", # Language restriction (e.g., 'lang_en', 'lang_es')
safe="active", # Search safety level ('active' or 'off')
exactTerms="machine learning", # Phrase that all documents must contain
fileType="pdf" # File type to restrict results to
)
```


@@ -0,0 +1,272 @@
# Overview
| Feature | Supported |
|---------|-----------|
| Supported Providers | `perplexity`, `tavily`, `parallel_ai`, `exa_ai`, `google_pse`, `dataforseo` |
| Cost Tracking | ✅ |
| Logging | ✅ |
| Load Balancing | ❌ |
:::tip
LiteLLM follows the [Perplexity API request/response for the Search API](https://docs.perplexity.ai/api-reference/search-post)
:::
:::info
Supported from LiteLLM v1.78.7+
:::
## **LiteLLM Python SDK Usage**
### Quick Start
```python showLineNumbers title="Basic Search"
from litellm import search
import os
os.environ["PERPLEXITYAI_API_KEY"] = "pplx-..."
response = search(
query="latest AI developments in 2024",
search_provider="perplexity",
max_results=5
)
# Access search results
for result in response.results:
print(f"{result.title}: {result.url}")
print(f"Snippet: {result.snippet}\n")
```
### Async Usage
```python showLineNumbers title="Async Search"
from litellm import asearch
import os, asyncio
os.environ["PERPLEXITYAI_API_KEY"] = "pplx-..."
async def search_async():
response = await asearch(
query="machine learning research papers",
search_provider="perplexity",
max_results=10,
search_domain_filter=["arxiv.org", "nature.com"]
)
# Access search results
for result in response.results:
print(f"{result.title}: {result.url}")
print(f"Snippet: {result.snippet}")
asyncio.run(search_async())
```
### Optional Parameters
```python showLineNumbers title="Search with Options"
response = search(
query="AI developments",
search_provider="perplexity",
# Unified parameters (work across all providers)
max_results=10, # Maximum number of results (1-20)
search_domain_filter=["arxiv.org"], # Filter to specific domains
country="US", # Country code filter
max_tokens_per_page=1024 # Max tokens per page
)
```
## **LiteLLM AI Gateway Usage**
LiteLLM provides a Perplexity-compatible `/search` endpoint for search calls.
**Setup**
Add this to your LiteLLM proxy `config.yaml`:
```yaml showLineNumbers title="config.yaml"
model_list:
- model_name: gpt-4
litellm_params:
model: gpt-4
api_key: os.environ/OPENAI_API_KEY
search_tools:
- search_tool_name: perplexity-search
litellm_params:
search_provider: perplexity
api_key: os.environ/PERPLEXITYAI_API_KEY
- search_tool_name: tavily-search
litellm_params:
search_provider: tavily
api_key: os.environ/TAVILY_API_KEY
```
Start the LiteLLM proxy:
```bash
litellm --config /path/to/config.yaml
# RUNNING on http://0.0.0.0:4000
```
### Test Request
**Option 1: Search tool name in URL (Recommended - keeps body Perplexity-compatible)**
```bash showLineNumbers title="cURL Request"
curl http://0.0.0.0:4000/v1/search/perplexity-search \
-H "Authorization: Bearer sk-1234" \
-H "Content-Type: application/json" \
-d '{
"query": "latest AI developments 2024",
"max_results": 5,
"search_domain_filter": ["arxiv.org", "nature.com"],
"country": "US"
}'
```
**Option 2: Search tool name in body**
```bash showLineNumbers title="cURL Request with search_tool_name in body"
curl http://0.0.0.0:4000/v1/search \
-H "Authorization: Bearer sk-1234" \
-H "Content-Type: application/json" \
-d '{
"search_tool_name": "perplexity-search",
"query": "latest AI developments 2024",
"max_results": 5
}'
```
### Load Balancing
Configure multiple search providers for automatic load balancing and fallbacks:
```yaml showLineNumbers title="config.yaml with load balancing"
search_tools:
- search_tool_name: my-search
litellm_params:
search_provider: perplexity
api_key: os.environ/PERPLEXITYAI_API_KEY
- search_tool_name: my-search
litellm_params:
search_provider: tavily
api_key: os.environ/TAVILY_API_KEY
- search_tool_name: my-search
litellm_params:
search_provider: exa_ai
api_key: os.environ/EXA_API_KEY
router_settings:
routing_strategy: simple-shuffle # or 'least-busy', 'latency-based-routing'
```
Test with load balancing:
```bash
curl http://0.0.0.0:4000/v1/search/my-search \
-H "Authorization: Bearer sk-1234" \
-H "Content-Type: application/json" \
-d '{
"query": "AI developments",
"max_results": 10
}'
```
## **Request/Response Format**
:::info
LiteLLM follows the **Perplexity Search API specification**.
See the [official Perplexity Search documentation](https://docs.perplexity.ai/api-reference/search-post) for complete details.
:::
### Example Request
```json showLineNumbers title="Search Request"
{
"query": "latest AI developments 2024",
"max_results": 10,
"search_domain_filter": ["arxiv.org", "nature.com"],
"country": "US",
"max_tokens_per_page": 1024
}
```
### Request Parameters
| Parameter | Type | Required | Description |
|-----------|------|----------|-------------|
| `query` | string or array | Yes | Search query. Can be a single string or array of strings |
| `search_provider` | string | Yes (SDK) | The search provider to use: `"perplexity"`, `"tavily"`, `"parallel_ai"`, `"exa_ai"`, `"google_pse"`, or `"dataforseo"` |
| `search_tool_name` | string | Yes (Proxy) | Name of the search tool configured in `config.yaml` |
| `max_results` | integer | No | Maximum number of results to return (1-20). Default: 10 |
| `search_domain_filter` | array | No | List of domains to filter results (max 20 domains) |
| `max_tokens_per_page` | integer | No | Maximum tokens per page to process. Default: 1024 |
| `country` | string | No | Country code filter (e.g., `"US"`, `"GB"`, `"DE"`) |
**Query Format Examples:**
```python
# Single query
query = "AI developments"
# Multiple queries
query = ["AI developments", "machine learning trends"]
```
### Response Format
The response follows Perplexity's search format with the following structure:
```json showLineNumbers title="Search Response"
{
"object": "search",
"results": [
{
"title": "Latest Advances in Artificial Intelligence",
"url": "https://arxiv.org/paper/example",
"snippet": "This paper discusses recent developments in AI...",
"date": "2024-01-15"
},
{
"title": "Machine Learning Breakthroughs",
"url": "https://nature.com/articles/ml-breakthrough",
"snippet": "Researchers have achieved new milestones...",
"date": "2024-01-10"
}
]
}
```
#### Response Fields
| Field | Type | Description |
|-------|------|-------------|
| `object` | string | Always `"search"` for search responses |
| `results` | array | List of search results |
| `results[].title` | string | Title of the search result |
| `results[].url` | string | URL of the search result |
| `results[].snippet` | string | Text snippet from the result |
| `results[].date` | string | Optional publication or last updated date |
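Because `results[].date` is optional, consumers should not assume it is present. A minimal sketch of reading these fields from the raw JSON payload (plain Python over the example response above, not the SDK's typed response object) might look like:

```python
# Sample payload mirroring the "Search Response" example above.
# In the second result, "date" is deliberately absent to show the optional field.
response = {
    "object": "search",
    "results": [
        {
            "title": "Latest Advances in Artificial Intelligence",
            "url": "https://arxiv.org/paper/example",
            "snippet": "This paper discusses recent developments in AI...",
            "date": "2024-01-15",
        },
        {
            "title": "Machine Learning Breakthroughs",
            "url": "https://nature.com/articles/ml-breakthrough",
            "snippet": "Researchers have achieved new milestones...",
        },
    ],
}

def format_results(payload: dict) -> list[str]:
    """Render each result as 'title (url) [date]'; falls back when date is missing."""
    lines = []
    for r in payload.get("results", []):
        date = r.get("date", "n.d.")  # "date" is optional per the field table
        lines.append(f"{r['title']} ({r['url']}) [{date}]")
    return lines

for line in format_results(response):
    print(line)
```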
## **Supported Providers**
| Provider | Environment Variable | `search_provider` Value |
|----------|---------------------|------------------------|
| Perplexity AI | `PERPLEXITYAI_API_KEY` | `perplexity` |
| Tavily | `TAVILY_API_KEY` | `tavily` |
| Exa AI | `EXA_API_KEY` | `exa_ai` |
| Parallel AI | `PARALLEL_AI_API_KEY` | `parallel_ai` |
| Google PSE | `GOOGLE_PSE_API_KEY`, `GOOGLE_PSE_ENGINE_ID` | `google_pse` |
| DataForSEO | `DATAFORSEO_LOGIN`, `DATAFORSEO_PASSWORD` | `dataforseo` |
See the individual provider documentation for detailed setup instructions and provider-specific parameters.


@@ -0,0 +1,75 @@
# Parallel AI Search
**Get API Key:** [https://www.parallel.ai](https://www.parallel.ai)
## LiteLLM Python SDK
```python showLineNumbers title="Parallel AI Search"
import os
from litellm import search
os.environ["PARALLEL_AI_API_KEY"] = "..."
response = search(
query="latest AI developments",
search_provider="parallel_ai",
max_results=5
)
```
## LiteLLM AI Gateway
### 1. Setup config.yaml
```yaml showLineNumbers title="config.yaml"
model_list:
- model_name: gpt-4
litellm_params:
model: gpt-4
api_key: os.environ/OPENAI_API_KEY
search_tools:
- search_tool_name: parallel-search
litellm_params:
search_provider: parallel_ai
api_key: os.environ/PARALLEL_AI_API_KEY
```
### 2. Start the proxy
```bash
litellm --config /path/to/config.yaml
# RUNNING on http://0.0.0.0:4000
```
### 3. Test the search endpoint
```bash showLineNumbers title="Test Request"
curl http://0.0.0.0:4000/v1/search/parallel-search \
-H "Authorization: Bearer sk-1234" \
-H "Content-Type: application/json" \
-d '{
"query": "latest AI developments",
"max_results": 5
}'
```
## Provider-specific Parameters
```python showLineNumbers title="Parallel AI Search with Provider-specific Parameters"
import os
from litellm import search
os.environ["PARALLEL_AI_API_KEY"] = "..."
response = search(
query="latest developments in quantum computing",
search_provider="parallel_ai",
max_results=5,
# Parallel AI-specific parameters
processor="pro", # 'base' or 'pro'
max_chars_per_result=500 # Max characters per result
)
```
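The `processor` choice also affects cost: this PR's cost map adds separate `parallel_ai/search` and `parallel_ai/search-pro` pricing entries ($0.004 vs $0.009 per query). The sketch below shows the assumed mapping from `processor` to pricing key; it is an illustration, not LiteLLM's actual routing code:

```python
# Assumed mapping from the `processor` parameter to the cost-map keys added in this PR.
PARALLEL_PRICES = {"parallel_ai/search": 0.004, "parallel_ai/search-pro": 0.009}


def parallel_pricing_key(processor: str = "base") -> str:
    """Pick the pricing entry for a Parallel AI search request (assumed mapping)."""
    return "parallel_ai/search-pro" if processor == "pro" else "parallel_ai/search"
```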

View File

@@ -0,0 +1,57 @@
# Perplexity AI Search
**Get API Key:** [https://www.perplexity.ai/settings/api](https://www.perplexity.ai/settings/api)
## LiteLLM Python SDK
```python showLineNumbers title="Perplexity Search"
import os
from litellm import search
os.environ["PERPLEXITYAI_API_KEY"] = "pplx-..."
response = search(
query="latest AI developments",
search_provider="perplexity",
max_results=5
)
```
## LiteLLM AI Gateway
### 1. Setup config.yaml
```yaml showLineNumbers title="config.yaml"
model_list:
- model_name: gpt-4
litellm_params:
model: gpt-4
api_key: os.environ/OPENAI_API_KEY
search_tools:
- search_tool_name: perplexity-search
litellm_params:
search_provider: perplexity
api_key: os.environ/PERPLEXITYAI_API_KEY
```
### 2. Start the proxy
```bash
litellm --config /path/to/config.yaml
# RUNNING on http://0.0.0.0:4000
```
### 3. Test the search endpoint
```bash showLineNumbers title="Test Request"
curl http://0.0.0.0:4000/v1/search/perplexity-search \
-H "Authorization: Bearer sk-1234" \
-H "Content-Type: application/json" \
-d '{
"query": "latest AI developments",
"max_results": 5
}'
```

View File

@@ -0,0 +1,77 @@
# Tavily Search
**Get API Key:** [https://tavily.com](https://tavily.com)
## LiteLLM Python SDK
```python showLineNumbers title="Tavily Search"
import os
from litellm import search
os.environ["TAVILY_API_KEY"] = "tvly-..."
response = search(
query="latest AI developments",
search_provider="tavily",
max_results=5
)
```
## LiteLLM AI Gateway
### 1. Setup config.yaml
```yaml showLineNumbers title="config.yaml"
model_list:
- model_name: gpt-4
litellm_params:
model: gpt-4
api_key: os.environ/OPENAI_API_KEY
search_tools:
- search_tool_name: tavily-search
litellm_params:
search_provider: tavily
api_key: os.environ/TAVILY_API_KEY
```
### 2. Start the proxy
```bash
litellm --config /path/to/config.yaml
# RUNNING on http://0.0.0.0:4000
```
### 3. Test the search endpoint
```bash showLineNumbers title="Test Request"
curl http://0.0.0.0:4000/v1/search/tavily-search \
-H "Authorization: Bearer sk-1234" \
-H "Content-Type: application/json" \
-d '{
"query": "latest AI developments",
"max_results": 5
}'
```
## Provider-specific Parameters
```python showLineNumbers title="Tavily Search with Provider-specific Parameters"
import os
from litellm import search
os.environ["TAVILY_API_KEY"] = "tvly-..."
response = search(
query="latest tech news",
search_provider="tavily",
max_results=5,
# Tavily-specific parameters
topic="news", # 'general', 'news', 'finance'
search_depth="advanced", # 'basic', 'advanced'
include_answer=True, # Include AI-generated answer
include_raw_content=True # Include raw HTML content
)
```
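`search_depth` is also cost-relevant: this PR's cost map adds `tavily/search` and `tavily/search-advanced` entries ($0.008 vs $0.016 per query, matching Tavily's basic vs advanced credit pricing). A sketch of the assumed key selection, not LiteLLM's actual routing code:

```python
# Assumed mapping from `search_depth` to the cost-map keys added in this PR.
TAVILY_PRICES = {"tavily/search": 0.008, "tavily/search-advanced": 0.016}


def tavily_pricing_key(search_depth: str = "basic") -> str:
    """Pick the pricing entry for a Tavily search request (assumed mapping)."""
    return "tavily/search-advanced" if search_depth == "advanced" else "tavily/search"
```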

View File

@@ -381,7 +381,19 @@ const sidebars = {
"realtime",
"rerank",
"response_api",
"search",
{
type: "category",
label: "/search",
items: [
"search/index",
"search/perplexity",
"search/tavily",
"search/exa_ai",
"search/parallel_ai",
"search/google_pse",
"search/dataforseo",
]
},
{
type: "category",
label: "/vector_stores",

View File

@@ -29,6 +29,7 @@ from litellm.llms.anthropic.cost_calculation import (
from litellm.llms.azure.cost_calculation import (
cost_per_token as azure_openai_cost_per_token,
)
from litellm.llms.base_llm.search.transformation import SearchResponse
from litellm.llms.bedrock.cost_calculation import (
cost_per_token as bedrock_cost_per_token,
)
@@ -315,6 +316,16 @@ def cost_per_token( # noqa: PLR0915
custom_llm_provider=custom_llm_provider,
duration=audio_transcription_file_duration,
)
elif call_type == "search" or call_type == "asearch":
# Search providers use per-query pricing
from litellm.search import search_provider_cost_per_query
return search_provider_cost_per_query(
model=model,
custom_llm_provider=custom_llm_provider,
number_of_queries=number_of_queries or 1,
optional_params=response._hidden_params if response and hasattr(response, "_hidden_params") else None
)
elif custom_llm_provider == "vertex_ai":
cost_router = google_cost_router(
model=model_without_prefix,
@@ -1094,6 +1105,7 @@ def response_cost_calculator(
LiteLLMRealtimeStreamLoggingObject,
OpenAIModerationResponse,
Response,
SearchResponse,
],
model: str,
custom_llm_provider: Optional[str],
@@ -1114,6 +1126,8 @@ def response_cost_calculator(
"speech",
"rerank",
"arerank",
"search",
"asearch",
],
optional_params: dict,
cache_hit: Optional[bool] = None,

View File

@@ -67,7 +67,7 @@ class ExaAISearchConfig(BaseSearchConfig):
self,
api_base: Optional[str],
optional_params: dict,
data: Optional[dict] = None,
data: Optional[Union[Dict, List[Dict]]] = None,
**kwargs,
) -> str:
"""

View File

@@ -90,7 +90,7 @@ class GooglePSESearchConfig(BaseSearchConfig):
self,
api_base: Optional[str],
optional_params: dict,
data: Optional[dict] = None,
data: Optional[Union[Dict, List[Dict]]] = None,
**kwargs,
) -> str:
"""
@@ -104,7 +104,7 @@ class GooglePSESearchConfig(BaseSearchConfig):
api_base = api_base or get_secret_str("GOOGLE_PSE_API_BASE") or self.GOOGLE_PSE_API_BASE
# Build query parameters from the transformed request body
if data and "_google_pse_params" in data:
if data and isinstance(data, dict) and "_google_pse_params" in data:
params = data["_google_pse_params"]
query_string = urlencode(params)
return f"{api_base}?{query_string}"

View File

@@ -66,7 +66,7 @@ class TavilySearchConfig(BaseSearchConfig):
self,
api_base: Optional[str],
optional_params: dict,
data: Optional[dict] = None,
data: Optional[Union[Dict, List[Dict]]] = None,
**kwargs,
) -> str:
"""

View File

@@ -12,7 +12,7 @@
"max_input_tokens": "max input tokens, if the provider specifies it. if not default to max_tokens",
"max_output_tokens": "max output tokens, if the provider specifies it. if not default to max_tokens",
"max_tokens": "LEGACY parameter. set to max_output_tokens if provider specifies it. IF not set to max_input_tokens, if provider specifies it.",
"mode": "one of: chat, embedding, completion, image_generation, audio_transcription, audio_speech, image_generation, moderation, rerank",
"mode": "one of: chat, embedding, completion, image_generation, audio_transcription, audio_speech, image_generation, moderation, rerank, search",
"output_cost_per_reasoning_token": 0.0,
"output_cost_per_token": 0.0,
"search_context_cost_per_query": {
@@ -6460,6 +6460,11 @@
"source": "https://www.databricks.com/product/pricing/foundation-model-serving",
"supports_tool_choice": true
},
"dataforseo/search": {
"input_cost_per_query": 0.003,
"litellm_provider": "dataforseo",
"mode": "search"
},
"davinci-002": {
"input_cost_per_token": 2e-06,
"litellm_provider": "text-completion-openai",
@@ -7800,6 +7805,31 @@
"output_cost_per_token": 0.0,
"output_vector_size": 2560
},
"exa_ai/search": {
"litellm_provider": "exa_ai",
"mode": "search",
"tiered_pricing": [
{
"input_cost_per_query": 5e-03,
"max_results_range": [
0,
25
]
},
{
"input_cost_per_query": 25e-03,
"max_results_range": [
26,
100
]
}
]
},
"perplexity/search": {
"input_cost_per_query": 5e-03,
"litellm_provider": "perplexity",
"mode": "search"
},
"elevenlabs/scribe_v1": {
"input_cost_per_second": 6.11e-05,
"litellm_provider": "elevenlabs",
@@ -12211,6 +12241,11 @@
"video"
]
},
"google_pse/search": {
"input_cost_per_query": 0.005,
"litellm_provider": "google_pse",
"mode": "search"
},
"global.anthropic.claude-sonnet-4-5-20250929-v1:0": {
"cache_creation_input_token_cost": 3.75e-06,
"cache_read_input_token_cost": 3e-07,
@@ -18802,6 +18837,16 @@
"output_cost_per_token": 1.25e-07,
"source": "https://cloud.google.com/vertex-ai/generative-ai/docs/learn/models#foundation_models"
},
"parallel_ai/search": {
"input_cost_per_query": 0.004,
"litellm_provider": "parallel_ai",
"mode": "search"
},
"parallel_ai/search-pro": {
"input_cost_per_query": 0.009,
"litellm_provider": "parallel_ai",
"mode": "search"
},
"perplexity/codellama-34b-instruct": {
"input_cost_per_token": 3.5e-07,
"litellm_provider": "perplexity",
@@ -19812,6 +19857,16 @@
"mode": "image_generation",
"output_cost_per_pixel": 0.0
},
"tavily/search": {
"input_cost_per_query": 0.008,
"litellm_provider": "tavily",
"mode": "search"
},
"tavily/search-advanced": {
"input_cost_per_query": 0.016,
"litellm_provider": "tavily",
"mode": "search"
},
"text-bison": {
"input_cost_per_character": 2.5e-07,
"litellm_provider": "vertex_ai-text-models",

View File

@@ -1,7 +1,8 @@
"""
LiteLLM Search API module.
"""
from litellm.search.cost_calculator import search_provider_cost_per_query
from litellm.search.main import asearch, search
__all__ = ["search", "asearch"]
__all__ = ["search", "asearch", "search_provider_cost_per_query"]

View File

@@ -0,0 +1,52 @@
"""
Cost calculation for search providers.
"""
from typing import Optional, Tuple

from litellm.utils import get_model_info


def search_provider_cost_per_query(
    model: str,
    custom_llm_provider: Optional[str] = None,
    number_of_queries: int = 1,
    optional_params: Optional[dict] = None,
) -> Tuple[float, float]:
    """
    Calculate cost for search-only providers.

    Returns (input_cost, output_cost) where input_cost = queries * cost_per_query

    Supports tiered pricing based on max_results parameter.

    Args:
        model: Model name (e.g., "exa_ai/search", "tavily/search")
        custom_llm_provider: Provider name (e.g., "exa_ai", "tavily")
        number_of_queries: Number of search queries performed (default: 1)
        optional_params: Optional parameters including max_results for tiered pricing

    Returns:
        Tuple of (input_cost, output_cost) where output_cost is always 0.0
    """
    model_info = get_model_info(model=model, custom_llm_provider=custom_llm_provider)

    # Check for tiered pricing (e.g., Exa AI based on max_results)
    tiered_pricing = model_info.get("tiered_pricing")
    if tiered_pricing and isinstance(tiered_pricing, list):
        max_results = (optional_params or {}).get("max_results", 10)  # default 10 results
        cost_per_query = 0.0
        for tier in tiered_pricing:
            range_min, range_max = tier["max_results_range"]
            if range_min <= max_results <= range_max:
                cost_per_query = tier["input_cost_per_query"]
                break
        else:
            # Fallback to highest tier if out of range
            cost_per_query = tiered_pricing[-1]["input_cost_per_query"]
    else:
        # Simple flat rate
        cost_per_query = float(model_info.get("input_cost_per_query") or 0.0)

    total_cost = number_of_queries * cost_per_query
    return (total_cost, 0.0)  # (input_cost, output_cost)
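The tiered lookup can be exercised standalone. The sketch below reimplements the same selection logic against a dict shaped like the `exa_ai/search` entry in the cost map, rather than calling `get_model_info`:

```python
# Model-info dict shaped like the "exa_ai/search" cost-map entry added in this PR.
EXA_MODEL_INFO = {
    "tiered_pricing": [
        {"input_cost_per_query": 0.005, "max_results_range": [0, 25]},
        {"input_cost_per_query": 0.025, "max_results_range": [26, 100]},
    ]
}


def cost_per_query(model_info: dict, max_results: int = 10) -> float:
    """Resolve the per-query cost: tiered if tiers exist, else flat rate."""
    tiers = model_info.get("tiered_pricing")
    if not tiers:
        return float(model_info.get("input_cost_per_query") or 0.0)
    for tier in tiers:
        lo, hi = tier["max_results_range"]
        if lo <= max_results <= hi:
            return tier["input_cost_per_query"]
    return tiers[-1]["input_cost_per_query"]  # out of range: fall back to highest tier
```

So an Exa search with `max_results=5` resolves to the first tier, while `max_results=50` resolves to the second; a request outside all tiers falls back to the highest tier, mirroring the function above.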

View File

@@ -149,8 +149,9 @@ async def asearch(
return response
except Exception as e:
model_name = f"{search_provider}/search"
raise litellm.exception_type(
model="",
model=model_name,
custom_llm_provider=search_provider,
original_exception=e,
completion_kwargs=local_vars,
@@ -287,8 +288,9 @@ def search(
)
# Pre Call logging
model_name = f"{search_provider}/search"
litellm_logging_obj.update_environment_variables(
model="",
model=model_name,
optional_params=optional_params,
litellm_params={
"litellm_call_id": litellm_call_id,
@@ -313,8 +315,9 @@ def search(
return response
except Exception as e:
model_name = f"{search_provider}/search"
raise litellm.exception_type(
model="",
model=model_name,
custom_llm_provider=search_provider,
original_exception=e,
completion_kwargs=local_vars,

View File

@@ -239,6 +239,8 @@ class CallTypes(str, Enum):
speech = "speech"
rerank = "rerank"
arerank = "arerank"
search = "search"
asearch = "asearch"
arealtime = "_arealtime"
create_batch = "create_batch"
acreate_batch = "acreate_batch"
@@ -321,6 +323,8 @@ CallTypesLiteral = Literal[
"speech",
"rerank",
"arerank",
"search",
"asearch",
"_arealtime",
"create_batch",
"acreate_batch",

View File

@@ -12,7 +12,7 @@
"max_input_tokens": "max input tokens, if the provider specifies it. if not default to max_tokens",
"max_output_tokens": "max output tokens, if the provider specifies it. if not default to max_tokens",
"max_tokens": "LEGACY parameter. set to max_output_tokens if provider specifies it. IF not set to max_input_tokens, if provider specifies it.",
"mode": "one of: chat, embedding, completion, image_generation, audio_transcription, audio_speech, image_generation, moderation, rerank",
"mode": "one of: chat, embedding, completion, image_generation, audio_transcription, audio_speech, image_generation, moderation, rerank, search",
"output_cost_per_reasoning_token": 0.0,
"output_cost_per_token": 0.0,
"search_context_cost_per_query": {
@@ -6460,6 +6460,11 @@
"source": "https://www.databricks.com/product/pricing/foundation-model-serving",
"supports_tool_choice": true
},
"dataforseo/search": {
"input_cost_per_query": 0.003,
"litellm_provider": "dataforseo",
"mode": "search"
},
"davinci-002": {
"input_cost_per_token": 2e-06,
"litellm_provider": "text-completion-openai",
@@ -7800,6 +7805,31 @@
"output_cost_per_token": 0.0,
"output_vector_size": 2560
},
"exa_ai/search": {
"litellm_provider": "exa_ai",
"mode": "search",
"tiered_pricing": [
{
"input_cost_per_query": 5e-03,
"max_results_range": [
0,
25
]
},
{
"input_cost_per_query": 25e-03,
"max_results_range": [
26,
100
]
}
]
},
"perplexity/search": {
"input_cost_per_query": 5e-03,
"litellm_provider": "perplexity",
"mode": "search"
},
"elevenlabs/scribe_v1": {
"input_cost_per_second": 6.11e-05,
"litellm_provider": "elevenlabs",
@@ -12211,6 +12241,11 @@
"video"
]
},
"google_pse/search": {
"input_cost_per_query": 0.005,
"litellm_provider": "google_pse",
"mode": "search"
},
"global.anthropic.claude-sonnet-4-5-20250929-v1:0": {
"cache_creation_input_token_cost": 3.75e-06,
"cache_read_input_token_cost": 3e-07,
@@ -18802,6 +18837,16 @@
"output_cost_per_token": 1.25e-07,
"source": "https://cloud.google.com/vertex-ai/generative-ai/docs/learn/models#foundation_models"
},
"parallel_ai/search": {
"input_cost_per_query": 0.004,
"litellm_provider": "parallel_ai",
"mode": "search"
},
"parallel_ai/search-pro": {
"input_cost_per_query": 0.009,
"litellm_provider": "parallel_ai",
"mode": "search"
},
"perplexity/codellama-34b-instruct": {
"input_cost_per_token": 3.5e-07,
"litellm_provider": "perplexity",
@@ -19812,6 +19857,16 @@
"mode": "image_generation",
"output_cost_per_pixel": 0.0
},
"tavily/search": {
"input_cost_per_query": 0.008,
"litellm_provider": "tavily",
"mode": "search"
},
"tavily/search-advanced": {
"input_cost_per_query": 0.016,
"litellm_provider": "tavily",
"mode": "search"
},
"text-bison": {
"input_cost_per_character": 2.5e-07,
"litellm_provider": "vertex_ai-text-models",

View File

@@ -6,6 +6,7 @@ This follows the same pattern as BaseOCRTest in tests/ocr_tests/base_ocr_unit_te
import pytest
import litellm
from abc import ABC, abstractmethod
import os
import json
@@ -37,6 +38,8 @@ class BaseSearchTest(ABC):
"""
Test basic search functionality with a simple query.
"""
os.environ["LITELLM_LOCAL_MODEL_COST_MAP"] = "True"
litellm.model_cost = litellm.get_model_cost_map(url="")
litellm._turn_on_debug()
search_provider = self.get_search_provider()
print("Search Provider=", search_provider)
@@ -77,6 +80,18 @@ class BaseSearchTest(ABC):
assert len(first_result.url) > 0, "URL should not be empty"
assert len(first_result.snippet) > 0, "Snippet should not be empty"
# Validate cost tracking in _hidden_params
assert hasattr(response, "_hidden_params"), "Response should have '_hidden_params' attribute"
hidden_params = response._hidden_params
assert "response_cost" in hidden_params, "_hidden_params should contain 'response_cost'"
response_cost = hidden_params["response_cost"]
assert response_cost is not None, "response_cost should not be None"
assert isinstance(response_cost, (int, float)), "response_cost should be a number"
assert response_cost >= 0, "response_cost should be non-negative"
print(f"Cost tracking: ${response_cost:.6f}")
except Exception as e:
pytest.fail(f"Search call failed: {str(e)}")