mirror of https://github.com/enricoros/big-AGI.git synced 2026-05-10 21:50:14 -07:00

Files

T

paulhshort 3d93c856ba Fix Azure OpenAI web_search_preview tool incompatibility

Azure OpenAI doesn't support the web_search_preview tool, which was causing
"Hosted tool 'web_search_preview' is not supported" errors with GPT-5 models.

## Changes:
- Pass dialect information to aixToOpenAIResponses function
- Skip web_search_preview tool addition when dialect is 'azure'
- Add logging when web search is skipped for Azure
- Document known Azure limitations in implementation guide

## Impact:
- Fixes web browsing errors with Azure GPT-5 models
- Maintains web search functionality for regular OpenAI models
- Provides clear logging for debugging

This is a critical fix for Azure OpenAI compatibility as web search is not
currently supported on Azure's Responses API implementation.

🤖 Generated with [Claude Code](https://claude.ai/code)

Co-Authored-By: Claude <noreply@anthropic.com>

2025-08-11 20:07:19 -04:00

7.3 KiB

Raw Blame History

Azure OpenAI Fix Implementation - Issue #828

Summary

This implementation fixes the Azure OpenAI "Resource Not Found" errors for newly added models (GPT-5, o3 Pro) by addressing two critical issues:

Host Normalization: Prevents malformed URLs when client configuration includes paths/queries
API Paradigm Support: Properly handles Azure's next-generation v1 Responses API

Changes Made

1. Core Router Fix (`src/modules/llms/server/openai/openai.router.ts`)

Host Normalization

Precedence Change: Server environment variable (AZURE_OPENAI_API_ENDPOINT) now takes precedence over client-provided host
URL Sanitization: Strips any path/query from the host URL to prevent malformed concatenations
Error Handling: Added proper URL validation with clear error messages

API Paradigm Support

Dual Mode Support: Implemented support for both traditional (deployment-based) and next-gen v1 APIs
Smart Routing: Special handling for /v1/responses endpoint based on configuration
Configurable: Can switch between paradigms via environment variables

API Version Management

Centralized API version constants with environment variable overrides:
- AZURE_RESPONSES_API_VERSION (default: 'preview' for v1 API)
- AZURE_CHAT_API_VERSION (default: '2025-02-01-preview')
- AZURE_DEPLOYMENTS_API_VERSION (default: '2023-03-15-preview')
Added AZURE_API_V1 flag to explicitly enable next-gen v1 API

2. Environment Configuration (`src/server/env.ts`)

Added new optional environment variables:

AZURE_API_V1: z.string().optional(), // 'true' to enable next-gen v1 API
AZURE_RESPONSES_API_VERSION: z.string().optional(), // 'preview' for v1
AZURE_CHAT_API_VERSION: z.string().optional(), 
AZURE_DEPLOYMENTS_API_VERSION: z.string().optional(),

How It Works

URL Construction Logic

The fixed Azure case now follows this logic:

Get and normalize the base URL:

const azureHostRaw = env.AZURE_OPENAI_API_ENDPOINT || access.oaiHost || '';
const azureBase = new URL(fixupHost(azureHostRaw, apiPath)).origin;

Determine API paradigm:

const useV1API = AZURE_API_V1_ENABLED || AZURE_RESPONSES_API_VERSION === 'preview';

Route based on endpoint and paradigm:
- Responses API (v1): /openai/v1/responses?api-version=preview
- Responses API (traditional): /openai/deployments/{deployment}/responses?api-version=2025-04-01-preview
- Chat Completions: /openai/deployments/{deployment}/chat/completions?api-version=2025-02-01-preview
- Model Listing: /openai/deployments?api-version=2023-03-15-preview

Configuration Options

Default Configuration (Recommended)

Uses next-gen v1 API for Responses (Azure's recommended approach):

# No additional configuration needed - defaults to v1 for Responses

Traditional API Mode

Force traditional deployment-based API for all endpoints:

AZURE_RESPONSES_API_VERSION=2025-04-01-preview

Explicit v1 Mode

Enable v1 API explicitly:

AZURE_API_V1=true

Custom API Versions

Override specific API versions:

AZURE_CHAT_API_VERSION=2025-02-01-preview
AZURE_DEPLOYMENTS_API_VERSION=2025-02-01-preview

Testing Results

Tested with Azure OpenAI endpoint https://grid4openai.openai.azure.com:

Endpoint	Status	Notes
List Deployments	✅ Success	Found 21 deployments
Chat Completions (Traditional)	✅ Success*	*GPT-5 requires `max_completion_tokens` instead of `max_tokens`
Responses API (v1)	✅ Success	Works with `/openai/v1/responses?api-version=preview`
Responses API (Traditional)	❌ 404	Azure doesn't support deployment-based Responses API

Migration Guide

For Users

No action required if using server environment variables (Vercel, etc.)
If using client configuration: Ensure the Azure endpoint field contains only the base URL (e.g., https://yourinstance.openai.azure.com) without any paths or query parameters

For Developers

The fix defaults to using the next-gen v1 API for Responses
Traditional chat/completions continue to use deployment-based endpoints
Add console logging is included to debug which API paradigm is being used
Environment variables provide full control over API versions and paradigms

Known Limitations

Azure OpenAI Specific Limitations

Web Search Tool: Azure OpenAI doesn't currently support the web_search_preview tool
- The fix automatically disables web search for Azure deployments
- Logs when web search is skipped: [Azure] Skipping web_search_preview tool - not supported on Azure OpenAI
- This affects all Azure models using the Responses API (GPT-5, o-series)
GPT-5 Model Constraints:
- No temperature control (only default value of 1.0 supported)
- Must use max_completion_tokens instead of max_tokens
- These constraints are already handled in the existing code
Image Generation: Multi-turn editing and streaming not yet supported on Azure
File Upload: Images can't be uploaded as files and referenced as input (coming soon)

Benefits

Fixes Issue #828: Resolves "Resource Not Found" errors for GPT-5 and o3 Pro models
Handles Azure Limitations: Automatically disables unsupported features like web search
Future-Proof: Supports Azure's next-generation v1 API
Backward Compatible: Maintains support for existing deployments
Robust: Prevents malformed URLs from client misconfigurations
Configurable: Full control over API paradigms and versions
Debuggable: Includes logging for troubleshooting

Technical Details

Why Two API Paradigms?

Azure OpenAI offers two API styles:

Traditional (Deployment-based):
- Path: /openai/deployments/{deployment-name}/{operation}
- Used for: Chat completions, embeddings, etc.
- API Version: Dated versions like 2025-02-01-preview
Next-Generation v1:
- Path: /openai/v1/{operation}
- Used for: Responses API (new)
- API Version: preview (always latest)
- Benefits: Simpler, more OpenAI-compatible

The Root Cause

The original code treated all /v1/* endpoints identically, converting:

/v1/responses → /openai/deployments/{model}/responses

But Azure's Responses API requires:

/v1/responses → /openai/v1/responses (no deployment in path)

The Solution

Detect Responses API: Check if apiPath === '/v1/responses'
Apply correct routing: Use v1 pattern for Responses, traditional for others
Normalize hosts: Prevent path/query pollution in base URLs
Prioritize server config: Trust server environment over client input

src/modules/llms/server/openai/openai.router.ts: Main fix implementation
src/server/env.ts: Environment variable definitions
test-azure-api.js: Validation test script
docs/azure-openai-issue-828-corrected-analysis.md: Detailed analysis
docs/azure-openai-fix-implementation.md: This document

References

GitHub Issue: https://github.com/enricoros/big-AGI/issues/828
Azure Responses API: https://learn.microsoft.com/en-us/azure/ai-foundry/openai/how-to/responses
Azure API Evolution: https://learn.microsoft.com/en-us/azure/ai-foundry/openai/api-version-lifecycle

7.3 KiB Raw Blame History