mirror of
https://github.com/enricoros/big-AGI.git
synced 2026-05-10 21:50:14 -07:00
Azure: remove description of the fix for #828, now it's merged
# Azure OpenAI Fix Implementation - Issue #828

## Summary

This implementation fixes the Azure OpenAI "Resource Not Found" errors for newly added models (GPT-5, o3 Pro) by addressing two critical issues:

1. **Host Normalization**: Prevents malformed URLs when the client configuration includes paths or queries
2. **API Paradigm Support**: Properly handles Azure's next-generation v1 Responses API

## Changes Made

### 1. Core Router Fix (`src/modules/llms/server/openai/openai.router.ts`)

#### Host Normalization

- **Precedence Change**: The server environment variable (`AZURE_OPENAI_API_ENDPOINT`) now takes precedence over the client-provided host
- **URL Sanitization**: Strips any path/query from the host URL to prevent malformed concatenations
- **Error Handling**: Added proper URL validation with clear error messages
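The sanitization leans on the WHATWG `URL` API; a minimal sketch (the pasted host below is illustrative, not a real instance):

```typescript
// A client may paste a full deployment URL instead of the bare endpoint.
// `new URL(...).origin` keeps only scheme + host, discarding any path,
// query string, or trailing slash before the Azure path is appended.
const pastedHost = 'https://myinstance.openai.azure.com/openai/deployments/gpt-5?api-version=old';
const cleanBase = new URL(pastedHost).origin;
console.log(cleanBase); // https://myinstance.openai.azure.com
```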
#### API Paradigm Support

- **Dual Mode Support**: Implemented support for both the traditional (deployment-based) and next-gen v1 APIs
- **Smart Routing**: Special handling for the `/v1/responses` endpoint based on configuration
- **Configurable**: Can switch between paradigms via environment variables

#### API Version Management

- Centralized API version constants with environment variable overrides:
  - `AZURE_RESPONSES_API_VERSION` (default: 'preview' for the v1 API)
  - `AZURE_CHAT_API_VERSION` (default: '2025-02-01-preview')
  - `AZURE_DEPLOYMENTS_API_VERSION` (default: '2023-03-15-preview')
- Added an `AZURE_API_V1` flag to explicitly enable the next-gen v1 API

### 2. Environment Configuration (`src/server/env.ts`)

Added new optional environment variables:

```typescript
AZURE_API_V1: z.string().optional(),                  // 'true' to enable next-gen v1 API
AZURE_RESPONSES_API_VERSION: z.string().optional(),   // 'preview' for v1
AZURE_CHAT_API_VERSION: z.string().optional(),
AZURE_DEPLOYMENTS_API_VERSION: z.string().optional(),
```
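Downstream, the router resolves these into effective values; a sketch of how the defaults described above might be applied (variable names mirror the `env.ts` additions, the resolution code itself is illustrative):

```typescript
// Resolve the effective Azure API settings, falling back to the documented
// defaults. (`process.env` on Node; guarded so the snippet runs anywhere.)
const env = (globalThis as any).process?.env ?? {};
const AZURE_API_V1_ENABLED = env.AZURE_API_V1 === 'true';
const AZURE_RESPONSES_API_VERSION = env.AZURE_RESPONSES_API_VERSION || 'preview';
const AZURE_CHAT_API_VERSION = env.AZURE_CHAT_API_VERSION || '2025-02-01-preview';
const AZURE_DEPLOYMENTS_API_VERSION = env.AZURE_DEPLOYMENTS_API_VERSION || '2023-03-15-preview';
```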
## How It Works

### URL Construction Logic

The fixed Azure case now follows this logic:

1. **Get and normalize the base URL**:
   ```typescript
   const azureHostRaw = env.AZURE_OPENAI_API_ENDPOINT || access.oaiHost || '';
   const azureBase = new URL(fixupHost(azureHostRaw, apiPath)).origin;
   ```

2. **Determine the API paradigm**:
   ```typescript
   const useV1API = AZURE_API_V1_ENABLED || AZURE_RESPONSES_API_VERSION === 'preview';
   ```

3. **Route based on endpoint and paradigm**:
   - **Responses API (v1)**: `/openai/v1/responses?api-version=preview`
   - **Responses API (traditional)**: `/openai/deployments/{deployment}/responses?api-version=2025-04-01-preview`
   - **Chat Completions**: `/openai/deployments/{deployment}/chat/completions?api-version=2025-02-01-preview`
   - **Model Listing**: `/openai/deployments?api-version=2023-03-15-preview`
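The routing above can be condensed into a single function; this is an illustrative sketch, not the actual router code (names and structure are assumptions):

```typescript
// Pick the Azure URL from the endpoint path and the API paradigm in use.
function azureUrl(base: string, apiPath: string, deployment: string, useV1API: boolean): string {
  if (apiPath === '/v1/responses')
    return useV1API
      ? `${base}/openai/v1/responses?api-version=preview`
      : `${base}/openai/deployments/${deployment}/responses?api-version=2025-04-01-preview`;
  if (apiPath === '/v1/chat/completions')
    return `${base}/openai/deployments/${deployment}/chat/completions?api-version=2025-02-01-preview`;
  if (apiPath === '/openai/deployments')
    return `${base}/openai/deployments?api-version=2023-03-15-preview`;
  throw new Error('Unsupported Azure path: ' + apiPath);
}
```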
### Configuration Options

#### Default Configuration (Recommended)

Uses the next-gen v1 API for Responses (Azure's recommended approach):

```env
# No additional configuration needed - defaults to v1 for Responses
```

#### Traditional API Mode

Force the traditional deployment-based API for all endpoints:

```env
AZURE_RESPONSES_API_VERSION=2025-04-01-preview
```

#### Explicit v1 Mode

Enable the v1 API explicitly:

```env
AZURE_API_V1=true
```

#### Custom API Versions

Override specific API versions:

```env
AZURE_CHAT_API_VERSION=2025-02-01-preview
AZURE_DEPLOYMENTS_API_VERSION=2025-02-01-preview
```
## Testing Results

Tested with the Azure OpenAI endpoint `https://grid4openai.openai.azure.com`:

| Endpoint | Status | Notes |
|----------|--------|-------|
| List Deployments | ✅ Success | Found 21 deployments |
| Chat Completions (Traditional) | ✅ Success* | *GPT-5 requires `max_completion_tokens` instead of `max_tokens` |
| Responses API (v1) | ✅ Success | Works with `/openai/v1/responses?api-version=preview` |
| Responses API (Traditional) | ❌ 404 | Azure doesn't support the deployment-based Responses API |
## Migration Guide

### For Users

1. **No action required** if using server environment variables (Vercel, etc.)
2. **If using client configuration**: Ensure the Azure endpoint field contains only the base URL (e.g., `https://yourinstance.openai.azure.com`) without any paths or query parameters

### For Developers

1. The fix defaults to using the next-gen v1 API for Responses
2. Traditional chat/completions continue to use deployment-based endpoints
3. Console logging is included to help debug which API paradigm is being used
4. Environment variables provide full control over API versions and paradigms
## Known Limitations

### Azure OpenAI Specific Limitations

1. **Web Search Tool**: Azure OpenAI doesn't currently support the `web_search_preview` tool
   - The fix automatically disables web search for Azure deployments
   - Logs when web search is skipped: `[Azure] Skipping web_search_preview tool - not supported on Azure OpenAI`
   - This affects all Azure models using the Responses API (GPT-5, o-series)

2. **GPT-5 Model Constraints**:
   - No temperature control (only the default value of 1.0 is supported)
   - Must use `max_completion_tokens` instead of `max_tokens`
   - These constraints are already handled in the existing code

3. **Image Generation**: Multi-turn editing and streaming are not yet supported on Azure

4. **File Upload**: Images can't be uploaded as files and referenced as input (coming soon)
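The automatic disabling of web search (item 1 above) amounts to filtering the tools array before the request goes out; a hedged sketch, where the function and the `Tool` type are illustrative rather than the actual implementation:

```typescript
interface Tool { type: string; }

// Drop tools Azure cannot handle, logging each skip for debuggability.
function filterToolsForAzure(tools: Tool[], isAzure: boolean): Tool[] {
  if (!isAzure) return tools;
  return tools.filter((t) => {
    if (t.type === 'web_search_preview') {
      console.log('[Azure] Skipping web_search_preview tool - not supported on Azure OpenAI');
      return false;
    }
    return true;
  });
}
```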
## Benefits

1. **Fixes Issue #828**: Resolves "Resource Not Found" errors for GPT-5 and o3 Pro models
2. **Handles Azure Limitations**: Automatically disables unsupported features like web search
3. **Future-Proof**: Supports Azure's next-generation v1 API
4. **Backward Compatible**: Maintains support for existing deployments
5. **Robust**: Prevents malformed URLs from client misconfigurations
6. **Configurable**: Full control over API paradigms and versions
7. **Debuggable**: Includes logging for troubleshooting
## Technical Details

### Why Two API Paradigms?

Azure OpenAI offers two API styles:

1. **Traditional (Deployment-based)**:
   - Path: `/openai/deployments/{deployment-name}/{operation}`
   - Used for: Chat completions, embeddings, etc.
   - API Version: Dated versions like `2025-02-01-preview`

2. **Next-Generation v1**:
   - Path: `/openai/v1/{operation}`
   - Used for: Responses API (new)
   - API Version: `preview` (always latest)
   - Benefits: Simpler, more OpenAI-compatible
### The Root Cause

The original code treated all `/v1/*` endpoints identically, converting:

- `/v1/responses` → `/openai/deployments/{model}/responses`

But Azure's Responses API requires:

- `/v1/responses` → `/openai/v1/responses` (no deployment in the path)

### The Solution

1. **Detect the Responses API**: Check if `apiPath === '/v1/responses'`
2. **Apply correct routing**: Use the v1 pattern for Responses, traditional for everything else
3. **Normalize hosts**: Prevent path/query pollution in base URLs
4. **Prioritize server config**: Trust the server environment over client input
## Related Files

- `src/modules/llms/server/openai/openai.router.ts`: Main fix implementation
- `src/server/env.ts`: Environment variable definitions
- `test-azure-api.js`: Validation test script
- `docs/azure-openai-issue-828-corrected-analysis.md`: Detailed analysis
- `docs/azure-openai-fix-implementation.md`: This document

## References

- GitHub Issue: https://github.com/enricoros/big-AGI/issues/828
- Azure Responses API: https://learn.microsoft.com/en-us/azure/ai-foundry/openai/how-to/responses
- Azure API Evolution: https://learn.microsoft.com/en-us/azure/ai-foundry/openai/api-version-lifecycle
# Azure OpenAI "Resource Not Found" Issue #828 - Corrected Analysis

## Executive Summary

GitHub issue #828 reports that newly added Azure OpenAI model endpoints (GPT-5, o3 Pro) are failing with "Resource Not Found" errors. The root cause is that the big-AGI codebase incorrectly constructs Azure OpenAI Responses API URLs by treating them like deployment-based endpoints when they should use a different path structure.

## Key Facts

1. **Azure DOES support the Responses API** - Documented at:
   - https://learn.microsoft.com/en-us/azure/ai-foundry/openai/how-to/responses
   - https://learn.microsoft.com/en-us/azure/ai-foundry/openai/api-version-lifecycle#api-evolution
   - https://learn.microsoft.com/en-us/azure/ai-foundry/openai/reference-preview-latest#create-response

2. **The user configuration is correct** - The base endpoint `https://grid4openai.openai.azure.com/` is properly configured in Vercel

3. **The models work in Azure AI Foundry** - The deployments function correctly in Azure's playground and other compatible applications

## Root Cause Analysis

### The URL Construction Bug

In `src/modules/llms/server/openai/openai.router.ts` (lines 539-542), the code incorrectly handles the Azure Responses API endpoint:

```typescript
if (apiPath.startsWith('/v1/')) {
  if (!modelRefId)
    throw new Error('Azure OpenAI API needs a deployment id');
  url += `/openai/deployments/${modelRefId}/${apiPath.replace('/v1/', '')}?api-version=2025-02-01-preview`;
}
```
This code treats ALL `/v1/` paths identically, resulting in:

**For `/v1/responses`:**

- **Current (WRONG)**: `https://grid4openai.openai.azure.com/openai/deployments/{model-id}/responses?api-version=2025-02-01-preview`
- **Expected (CORRECT)**: `https://grid4openai.openai.azure.com/openai/v1/responses?api-version=preview`

### Why This Happens

1. When GPT-5 and o-series models use the Responses API (due to having the `LLM_IF_OAI_Responses` interface), the system calls `openAIAccess(access, model.id, '/v1/responses')`

2. The Azure case in `openAIAccess` doesn't distinguish between different `/v1/` endpoints

3. The Responses API in Azure doesn't use the `/deployments/{deployment-id}/` path structure - it's a direct endpoint at `/openai/v1/responses`
### Historical Context

Interestingly, commit `1d0a76cd` by Paul Short on Aug 8, 2025 actually attempted to fix this with special handling for GPT-5:

```typescript
if (apiPath === '/v1/responses' && modelRefId?.toLowerCase().includes('gpt-5')) {
  url += `/openai/v1/responses?api-version=preview`;
}
```

However, this fix either:

- Was never merged to the main branch
- Was reverted
- Exists in a different branch

The current v2-dev branch doesn't include this fix.
## The Solution

### Primary Fix: Correct URL Construction for the Responses API

**File**: `src/modules/llms/server/openai/openai.router.ts` (around line 538)

**Current Code:**

```typescript
let url = azureHost;
if (apiPath.startsWith('/v1/')) {
  if (!modelRefId)
    throw new Error('Azure OpenAI API needs a deployment id');
  url += `/openai/deployments/${modelRefId}/${apiPath.replace('/v1/', '')}?api-version=2025-02-01-preview`;
} else if (apiPath.startsWith('/openai/deployments'))
  url += apiPath;
else
  throw new Error('Azure OpenAI API path not supported: ' + apiPath);
```

**Fixed Code:**

```typescript
let url = azureHost;
// Special handling for the Azure Responses API, which doesn't use deployment paths
if (apiPath === '/v1/responses') {
  url += `/openai/v1/responses?api-version=preview`;
} else if (apiPath.startsWith('/v1/')) {
  if (!modelRefId)
    throw new Error('Azure OpenAI API needs a deployment id');
  url += `/openai/deployments/${modelRefId}/${apiPath.replace('/v1/', '')}?api-version=2025-02-01-preview`;
} else if (apiPath.startsWith('/openai/deployments'))
  url += apiPath;
else
  throw new Error('Azure OpenAI API path not supported: ' + apiPath);
```
### Alternative Solution: More Robust Pattern

For better maintainability, consider defining the Azure API patterns explicitly:

```typescript
let url = azureHost;

// Define Azure-specific direct (non-deployment) endpoint patterns
const azureDirectEndpoints: Record<string, string> = {
  '/v1/responses': '/openai/v1/responses?api-version=preview',
  // Add other direct endpoints here as needed
};

// Check for direct endpoints first
if (azureDirectEndpoints[apiPath]) {
  url += azureDirectEndpoints[apiPath];
} else if (apiPath.startsWith('/v1/')) {
  // Deployment-based endpoints
  if (!modelRefId)
    throw new Error('Azure OpenAI API needs a deployment id');
  url += `/openai/deployments/${modelRefId}/${apiPath.replace('/v1/', '')}?api-version=2025-02-01-preview`;
} else if (apiPath.startsWith('/openai/deployments')) {
  url += apiPath;
} else {
  throw new Error('Azure OpenAI API path not supported: ' + apiPath);
}
```
## Secondary Issues to Address
|
||||
|
||||
### 1. API Version Consistency
|
||||
|
||||
The code uses different API versions in different places:
|
||||
- Model listing: `2023-03-15-preview` (line 166)
|
||||
- Chat completions: `2025-02-01-preview`
|
||||
- Responses API: Should use `preview` or `2025-04-01-preview`
|
||||
|
||||
Consider updating the model listing to use a more recent API version:
|
||||
```typescript
|
||||
// Line 166
|
||||
const azureOpenAIDeploymentsResponse = await openaiGETOrThrow(access, `/openai/deployments?api-version=2025-02-01-preview`);
|
||||
```
|
||||
|
||||
### 2. Request Body Compatibility
|
||||
|
||||
Ensure that the request body generated by `aixToOpenAIResponses` is compatible with Azure's Responses API implementation, which may have slight differences from OpenAI's native implementation.
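As a starting point for that comparison, a minimal Responses API request body looks roughly like this (field names follow the OpenAI Responses API, which Azure's v1 surface is designed to mirror; the model name and prompt are illustrative):

```typescript
// Minimal Responses API payload; on Azure's v1 surface, `model` is the
// deployment name rather than the upstream model id.
const requestBody = {
  model: 'gpt-5',
  input: 'Hello from big-AGI',
  max_output_tokens: 128,
};
```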
## Testing Requirements

1. **Test with GPT-5 models** on Azure to verify that the Responses API works
2. **Test with o3 Pro models** on Azure
3. **Verify existing models** (o4 Mini, GPT-4.1) continue working
4. **Test both streaming and non-streaming** responses
5. **Verify model listing** still works with the updated API version

## Implementation Steps

1. **Immediate Fix**:
   - Apply the primary fix to correctly construct Responses API URLs
   - Test with the affected models

2. **Follow-up**:
   - Update API versions for consistency
   - Add logging to track which API endpoints are being used
   - Consider adding configuration to toggle between APIs if needed

3. **Long-term**:
   - Monitor Azure OpenAI documentation for API changes
   - Consider abstracting Azure-specific logic into a separate module
   - Add integration tests for Azure endpoints

## Conclusion

The issue is a straightforward URL construction bug: the Azure Responses API endpoint is being incorrectly formatted as a deployment-based endpoint. The fix is simple - add special handling for the `/v1/responses` path to use the direct endpoint format instead of the deployment-based format. This aligns with Azure's documented API structure and will resolve the "Resource Not Found" errors users are experiencing with GPT-5 and o-series models.