Azure: remove description of the fix for #828, now it's merged

Enrico Ros
2025-09-12 14:06:13 -07:00
parent aa441b0656
commit a4ad1e8295
3 changed files with 0 additions and 386 deletions
@@ -1,188 +0,0 @@
# Azure OpenAI Fix Implementation - Issue #828
## Summary
This implementation fixes the Azure OpenAI "Resource Not Found" errors for newly added models (GPT-5, o3 Pro) by addressing two critical issues:
1. **Host Normalization**: Prevents malformed URLs when client configuration includes paths/queries
2. **API Paradigm Support**: Properly handles Azure's next-generation v1 Responses API
## Changes Made
### 1. Core Router Fix (`src/modules/llms/server/openai/openai.router.ts`)
#### Host Normalization
- **Precedence Change**: Server environment variable (`AZURE_OPENAI_API_ENDPOINT`) now takes precedence over client-provided host
- **URL Sanitization**: Strips any path/query from the host URL to prevent malformed concatenations
- **Error Handling**: Added proper URL validation with clear error messages
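The effect of this sanitization can be sketched as follows (a minimal illustration using the standard `URL` API; the repo's `fixupHost` helper performs additional normalization before this step):

```typescript
// Sketch: reduce a client-provided Azure host to its origin (scheme + host),
// discarding any accidental path or query string.
function normalizeAzureHost(rawHost: string): string {
  try {
    return new URL(rawHost).origin;
  } catch {
    throw new Error(`Invalid Azure OpenAI endpoint URL: ${rawHost}`);
  }
}

// A misconfigured client value keeps only the base URL:
normalizeAzureHost('https://myinstance.openai.azure.com/openai/deployments?api-version=preview');
// → 'https://myinstance.openai.azure.com'
```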
#### API Paradigm Support
- **Dual Mode Support**: Implemented support for both traditional (deployment-based) and next-gen v1 APIs
- **Smart Routing**: Special handling for `/v1/responses` endpoint based on configuration
- **Configurable**: Can switch between paradigms via environment variables
#### API Version Management
- Centralized API version constants with environment variable overrides:
- `AZURE_RESPONSES_API_VERSION` (default: 'preview' for v1 API)
- `AZURE_CHAT_API_VERSION` (default: '2025-02-01-preview')
- `AZURE_DEPLOYMENTS_API_VERSION` (default: '2023-03-15-preview')
- Added `AZURE_API_V1` flag to explicitly enable next-gen v1 API
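The resolution of these constants can be sketched as a small helper (a sketch only; the actual repo reads the variables through its validated env schema in `src/server/env.ts`):

```typescript
// Sketch: resolve the Azure API versions listed above, with env overrides.
// Passing the env as a parameter keeps the helper testable.
function resolveAzureVersions(env: Record<string, string | undefined>) {
  return {
    responses: env.AZURE_RESPONSES_API_VERSION || 'preview',
    chat: env.AZURE_CHAT_API_VERSION || '2025-02-01-preview',
    deployments: env.AZURE_DEPLOYMENTS_API_VERSION || '2023-03-15-preview',
    useV1: env.AZURE_API_V1 === 'true',
  };
}
```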
### 2. Environment Configuration (`src/server/env.ts`)
Added new optional environment variables:
```typescript
AZURE_API_V1: z.string().optional(), // 'true' to enable next-gen v1 API
AZURE_RESPONSES_API_VERSION: z.string().optional(), // 'preview' for v1
AZURE_CHAT_API_VERSION: z.string().optional(),
AZURE_DEPLOYMENTS_API_VERSION: z.string().optional(),
```
## How It Works
### URL Construction Logic
The fixed Azure case now follows this logic:
1. **Get and normalize the base URL**:
```typescript
const azureHostRaw = env.AZURE_OPENAI_API_ENDPOINT || access.oaiHost || '';
const azureBase = new URL(fixupHost(azureHostRaw, apiPath)).origin;
```
2. **Determine API paradigm**:
```typescript
const useV1API = AZURE_API_V1_ENABLED || AZURE_RESPONSES_API_VERSION === 'preview';
```
3. **Route based on endpoint and paradigm**:
- **Responses API (v1)**: `/openai/v1/responses?api-version=preview`
- **Responses API (traditional)**: `/openai/deployments/{deployment}/responses?api-version=2025-04-01-preview`
- **Chat Completions**: `/openai/deployments/{deployment}/chat/completions?api-version=2025-02-01-preview`
- **Model Listing**: `/openai/deployments?api-version=2023-03-15-preview`
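The routing table above can be condensed into a single dispatch function (a simplified sketch, not the repo's exact code; the `/v1/models` mapping to the deployments listing is an assumption for illustration):

```typescript
// Sketch: map an OpenAI-style apiPath onto the Azure path + api-version,
// following the routing table above.
function azureApiPath(apiPath: string, deployment: string, useV1API: boolean): string {
  if (apiPath === '/v1/responses')
    return useV1API
      ? '/openai/v1/responses?api-version=preview'
      : `/openai/deployments/${deployment}/responses?api-version=2025-04-01-preview`;
  if (apiPath === '/v1/chat/completions')
    return `/openai/deployments/${deployment}/chat/completions?api-version=2025-02-01-preview`;
  if (apiPath === '/v1/models') // assumed mapping: model listing → deployments endpoint
    return '/openai/deployments?api-version=2023-03-15-preview';
  throw new Error('Azure OpenAI API path not supported: ' + apiPath);
}
```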
### Configuration Options
#### Default Configuration (Recommended)
Uses next-gen v1 API for Responses (Azure's recommended approach):
```env
# No additional configuration needed - defaults to v1 for Responses
```
#### Traditional API Mode
Force traditional deployment-based API for all endpoints:
```env
AZURE_RESPONSES_API_VERSION=2025-04-01-preview
```
#### Explicit v1 Mode
Enable v1 API explicitly:
```env
AZURE_API_V1=true
```
#### Custom API Versions
Override specific API versions:
```env
AZURE_CHAT_API_VERSION=2025-02-01-preview
AZURE_DEPLOYMENTS_API_VERSION=2025-02-01-preview
```
## Testing Results
Tested with Azure OpenAI endpoint `https://grid4openai.openai.azure.com`:
| Endpoint | Status | Notes |
|----------|--------|-------|
| List Deployments | ✅ Success | Found 21 deployments |
| Chat Completions (Traditional) | ✅ Success* | *GPT-5 requires `max_completion_tokens` instead of `max_tokens` |
| Responses API (v1) | ✅ Success | Works with `/openai/v1/responses?api-version=preview` |
| Responses API (Traditional) | ❌ 404 | Azure doesn't support deployment-based Responses API |
## Migration Guide
### For Users
1. **No action required** if using server environment variables (Vercel, etc.)
2. **If using client configuration**: Ensure the Azure endpoint field contains only the base URL (e.g., `https://yourinstance.openai.azure.com`) without any paths or query parameters
### For Developers
1. The fix defaults to using the next-gen v1 API for Responses
2. Traditional chat/completions continue to use deployment-based endpoints
3. Console logging is included to show which API paradigm is being used
4. Environment variables provide full control over API versions and paradigms
## Known Limitations
### Azure OpenAI Specific Limitations
1. **Web Search Tool**: Azure OpenAI doesn't currently support the `web_search_preview` tool
- The fix automatically disables web search for Azure deployments
- Logs when web search is skipped: `[Azure] Skipping web_search_preview tool - not supported on Azure OpenAI`
- This affects all Azure models using the Responses API (GPT-5, o-series)
2. **GPT-5 Model Constraints**:
- No temperature control (only default value of 1.0 supported)
- Must use `max_completion_tokens` instead of `max_tokens`
- These constraints are already handled in the existing code
3. **Image Generation**: Multi-turn editing and streaming not yet supported on Azure
4. **File Upload**: Images can't be uploaded as files and referenced as input (coming soon)
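The GPT-5 parameter constraints in item 2 can be sketched as a request adapter (a hypothetical helper for illustration, not the repo's code):

```typescript
// Sketch: adapt chat parameters for GPT-5 models on Azure.
interface ChatParams {
  model: string;
  max_tokens?: number;
  max_completion_tokens?: number;
  temperature?: number;
}

function adaptForGpt5(params: ChatParams): ChatParams {
  if (!params.model.toLowerCase().includes('gpt-5')) return params;
  const { max_tokens, temperature, ...rest } = params;
  return {
    ...rest,
    // GPT-5 rejects `max_tokens`; the same budget goes in `max_completion_tokens`
    ...(max_tokens !== undefined ? { max_completion_tokens: max_tokens } : {}),
    // `temperature` is dropped: only the default (1.0) is supported
  };
}
```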
## Benefits
1. **Fixes Issue #828**: Resolves "Resource Not Found" errors for GPT-5 and o3 Pro models
2. **Handles Azure Limitations**: Automatically disables unsupported features like web search
3. **Future-Proof**: Supports Azure's next-generation v1 API
4. **Backward Compatible**: Maintains support for existing deployments
5. **Robust**: Prevents malformed URLs from client misconfigurations
6. **Configurable**: Full control over API paradigms and versions
7. **Debuggable**: Includes logging for troubleshooting
## Technical Details
### Why Two API Paradigms?
Azure OpenAI offers two API styles:
1. **Traditional (Deployment-based)**:
- Path: `/openai/deployments/{deployment-name}/{operation}`
- Used for: Chat completions, embeddings, etc.
- API Version: Dated versions like `2025-02-01-preview`
2. **Next-Generation v1**:
- Path: `/openai/v1/{operation}`
- Used for: Responses API (new)
- API Version: `preview` (always latest)
- Benefits: Simpler, more OpenAI-compatible
### The Root Cause
The original code treated all `/v1/*` endpoints identically, converting:
- `/v1/responses` → `/openai/deployments/{model}/responses`
But Azure's Responses API requires:
- `/v1/responses` → `/openai/v1/responses` (no deployment in path)
### The Solution
1. **Detect Responses API**: Check if `apiPath === '/v1/responses'`
2. **Apply correct routing**: Use v1 pattern for Responses, traditional for others
3. **Normalize hosts**: Prevent path/query pollution in base URLs
4. **Prioritize server config**: Trust server environment over client input
## Related Files
- `src/modules/llms/server/openai/openai.router.ts`: Main fix implementation
- `src/server/env.ts`: Environment variable definitions
- `test-azure-api.js`: Validation test script
- `docs/azure-openai-issue-828-corrected-analysis.md`: Detailed analysis
- `docs/azure-openai-fix-implementation.md`: This document
## References
- GitHub Issue: https://github.com/enricoros/big-AGI/issues/828
- Azure Responses API: https://learn.microsoft.com/en-us/azure/ai-foundry/openai/how-to/responses
- Azure API Evolution: https://learn.microsoft.com/en-us/azure/ai-foundry/openai/api-version-lifecycle
@@ -1,171 +0,0 @@
# Azure OpenAI "Resource Not Found" Issue #828 - Corrected Analysis
## Executive Summary
GitHub issue #828 reports that newly added Azure OpenAI model endpoints (GPT-5, o3 Pro) are failing with "Resource Not Found" errors. The root cause is that the big-AGI codebase incorrectly constructs Azure OpenAI Responses API URLs by treating them like deployment-based endpoints when they should use a different path structure.
## Key Facts
1. **Azure DOES support the Responses API** - Documented at:
- https://learn.microsoft.com/en-us/azure/ai-foundry/openai/how-to/responses
- https://learn.microsoft.com/en-us/azure/ai-foundry/openai/api-version-lifecycle#api-evolution
- https://learn.microsoft.com/en-us/azure/ai-foundry/openai/reference-preview-latest#create-response
2. **User configuration is correct** - The base endpoint `https://grid4openai.openai.azure.com/` is properly configured in Vercel
3. **Models work in Azure AI Foundry** - The deployments function correctly in Azure's playground and other compatible applications
## Root Cause Analysis
### The URL Construction Bug
In `src/modules/llms/server/openai/openai.router.ts` (lines 539-542), the code incorrectly handles the Azure Responses API endpoint:
```typescript
if (apiPath.startsWith('/v1/')) {
if (!modelRefId)
throw new Error('Azure OpenAI API needs a deployment id');
url += `/openai/deployments/${modelRefId}/${apiPath.replace('/v1/', '')}?api-version=2025-02-01-preview`;
}
```
This code treats ALL `/v1/` paths identically, resulting in:
**For `/v1/responses`:**
- **Current (WRONG)**: `https://grid4openai.openai.azure.com/openai/deployments/{model-id}/responses?api-version=2025-02-01-preview`
- **Expected (CORRECT)**: `https://grid4openai.openai.azure.com/openai/v1/responses?api-version=preview`
### Why This Happens
1. When GPT-5 and o-series models use the Responses API (due to having `LLM_IF_OAI_Responses` interface), the system calls `openAIAccess(access, model.id, '/v1/responses')`
2. The Azure case in `openAIAccess` doesn't distinguish between different `/v1/` endpoints
3. The Responses API in Azure doesn't use the `/deployments/{deployment-id}/` path structure - it's a direct endpoint at `/openai/v1/responses`
### Historical Context
Interestingly, commit `1d0a76cd` by Paul Short on Aug 8, 2025 actually attempted to fix this with special handling for GPT-5:
```typescript
if (apiPath === '/v1/responses' && modelRefId?.toLowerCase().includes('gpt-5')) {
url += `/openai/v1/responses?api-version=preview`;
}
```
However, this fix either:
- Was never merged to the main branch
- Was reverted
- Exists in a different branch
The current v2-dev branch doesn't include this fix.
## The Solution
### Primary Fix: Correct URL Construction for Responses API
**File**: `src/modules/llms/server/openai/openai.router.ts` (around line 538)
**Current Code:**
```typescript
let url = azureHost;
if (apiPath.startsWith('/v1/')) {
if (!modelRefId)
throw new Error('Azure OpenAI API needs a deployment id');
url += `/openai/deployments/${modelRefId}/${apiPath.replace('/v1/', '')}?api-version=2025-02-01-preview`;
} else if (apiPath.startsWith('/openai/deployments'))
url += apiPath;
else
throw new Error('Azure OpenAI API path not supported: ' + apiPath);
```
**Fixed Code:**
```typescript
let url = azureHost;
// Special handling for Azure Responses API which doesn't use deployment paths
if (apiPath === '/v1/responses') {
url += `/openai/v1/responses?api-version=preview`;
} else if (apiPath.startsWith('/v1/')) {
if (!modelRefId)
throw new Error('Azure OpenAI API needs a deployment id');
url += `/openai/deployments/${modelRefId}/${apiPath.replace('/v1/', '')}?api-version=2025-02-01-preview`;
} else if (apiPath.startsWith('/openai/deployments'))
url += apiPath;
else
throw new Error('Azure OpenAI API path not supported: ' + apiPath);
```
### Alternative Solution: More Robust Pattern
For better maintainability, consider defining Azure API patterns explicitly:
```typescript
let url = azureHost;
// Define Azure-specific endpoint patterns
const azureDirectEndpoints: Record<string, string> = {
'/v1/responses': '/openai/v1/responses?api-version=preview',
// Add other direct endpoints here as needed
};
// Check for direct endpoints first
if (azureDirectEndpoints[apiPath]) {
url += azureDirectEndpoints[apiPath];
} else if (apiPath.startsWith('/v1/')) {
// Deployment-based endpoints
if (!modelRefId)
throw new Error('Azure OpenAI API needs a deployment id');
url += `/openai/deployments/${modelRefId}/${apiPath.replace('/v1/', '')}?api-version=2025-02-01-preview`;
} else if (apiPath.startsWith('/openai/deployments')) {
url += apiPath;
} else {
throw new Error('Azure OpenAI API path not supported: ' + apiPath);
}
```
## Secondary Issues to Address
### 1. API Version Consistency
The code uses different API versions in different places:
- Model listing: `2023-03-15-preview` (line 166)
- Chat completions: `2025-02-01-preview`
- Responses API: Should use `preview` or `2025-04-01-preview`
Consider updating the model listing to use a more recent API version:
```typescript
// Line 166
const azureOpenAIDeploymentsResponse = await openaiGETOrThrow(access, `/openai/deployments?api-version=2025-02-01-preview`);
```
### 2. Request Body Compatibility
Ensure that the request body generated by `aixToOpenAIResponses` is compatible with Azure's Responses API implementation, which may have slight differences from OpenAI's native implementation.
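As a reference point, a minimal Responses API request body looks like this (shape per OpenAI's Responses API, which Azure's v1 endpoint is designed to mirror; treat the Azure-specific detail below as an assumption to verify against Azure's docs):

```typescript
// Sketch: a minimal Responses API request body.
const responsesBody = {
  model: 'my-gpt-5',     // on Azure v1 this carries the deployment name (assumed)
  input: 'Hello, world', // simplest input form; aixToOpenAIResponses builds richer inputs
};
```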
## Testing Requirements
1. **Test with GPT-5 models** on Azure to verify Responses API works
2. **Test with o3 Pro models** on Azure
3. **Verify existing models** (o4 Mini, GPT-4.1) continue working
4. **Test both streaming and non-streaming** responses
5. **Verify model listing** still works with updated API version
## Implementation Steps
1. **Immediate Fix**:
- Apply the primary fix to correctly construct Responses API URLs
- Test with affected models
2. **Follow-up**:
- Update API versions for consistency
- Add logging to track which API endpoints are being used
- Consider adding configuration to toggle between APIs if needed
3. **Long-term**:
- Monitor Azure OpenAI documentation for API changes
- Consider abstracting Azure-specific logic into a separate module
- Add integration tests for Azure endpoints
## Conclusion
The issue is a straightforward URL construction bug where the Azure Responses API endpoint is being incorrectly formatted as a deployment-based endpoint. The fix is simple: add special handling for the `/v1/responses` path to use the direct endpoint format instead of the deployment-based format. This aligns with Azure's documented API structure and will resolve the "Resource Not Found" errors users are experiencing with GPT-5 and o-series models.