mirror of
https://github.com/enricoros/big-AGI.git
synced 2026-05-10 21:50:14 -07:00
Azure: remove description of the fix for #828, now it's merged
# Azure OpenAI Fix Implementation - Issue #828

## Summary

This implementation fixes the Azure OpenAI "Resource Not Found" errors for newly added models (GPT-5, o3 Pro) by addressing two critical issues:

1. **Host Normalization**: Prevents malformed URLs when the client configuration includes paths or queries
2. **API Paradigm Support**: Properly handles Azure's next-generation v1 Responses API

## Changes Made

### 1. Core Router Fix (`src/modules/llms/server/openai/openai.router.ts`)

#### Host Normalization

- **Precedence Change**: The server environment variable (`AZURE_OPENAI_API_ENDPOINT`) now takes precedence over the client-provided host
- **URL Sanitization**: Strips any path/query from the host URL to prevent malformed concatenations
- **Error Handling**: Added proper URL validation with clear error messages
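The sanitization leans on the WHATWG `URL` API; a minimal sketch (the pasted host below is illustrative, not a real instance):

```typescript
// A client may paste a full deployment URL instead of the bare endpoint.
// `new URL(...).origin` keeps only scheme + host, discarding any path,
// query string, or trailing slash before the Azure path is appended.
const pastedHost = 'https://myinstance.openai.azure.com/openai/deployments/gpt-5?api-version=old';
const cleanBase = new URL(pastedHost).origin;
console.log(cleanBase); // https://myinstance.openai.azure.com
```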
#### API Paradigm Support

- **Dual Mode Support**: Implemented support for both the traditional (deployment-based) and next-gen v1 APIs
- **Smart Routing**: Special handling for the `/v1/responses` endpoint based on configuration
- **Configurable**: Can switch between paradigms via environment variables

#### API Version Management

- Centralized API version constants with environment variable overrides:
  - `AZURE_RESPONSES_API_VERSION` (default: 'preview' for the v1 API)
  - `AZURE_CHAT_API_VERSION` (default: '2025-02-01-preview')
  - `AZURE_DEPLOYMENTS_API_VERSION` (default: '2023-03-15-preview')
- Added an `AZURE_API_V1` flag to explicitly enable the next-gen v1 API

### 2. Environment Configuration (`src/server/env.ts`)

Added new optional environment variables:

```typescript
AZURE_API_V1: z.string().optional(),                  // 'true' to enable next-gen v1 API
AZURE_RESPONSES_API_VERSION: z.string().optional(),   // 'preview' for v1
AZURE_CHAT_API_VERSION: z.string().optional(),
AZURE_DEPLOYMENTS_API_VERSION: z.string().optional(),
```
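Downstream, the router resolves these into effective values; a sketch of how the defaults described above might be applied (variable names mirror the `env.ts` additions, the resolution code itself is illustrative):

```typescript
// Resolve the effective Azure API settings, falling back to the documented
// defaults. (`process.env` on Node; guarded so the snippet runs anywhere.)
const env = (globalThis as any).process?.env ?? {};
const AZURE_API_V1_ENABLED = env.AZURE_API_V1 === 'true';
const AZURE_RESPONSES_API_VERSION = env.AZURE_RESPONSES_API_VERSION || 'preview';
const AZURE_CHAT_API_VERSION = env.AZURE_CHAT_API_VERSION || '2025-02-01-preview';
const AZURE_DEPLOYMENTS_API_VERSION = env.AZURE_DEPLOYMENTS_API_VERSION || '2023-03-15-preview';
```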
## How It Works

### URL Construction Logic

The fixed Azure case now follows this logic:

1. **Get and normalize the base URL**:
   ```typescript
   const azureHostRaw = env.AZURE_OPENAI_API_ENDPOINT || access.oaiHost || '';
   const azureBase = new URL(fixupHost(azureHostRaw, apiPath)).origin;
   ```

2. **Determine the API paradigm**:
   ```typescript
   const useV1API = AZURE_API_V1_ENABLED || AZURE_RESPONSES_API_VERSION === 'preview';
   ```

3. **Route based on endpoint and paradigm**:
   - **Responses API (v1)**: `/openai/v1/responses?api-version=preview`
   - **Responses API (traditional)**: `/openai/deployments/{deployment}/responses?api-version=2025-04-01-preview`
   - **Chat Completions**: `/openai/deployments/{deployment}/chat/completions?api-version=2025-02-01-preview`
   - **Model Listing**: `/openai/deployments?api-version=2023-03-15-preview`
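The routing above can be condensed into a single function; this is an illustrative sketch, not the actual router code (names and structure are assumptions):

```typescript
// Pick the Azure URL from the endpoint path and the API paradigm in use.
function azureUrl(base: string, apiPath: string, deployment: string, useV1API: boolean): string {
  if (apiPath === '/v1/responses')
    return useV1API
      ? `${base}/openai/v1/responses?api-version=preview`
      : `${base}/openai/deployments/${deployment}/responses?api-version=2025-04-01-preview`;
  if (apiPath === '/v1/chat/completions')
    return `${base}/openai/deployments/${deployment}/chat/completions?api-version=2025-02-01-preview`;
  if (apiPath === '/openai/deployments')
    return `${base}/openai/deployments?api-version=2023-03-15-preview`;
  throw new Error('Unsupported Azure path: ' + apiPath);
}
```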
### Configuration Options

#### Default Configuration (Recommended)

Uses the next-gen v1 API for Responses (Azure's recommended approach):

```env
# No additional configuration needed - defaults to v1 for Responses
```

#### Traditional API Mode

Force the traditional deployment-based API for all endpoints:

```env
AZURE_RESPONSES_API_VERSION=2025-04-01-preview
```

#### Explicit v1 Mode

Enable the v1 API explicitly:

```env
AZURE_API_V1=true
```

#### Custom API Versions

Override specific API versions:

```env
AZURE_CHAT_API_VERSION=2025-02-01-preview
AZURE_DEPLOYMENTS_API_VERSION=2025-02-01-preview
```
## Testing Results

Tested with the Azure OpenAI endpoint `https://grid4openai.openai.azure.com`:

| Endpoint | Status | Notes |
|----------|--------|-------|
| List Deployments | ✅ Success | Found 21 deployments |
| Chat Completions (Traditional) | ✅ Success* | *GPT-5 requires `max_completion_tokens` instead of `max_tokens` |
| Responses API (v1) | ✅ Success | Works with `/openai/v1/responses?api-version=preview` |
| Responses API (Traditional) | ❌ 404 | Azure doesn't support the deployment-based Responses API |
## Migration Guide

### For Users

1. **No action required** if using server environment variables (Vercel, etc.)
2. **If using client configuration**: Ensure the Azure endpoint field contains only the base URL (e.g., `https://yourinstance.openai.azure.com`) without any paths or query parameters

### For Developers

1. The fix defaults to using the next-gen v1 API for Responses
2. Traditional chat/completions continue to use deployment-based endpoints
3. Console logging is included to help debug which API paradigm is being used
4. Environment variables provide full control over API versions and paradigms
## Known Limitations

### Azure OpenAI Specific Limitations

1. **Web Search Tool**: Azure OpenAI doesn't currently support the `web_search_preview` tool
   - The fix automatically disables web search for Azure deployments
   - Logs when web search is skipped: `[Azure] Skipping web_search_preview tool - not supported on Azure OpenAI`
   - This affects all Azure models using the Responses API (GPT-5, o-series)

2. **GPT-5 Model Constraints**:
   - No temperature control (only the default value of 1.0 is supported)
   - Must use `max_completion_tokens` instead of `max_tokens`
   - These constraints are already handled in the existing code

3. **Image Generation**: Multi-turn editing and streaming are not yet supported on Azure

4. **File Upload**: Images can't be uploaded as files and referenced as input (coming soon)
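The automatic disabling of web search (item 1 above) amounts to filtering the tools array before the request goes out; a hedged sketch, where the function and the `Tool` type are illustrative rather than the actual implementation:

```typescript
interface Tool { type: string; }

// Drop tools Azure cannot handle, logging each skip for debuggability.
function filterToolsForAzure(tools: Tool[], isAzure: boolean): Tool[] {
  if (!isAzure) return tools;
  return tools.filter((t) => {
    if (t.type === 'web_search_preview') {
      console.log('[Azure] Skipping web_search_preview tool - not supported on Azure OpenAI');
      return false;
    }
    return true;
  });
}
```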
## Benefits

1. **Fixes Issue #828**: Resolves "Resource Not Found" errors for GPT-5 and o3 Pro models
2. **Handles Azure Limitations**: Automatically disables unsupported features like web search
3. **Future-Proof**: Supports Azure's next-generation v1 API
4. **Backward Compatible**: Maintains support for existing deployments
5. **Robust**: Prevents malformed URLs from client misconfigurations
6. **Configurable**: Full control over API paradigms and versions
7. **Debuggable**: Includes logging for troubleshooting
## Technical Details

### Why Two API Paradigms?

Azure OpenAI offers two API styles:

1. **Traditional (Deployment-based)**:
   - Path: `/openai/deployments/{deployment-name}/{operation}`
   - Used for: Chat completions, embeddings, etc.
   - API Version: Dated versions like `2025-02-01-preview`

2. **Next-Generation v1**:
   - Path: `/openai/v1/{operation}`
   - Used for: Responses API (new)
   - API Version: `preview` (always latest)
   - Benefits: Simpler, more OpenAI-compatible
### The Root Cause

The original code treated all `/v1/*` endpoints identically, converting:

- `/v1/responses` → `/openai/deployments/{model}/responses`

But Azure's Responses API requires:

- `/v1/responses` → `/openai/v1/responses` (no deployment in the path)

### The Solution

1. **Detect the Responses API**: Check if `apiPath === '/v1/responses'`
2. **Apply correct routing**: Use the v1 pattern for Responses, traditional for everything else
3. **Normalize hosts**: Prevent path/query pollution in base URLs
4. **Prioritize server config**: Trust the server environment over client input
## Related Files

- `src/modules/llms/server/openai/openai.router.ts`: Main fix implementation
- `src/server/env.ts`: Environment variable definitions
- `test-azure-api.js`: Validation test script
- `docs/azure-openai-issue-828-corrected-analysis.md`: Detailed analysis
- `docs/azure-openai-fix-implementation.md`: This document

## References

- GitHub Issue: https://github.com/enricoros/big-AGI/issues/828
- Azure Responses API: https://learn.microsoft.com/en-us/azure/ai-foundry/openai/how-to/responses
- Azure API Evolution: https://learn.microsoft.com/en-us/azure/ai-foundry/openai/api-version-lifecycle
# Azure OpenAI "Resource Not Found" Issue #828 - Corrected Analysis

## Executive Summary

GitHub issue #828 reports that newly added Azure OpenAI model endpoints (GPT-5, o3 Pro) are failing with "Resource Not Found" errors. The root cause is that the big-AGI codebase incorrectly constructs Azure OpenAI Responses API URLs by treating them like deployment-based endpoints when they should use a different path structure.

## Key Facts

1. **Azure DOES support the Responses API** - Documented at:
   - https://learn.microsoft.com/en-us/azure/ai-foundry/openai/how-to/responses
   - https://learn.microsoft.com/en-us/azure/ai-foundry/openai/api-version-lifecycle#api-evolution
   - https://learn.microsoft.com/en-us/azure/ai-foundry/openai/reference-preview-latest#create-response

2. **The user configuration is correct** - The base endpoint `https://grid4openai.openai.azure.com/` is properly configured in Vercel

3. **The models work in Azure AI Foundry** - The deployments function correctly in Azure's playground and other compatible applications

## Root Cause Analysis

### The URL Construction Bug

In `src/modules/llms/server/openai/openai.router.ts` (lines 539-542), the code incorrectly handles the Azure Responses API endpoint:

```typescript
if (apiPath.startsWith('/v1/')) {
  if (!modelRefId)
    throw new Error('Azure OpenAI API needs a deployment id');
  url += `/openai/deployments/${modelRefId}/${apiPath.replace('/v1/', '')}?api-version=2025-02-01-preview`;
}
```
This code treats ALL `/v1/` paths identically, resulting in:

**For `/v1/responses`:**

- **Current (WRONG)**: `https://grid4openai.openai.azure.com/openai/deployments/{model-id}/responses?api-version=2025-02-01-preview`
- **Expected (CORRECT)**: `https://grid4openai.openai.azure.com/openai/v1/responses?api-version=preview`

### Why This Happens

1. When GPT-5 and o-series models use the Responses API (due to having the `LLM_IF_OAI_Responses` interface), the system calls `openAIAccess(access, model.id, '/v1/responses')`

2. The Azure case in `openAIAccess` doesn't distinguish between different `/v1/` endpoints

3. The Responses API in Azure doesn't use the `/deployments/{deployment-id}/` path structure - it's a direct endpoint at `/openai/v1/responses`
### Historical Context

Interestingly, commit `1d0a76cd` by Paul Short on Aug 8, 2025 actually attempted to fix this with special handling for GPT-5:

```typescript
if (apiPath === '/v1/responses' && modelRefId?.toLowerCase().includes('gpt-5')) {
  url += `/openai/v1/responses?api-version=preview`;
}
```

However, this fix either:

- Was never merged to the main branch
- Was reverted
- Exists in a different branch

The current v2-dev branch doesn't include this fix.
## The Solution

### Primary Fix: Correct URL Construction for the Responses API

**File**: `src/modules/llms/server/openai/openai.router.ts` (around line 538)

**Current Code:**

```typescript
let url = azureHost;
if (apiPath.startsWith('/v1/')) {
  if (!modelRefId)
    throw new Error('Azure OpenAI API needs a deployment id');
  url += `/openai/deployments/${modelRefId}/${apiPath.replace('/v1/', '')}?api-version=2025-02-01-preview`;
} else if (apiPath.startsWith('/openai/deployments'))
  url += apiPath;
else
  throw new Error('Azure OpenAI API path not supported: ' + apiPath);
```

**Fixed Code:**

```typescript
let url = azureHost;
// Special handling for the Azure Responses API, which doesn't use deployment paths
if (apiPath === '/v1/responses') {
  url += `/openai/v1/responses?api-version=preview`;
} else if (apiPath.startsWith('/v1/')) {
  if (!modelRefId)
    throw new Error('Azure OpenAI API needs a deployment id');
  url += `/openai/deployments/${modelRefId}/${apiPath.replace('/v1/', '')}?api-version=2025-02-01-preview`;
} else if (apiPath.startsWith('/openai/deployments'))
  url += apiPath;
else
  throw new Error('Azure OpenAI API path not supported: ' + apiPath);
```
### Alternative Solution: More Robust Pattern

For better maintainability, consider defining the Azure API patterns explicitly:

```typescript
let url = azureHost;

// Define Azure-specific direct (non-deployment) endpoint patterns
const azureDirectEndpoints: Record<string, string> = {
  '/v1/responses': '/openai/v1/responses?api-version=preview',
  // Add other direct endpoints here as needed
};

// Check for direct endpoints first
if (azureDirectEndpoints[apiPath]) {
  url += azureDirectEndpoints[apiPath];
} else if (apiPath.startsWith('/v1/')) {
  // Deployment-based endpoints
  if (!modelRefId)
    throw new Error('Azure OpenAI API needs a deployment id');
  url += `/openai/deployments/${modelRefId}/${apiPath.replace('/v1/', '')}?api-version=2025-02-01-preview`;
} else if (apiPath.startsWith('/openai/deployments')) {
  url += apiPath;
} else {
  throw new Error('Azure OpenAI API path not supported: ' + apiPath);
}
```
## Secondary Issues to Address
|
||||
|
||||
### 1. API Version Consistency
|
||||
|
||||
The code uses different API versions in different places:
|
||||
- Model listing: `2023-03-15-preview` (line 166)
|
||||
- Chat completions: `2025-02-01-preview`
|
||||
- Responses API: Should use `preview` or `2025-04-01-preview`
|
||||
|
||||
Consider updating the model listing to use a more recent API version:
|
||||
```typescript
|
||||
// Line 166
|
||||
const azureOpenAIDeploymentsResponse = await openaiGETOrThrow(access, `/openai/deployments?api-version=2025-02-01-preview`);
|
||||
```
|
||||
|
||||
### 2. Request Body Compatibility
|
||||
|
||||
Ensure that the request body generated by `aixToOpenAIResponses` is compatible with Azure's Responses API implementation, which may have slight differences from OpenAI's native implementation.
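As a starting point for that comparison, a minimal Responses API request body looks roughly like this (field names follow the OpenAI Responses API, which Azure's v1 surface is designed to mirror; the model name and prompt are illustrative):

```typescript
// Minimal Responses API payload; on Azure's v1 surface, `model` is the
// deployment name rather than the upstream model id.
const requestBody = {
  model: 'gpt-5',
  input: 'Hello from big-AGI',
  max_output_tokens: 128,
};
```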
## Testing Requirements

1. **Test with GPT-5 models** on Azure to verify that the Responses API works
2. **Test with o3 Pro models** on Azure
3. **Verify existing models** (o4 Mini, GPT-4.1) continue working
4. **Test both streaming and non-streaming** responses
5. **Verify model listing** still works with the updated API version

## Implementation Steps

1. **Immediate Fix**:
   - Apply the primary fix to correctly construct Responses API URLs
   - Test with the affected models

2. **Follow-up**:
   - Update API versions for consistency
   - Add logging to track which API endpoints are being used
   - Consider adding configuration to toggle between APIs if needed

3. **Long-term**:
   - Monitor Azure OpenAI documentation for API changes
   - Consider abstracting Azure-specific logic into a separate module
   - Add integration tests for Azure endpoints

## Conclusion

The issue is a straightforward URL construction bug: the Azure Responses API endpoint is being incorrectly formatted as a deployment-based endpoint. The fix is simple - add special handling for the `/v1/responses` path to use the direct endpoint format instead of the deployment-based format. This aligns with Azure's documented API structure and will resolve the "Resource Not Found" errors users are experiencing with GPT-5 and o-series models.