Files
big-agi/docs/2024-AI-APIs-Comparison.md
T
Enrico Ros f052963da3 Md cleanup
2026-03-24 11:53:01 -07:00

75 lines
16 KiB
Markdown

---
unlisted: true
---
# AIX dispatch server - API features comparison
This is updated as of 2024-07-09, and includes the latest features and capabilities of the three major AI APIs: Anthropic, Gemini, and OpenAI.
The comparison covers a wide range of features, including function calling, vision, system instructions, etc.
| Feature Category | Specific Feature | Anthropic | Gemini | OpenAI |
|------------------------------------------|-------------------------------|--------------------------------------------------------------------|------------------------------------------------------------------|---------------------------------------------------------------------|
| **Message Structure** |
| | Role types | user, assistant | user, model | user, assistant, system, tool |
| | Named participants | No | No | Yes |
| | Content array | Yes | Yes | Yes |
| **Content Types and Multimodal Support** |
| | Text generation | Yes | Yes | Yes |
| | Image understanding | Yes | Yes | Yes |
| | Audio processing | No | **Yes** | No |
| | Video processing | No | **Yes** | No |
| **Image Handling** |
| | Supported formats | JPEG, PNG, GIF, WebP | JPEG, PNG, WebP, HEIC, HEIF | PNG, JPEG, WebP, non-animated GIF |
| | Max image size | 5MB per image | (20MB per prompt) | 20MB per image |
| | Image detail level | N/A | N/A | **Low, high, auto** |
| | Image resolution | max: 1568x1568 | min: 768x768, max: 3072x3072 | min: 512x512, max: 2048 x 2048 |
| | Token calculation for images | (width * height)/750; max 1,600 | 258 tokens | 85 + 170 * {patches} |
| | Image retention | Deleted after processing | Not specified | Deleted after processing |
| **Audio and Video Handling** |
| | Audio formats | N/A | WAV, MP3, AIFF, AAC, OGG, FLAC | N/A |
| | Video formats | N/A | MP4, MPEG, MOV, AVI, MPG, WebM, WMV, 3GPP | N/A |
| **System Instructions and Tool Use** |
| | System instructions | Yes (array of text blocks) | Yes (parts array) | Yes (as system message) |
| **Function/Tool Handling** |
| | Parallel tool calls | No | No | **Yes** |
| | Tool Declaration | Defined in `tools` array | Defined in `tools` array | Defined in `tools` array |
| | FC name restrictions | Yes | Yes (max 63 chars) | Yes (max 64 chars) |
| | FC declaration | name, description, input_schema | name, description, parameters | name, description, parameters |
| | FC options structure | JSON Schema for input | Object with properties | JSON Schema for parameters |
| | FC Force invocation | Via `tool_choice` parameter | Via `toolConfig` parameter | Via `tool_choice` parameter |
| | FC Model invocation | Model generates a `tool_use` block with predicted parameters | Generates a `functionCall` part with predicted parameters | Generates a message.`tool_calls` item with predicted arguments |
| | FC Execution | Client-side | Client-side | Client-side |
| | FC Result injection | Client appends a `user` message with a `tool_result` content block | Client appends a `function` message with `functionResponse` part | Client sends a new `tool` message with `tool_call_id` and `content` |
| | Built-in Code execution | No | **Yes** | No |
| | Tool use with vision | Yes | Yes | Yes |
| **Generation Configuration** |
| | temperature | Yes | Yes | Yes |
| | max_tokens | Yes | Yes | Yes |
| | stop_sequences | Yes | Yes | Yes |
| | top_k | Yes | Yes | **No** |
| | top_p | Yes | Yes | Yes |
| | seed | No | No | **Yes** |
| | Multiple candidates | No | No | Yes (with 'n' parameter, breaks streaming?) |
| **Streaming and Response Structure** |
| | Streaming support | Yes | Yes | Yes |
| | Streaming initiation | stream=true | streamGenerateContent path | stream=true |
| | Streaming event types | **Multiple specific types** | Not specified | Single delta type |
| | Response container | content (array) | candidates (array) | choices (array) |
| **Usage Metrics and Error Handling** |
| | Token counts | Yes | Yes | Yes |
| | Detailed token breakdown | input, output | prompt, cached, candidates, total | prompt, completion, total |
| | Usage in stream | No | No | **Optional** |
| | Error handling in response | Not specified | Not specified | **Yes (undocumented)** |
| | Error handling in stream | Not specified | Not specified | **Yes (undocumented)** |
| **Advanced Features** |
| | JSON mode | **Partial (via structured prompts)** | **Yes (responseMimeType)** | **Yes** |
| | Output consistency techniques | **Yes (multiple methods)** | Not specified | Not specified |
| | Logprobs | No | No | **Yes (disabled in schema)** |
| | System fingerprint | No | No | **Yes** |
| | Semantic caching | No | **Yes** | No |
| | Assistant prefill | **Yes** | No | No |
| | Preferred formatting | **XML tags, JSON** | Not specified | Markdown |
| **Safety and Compliance** |
| | Safety settings in request | **Stop sequences** | **Detailed category-based** | **Moderation API** |
| | Safety feedback in response | Yes | Yes | Not specified |