Files
big-agi/docs/2024-AI-APIs-Comparison.md
T
2024-07-11 23:06:44 -07:00

16 KiB

AIX dispatch server - API features comparison

This is updated as of 2024-07-09, and includes the latest features and capabilities of the three major AI APIs: Anthropic, Gemini, and OpenAI. The comparison covers a wide range of features, including function calling, vision, system instructions, etc.

Feature Category Specific Feature Anthropic Gemini OpenAI
Message Structure
Role types user, assistant user, model user, assistant, system, tool
Named participants No No Yes
Content array Yes Yes Yes
Content Types and Multimodal Support
Text generation Yes Yes Yes
Image understanding Yes Yes Yes
Audio processing No Yes No
Video processing No Yes No
Image Handling
Supported formats JPEG, PNG, GIF, WebP JPEG, PNG, WebP, HEIC, HEIF PNG, JPEG, WebP, non-animated GIF
Max image size 5MB per image (20MB per prompt) 20MB per image
Image detail level N/A N/A Low, high, auto
Image resolution max: 1568x1568 min: 768x768, max: 3072x3072 min: 512x512, max: 2048 x 2048
Token calculation for images (width * height)/750; max 1,600 258 tokens 85 + 170 * {patches}
Image retention Deleted after processing Not specified Deleted after processing
Audio and Video Handling
Audio formats N/A WAV, MP3, AIFF, AAC, OGG, FLAC N/A
Video formats N/A MP4, MPEG, MOV, AVI, MPG, WebM, WMV, 3GPP N/A
System Instructions and Tool Use
System instructions Yes (array of text blocks) Yes (parts array) Yes (as system message)
Function/Tool Handling
Parallel tool calls No No Yes
Tool Declaration Defined in tools array Defined in tools array Defined in tools array
FC name restrictions Yes Yes (max 63 chars) Yes (max 64 chars)
FC declaration name, description, input_schema name, description, parameters name, description, parameters
FC options structure JSON Schema for input Object with properties JSON Schema for parameters
FC Force invocation Via tool_choice parameter Via toolConfig parameter Via tool_choice parameter
FC Model invocation Model generates a tool_use block with predicted parameters Generates a functionCall part with predicted parameters Generates a message.tool_calls item with predicted arguments
FC Execution Client-side Client-side Client-side
FC Result injection Client appends a user message with a tool_result content block Client appends a function message with functionResponse part Client sends a new tool message with tool_call_id and content
Built-in Code execution No Yes No
Tool use with vision Yes Yes Yes
Generation Configuration
temperature Yes Yes Yes
max_tokens Yes Yes Yes
stop_sequences Yes Yes Yes
top_k Yes Yes No
top_p Yes Yes Yes
seed No No Yes
Multiple candidates No No Yes (with 'n' parameter, breaks streaming?)
Streaming and Response Structure
Streaming support Yes Yes Yes
Streaming initiation stream=true streamGenerateContent path stream=true
Streaming event types Multiple specific types Not specified Single delta type
Response container content (array) candidates (array) choices (array)
Usage Metrics and Error Handling
Token counts Yes Yes Yes
Detailed token breakdown input, output prompt, cached, candidates, total prompt, completion, total
Usage in stream No No Optional
Error handling in response Not specified Not specified Yes (undocumented)
Error handling in stream Not specified Not specified Yes (undocumented)
Advanced Features
JSON mode Partial (via structured prompts) Yes (responseMimeType) Yes
Output consistency techniques Yes (multiple methods) Not specified Not specified
Logprobs No No Yes (disabled in schema)
System fingerprint No No Yes
Semantic caching No Yes No
Assistant prefill Yes No No
Preferred formatting XML tags, JSON Not specified Markdown
Safety and Compliance
Safety settings in request Stop sequences Detailed category-based Moderation API
Safety feedback in response Yes Yes Not specified