mirror of https://github.com/enricoros/big-AGI.git
synced 2026-05-10 21:50:14 -07:00

Compare commits: 8 commits, acdbb2fbaf ... 1bf1b744b9

- 1bf1b744b9
- ee2d7114c7
- 3b1b54b3a3
- 524029a882
- 69161d29a7
- 8a542c1af4
- fe16970624
- e21abdef45
@@ -17,6 +17,10 @@ Architecture and system documentation is available in the `/kb/` knowledge base,

#### CSF - Client-Side Fetch

- **[CSF.md](systems/client-side-fetch.md)** - Direct browser-to-API communication for LLM requests

#### LLM - Language Model Metadata

- **[LLM-editorial-control.md](modules/LLM-editorial-pubdate.md)** - Where we have editorial control over per-model metadata vs dynamic discovery; `pubDate` field semantics, propagation chain, resolution rules, per-vendor matrix
- **[LLM-models-catalog-pipeline.md](modules/LLM-models-catalog-pipeline.md)** - Forward-looking pipeline: extraction script, snapshot artifact, website consumption, future schema extensions

### Systems Documentation

#### Core Platform Systems
@@ -0,0 +1,106 @@

# LLM Editorial Control Surface

This document maps where Big-AGI has editorial control over per-model metadata (and can therefore guarantee fields like `pubDate`, curated `description`, `chatPrice`, `benchmark`, `parameterSpecs`, etc.) versus where it must rely on the vendor API's dynamic discovery (and therefore cannot guarantee them).

For the forward-looking pipeline (extraction script, snapshot, website consumption, future schema extensions), see [LLM-models-catalog-pipeline.md](LLM-models-catalog-pipeline.md).

## The `pubDate` field

`pubDate?: string` (validated as `/^\d{8}$/`, e.g. `'20250929'`) is **optional** in the wire schema and on `DLLM`. It was added to:

- `ModelDescription_schema` in `src/modules/llms/server/llm.server.types.ts` - the canonical wire type
- `OrtVendorLookupResult` in the same file - so OpenRouter inherits it via `llmOrt*Lookup`
- `DLLM` in `src/common/stores/llms/llms.types.ts` - the persisted client model

### Where `pubDate` is guaranteed (always emitted)

- **Editorial entries** in 12 hybrid/editorial vendors (282 models). Hand-curated and externally corroborated; future entries in these arrays are expected to include `pubDate`.
- **Anthropic 0-day placeholder** (`llmsAntCreatePlaceholderModel`): when the API surfaces an Anthropic model not in the editorial list, the placeholder uses the API's `created_at` ISO date, falling back to today via `formatPubDate()`.
- **Gemini 0-day fallback** (`geminiModelToModelDescription`): when the API returns a Gemini model not in `_knownGeminiModels`, the converter falls back to today via `formatPubDate()` (the Gemini API does not expose a creation timestamp).

### Where `pubDate` is omitted (optional)

- **Symlink entries** (`KnownLink`) - inherit the target's `pubDate` via the merge logic in `fromManualMapping`.
- **Unknown variants resolved through `super`/`fallback`** in `fromManualMapping` for non-Anthropic/non-Gemini vendors - the field is left undefined rather than fabricated.
- **Dynamic-only vendors** (OpenRouter, TogetherAI, Novita, ChutesAI, FireworksAI, TLUS, Azure, LM Studio, LocalAI, FastAPI, ArceeAI, LLMAPI) - no editorial knob; `pubDate` flows in only when the underlying lookup or upstream API populates it.

The rationale: today's date is a defensible 0-day proxy only when we know we are seeing a brand-new model the vendor just announced (Anthropic's and Gemini's "discovery via official model list" paths). For arbitrary dynamic vendors, fabricating today's date would mark old, well-known models as new, which is misleading. Better to omit the field.

### Propagation chain

- `fromManualMapping()` in `src/modules/llms/server/models.mappings.ts` - copies the field for OAI-style vendors when present
- `geminiModelToModelDescription()` in `src/modules/llms/server/gemini/gemini.models.ts` - copies for Gemini, falls back to today for unknowns
- `llmsAntCreatePlaceholderModel()` in `src/modules/llms/server/anthropic/anthropic.models.ts` - emits from API `created_at` (or today)
- `_mergeLookup()` in `src/modules/llms/server/openai/models/openrouter.models.ts` - merges for OpenRouter cross-vendor inheritance
- `_createDLLMFromModelDescription()` in `src/modules/llms/llm.client.ts` - copies onto the persisted DLLM when present
- `formatPubDate()` helper in `src/modules/llms/server/models.mappings.ts` - shared `'YYYYMMDD'` formatter for the 0-day-fillable paths
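
The shared helper's described behavior can be sketched like this (an assumption-based sketch, not the actual source; the real `formatPubDate()` lives in `src/modules/llms/server/models.mappings.ts` and may differ in detail):

```typescript
// Sketch of formatPubDate() as described above - NOT the actual implementation.
// Formats the given Date (or today, the 0-day fallback) as 'YYYYMMDD'.
function formatPubDate(date?: Date): string {
  const d = date ?? new Date();
  const yyyy = String(d.getFullYear()).padStart(4, '0');
  const mm = String(d.getMonth() + 1).padStart(2, '0'); // getMonth() is 0-indexed
  const dd = String(d.getDate()).padStart(2, '0');
  return `${yyyy}${mm}${dd}`; // matches the /^\d{8}$/ wire validation
}
```

For example, `formatPubDate(new Date(2025, 8, 29))` yields `'20250929'`.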

### Semantics

`pubDate` is the **earliest public availability** of the model - the date on which the vendor first made this specific model usable by external users via any channel (consumer app, web, console, API, partner, open-weights upload).

It is **not**:

- The date Big-AGI added the entry to its catalog (Ollama uses `added` for that)
- The training-data cutoff (proposed but not implemented; see `src/common/stores/llms/llms.types.next.ts:217`)
- The date the model snapshot was built (suffixes like `-1212` may refer to build dates, but `pubDate` tracks public availability)

### Resolution rules (when sources conflict)

1. **Date-suffixed model IDs**: when the suffix matches a documented announcement, the suffix is canonical (vendor convention). xAI, OpenAI, and Mistral all use suffixes that closely track release dates.
2. **Anthropic exception**: Anthropic's date suffixes are typically the **snapshot/training-cutoff date, not the public release date**. For example, `claude-3-7-sonnet-20250219` was released on 2025-02-24, `claude-opus-4-20250514` on 2025-05-22, and `claude-haiku-4-5-20251001` on 2025-10-15. Always corroborate against Anthropic's blog/press for the actual release date. Only `claude-sonnet-4-5-20250929` and `claude-opus-4-1-20250805` have suffixes that match their release dates.
3. **Closed beta -> public beta -> GA**: use the first date *external* users could access the specific variant.
4. **Family-headline IDs and dated snapshots** (e.g., `claude-opus-4-1` and `claude-opus-4-1-20250805`): typically share a release date.
5. **Hosted on a third party** (Groq hosting Llama, OpenPipe mirroring others, OpenRouter aggregating): use the *underlying* model's original release date from its creator, not the date the host added it.
6. **Symlinks** (entries with `symLink:`): inherit the target's date.
7. **Partial dates** (only month known): use the 1st of the month and tag as MEDIUM confidence in the editor's note.
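
Rule 7 is mechanical enough to sketch (hypothetical helper, not present in the codebase):

```typescript
// Hypothetical: widen a month-only 'YYYYMM' date to the 1st of the month.
// Per rule 7, the result should be tagged MEDIUM confidence in the editor's note.
function pubDateFromPartial(yyyymm: string): string | undefined {
  return /^\d{6}$/.test(yyyymm) ? `${yyyymm}01` : undefined;
}
```

For example, `pubDateFromPartial('202506')` yields `'20250601'`.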

## Editorial control matrix

Three categories:

- **Editorial** - the vendor file contains hand-curated entries; we control descriptions, pricing, benchmarks, interfaces, parameter specs, and `pubDate`.
- **Hybrid** - the API returns the live model list, and editorial entries (keyed by id/idPrefix) merge over the API data via `fromManualMapping`. We control everything except *which models exist*.
- **Dynamic** - the API is the only source of model identity and metadata. Big-AGI cannot reliably populate `pubDate` here (no editorial knob).

| Vendor | Category | File | Array | Entries | `pubDate` populated |
|---|---|---|---|---|---|
| Anthropic | Hybrid | `anthropic/anthropic.models.ts` | `hardcodedAnthropicModels` | 12 | 12/12 HIGH |
| Gemini | Hybrid | `gemini/gemini.models.ts` | `_knownGeminiModels` | 33 | 33/33 HIGH |
| OpenAI | Hybrid | `openai/models/openai.models.ts` | `_knownOpenAIChatModels` | 96 | 95/96 HIGH/MED (`osb-120b` skipped, speculative) |
| xAI | Hybrid | `openai/models/xai.models.ts` | `_knownXAIChatModels` | 13 | 13/13 HIGH (pilot) |
| Mistral | Hybrid | `openai/models/mistral.models.ts` | `_knownMistralModelDetails` | 41 | 41/41 (40 HIGH, 1 MED for legacy `mistral-medium`) |
| Moonshot (Kimi) | Hybrid | `openai/models/moonshot.models.ts` | `_knownMoonshotModels` | 13 | 13/13 (10 HIGH, 3 MED for v1 base models) |
| Perplexity | Editorial | `openai/models/perplexity.models.ts` | `_knownPerplexityChatModels` | 4 | 4/4 HIGH |
| MiniMax | Editorial | `openai/models/minimax.models.ts` | `_knownMiniMaxModels` | 10 | 10/10 HIGH |
| DeepSeek | Hybrid | `openai/models/deepseek.models.ts` | `_knownDeepseekChatModels` | 4 | 4/4 HIGH |
| Groq | Hybrid (host) | `openai/models/groq.models.ts` | `_knownGroqModels` | 11 | 11/11 HIGH (underlying-model date) |
| Z.AI / GLM | Hybrid | `openai/models/zai.models.ts` | `_knownZAIModels` | 17 | 16/17 (`glm-5-code` UNCONFIRMED) |
| OpenPipe | Editorial (mirror) | `openai/models/openpipe.models.ts` | `_knownOpenPipeChatModels` | 30 | 30/30 HIGH (all upstream-mirror, no OpenPipe originals) |
| Bedrock | Reuses Anthropic | `bedrock/bedrock.models.ts` | -> `hardcodedAnthropicModels` | (12) | inherited |
| Ollama | Editorial (catalog) | `ollama/ollama.models.ts` | `OLLAMA_BASE_MODELS` | 209 | **deferred** - see notes |
| Arcee AI | Dynamic | `openai/models/arceeai.models.ts` | `_arceeKnownModels` | 0 | n/a (empty) |
| LLMAPI | Dynamic | `openai/models/llmapi.models.ts` | `_llmapiKnownModels` | 0 | n/a (empty) |
| Alibaba | Dynamic | `openai/models/alibaba.models.ts` | `_knownAlibabaChatModels` | 0 | n/a (empty) |
| OpenRouter | Dynamic + delegated lookup | `openai/models/openrouter.models.ts` | (parser) | -- | inherited via `llmOrt*Lookup` |
| TogetherAI | Dynamic | `openai/models/together.models.ts` | (parser) | -- | no |
| FireworksAI | Dynamic | `openai/models/fireworksai.models.ts` | (parser) | -- | no |
| Novita | Dynamic | `openai/models/novita.models.ts` | (parser) | -- | no |
| ChutesAI | Dynamic | `openai/models/chutesai.models.ts` | (parser) | -- | no |
| TLUS | Dynamic | `openai/models/tlusapi.models.ts` | (parser) | -- | no |
| Azure | Dynamic | `openai/models/azure.models.ts` | (parser) | -- | no |
| LM Studio | Dynamic | `openai/models/lmstudio.models.ts` | (parser) | -- | no |
| LocalAI | Dynamic | `openai/models/localai.models.ts` | (parser) | -- | no |
| FastAPI | Dynamic | `openai/models/fastapi.models.ts` | (parser) | -- | no |

**Totals**: 284 editorial entries across 12 vendors, of which **282** have corroborated `pubDate` and **2** are intentional gaps (`osb-120b` speculative, `glm-5-code` not yet announced). All 12 vendor files type-check clean.

### Notes

- **Hybrid** vendors are still effectively editorial for the models we know about: when an API id matches a hardcoded `idPrefix` (or `id`), `fromManualMapping` injects all the editorial fields. Unknown ids fall through to a default-shaped placeholder where `pubDate` is undefined.
- **OpenRouter** delegates back to the Anthropic / Gemini / OpenAI editorial lookups via `llmOrtAntLookup_ThinkingVariants`, `llmOrtGemLookup`, `llmOrtOaiLookup`. `pubDate` flows through these lookups, so OpenRouter-served Claude/Gemini/GPT models get `pubDate` automatically once the underlying editorial entry has it.
- **Bedrock** finds the Anthropic editorial entry via `llmBedrockFindAnthropicModel` and strips unsupported interfaces - `pubDate` inherits from Anthropic.
- **Ollama** is deferred: 209 entries keyed by upstream model family (e.g. `qwen3.6`, `kimi-k2`, `glm-4.6`). Each entry's `pubDate` would need to be the upstream creator's release date (Meta, Alibaba, Moonshot, Z.AI, etc.). This is large-scale upstream research, better handled in a follow-up pass once cross-vendor `pubDate` data is consolidated and reusable.
- **Dynamic-only** vendors get nothing automatic. To add `pubDate` for them we would have to seed editorial entries (which is what `fromManualMapping`'s mapping mechanism was built for); this is a per-vendor decision and out of scope for the initial rollout.
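
The id/idPrefix match at the heart of the hybrid merge can be sketched roughly as follows (assumed shape; the real `fromManualMapping` also handles symlinks, variants, and many more fields):

```typescript
// Rough sketch of the editorial match in fromManualMapping - assumed shape,
// not the actual source. The 'gpt-4o' example below is illustrative only.
interface EditorialEntry {
  id?: string;        // exact-id match
  idPrefix?: string;  // prefix match, so one entry covers dated snapshots
  pubDate?: string;
}

function matchEditorial(apiId: string, entries: EditorialEntry[]): EditorialEntry | undefined {
  return entries.find(e =>
    (e.id !== undefined && e.id === apiId) ||
    (e.idPrefix !== undefined && apiId.startsWith(e.idPrefix)));
}
```

When no entry matches, the model falls through to a default-shaped placeholder with `pubDate` undefined, as described in the Notes above.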
@@ -0,0 +1,78 @@

# LLM Models Catalog Pipeline (forward-looking)

Status: **proposal / partially implemented**. Companion to [LLM-editorial-control.md](LLM-editorial-pubdate.md), which describes the durable reference (`pubDate` semantics, editorial-vs-dynamic matrix, propagation chain).

This document captures the forward-looking pipeline that turns Big-AGI's editorial model metadata into website value-add (plots, decision helpers, comparison tools at big-agi.com).

## Goal

Stand up a database/datastore that the website (`~/dev/website`) can query for plots, decision helpers, and comparison tools - without requiring the website to call our authenticated tRPC endpoints.

## Stages

### Stage 1: source of truth (in this repo) — DONE

Editorial files in `src/modules/llms/server/` remain the canonical source for:

- Identity: id, label, vendor
- Capabilities: `interfaces`, `parameterSpecs`, `contextWindow`, `maxCompletionTokens`
- Pricing: `chatPrice` (input / output / cache tiers)
- Benchmarks: `benchmark.cbaElo` (Chatbot Arena Elo)
- Lifecycle: `pubDate`, `isLegacy`, `isPreview`, `hidden`, deprecation comments

Well-typed, version-controlled, reviewed - every model edit is a code change with diff history. 282 entries currently carry `pubDate` (see the editorial-control matrix).

### Stage 2: extraction script — IN PROGRESS

A build-time script (e.g. `scripts/llms/export-models.ts`) that:

1. Loads every editorial vendor's model array.
2. Normalizes per-vendor shapes (array vs Record, `id` vs `idPrefix`, `KnownLink` symlinks) to a single row format.
3. Resolves symlinks (the target's `pubDate` flows through).
4. Writes a single JSON snapshot: `data/models-catalog.json` (one row per model, with vendor + the editorial fields above).
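
The single row format of steps 2-4 might look like this (field names are assumptions; the script and artifact are still in progress):

```typescript
// Hypothetical row shape for data/models-catalog.json - one row per model.
// Names are assumptions derived from the editorial fields listed in Stage 1.
interface CatalogRow {
  vendor: string;                                  // e.g. 'anthropic'
  id: string;                                      // canonical id, post symlink resolution
  label: string;
  pubDate?: string;                                // 'YYYYMMDD'; inherited through symlinks
  contextWindow?: number;
  maxCompletionTokens?: number;
  chatPrice?: { input?: number; output?: number }; // simplified; real pricing has cache tiers
  benchmark?: { cbaElo?: number };
}
```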

Open question: do we want this committed (gives the website a stable artifact / public URL) or built on demand in CI? **Recommend a committed snapshot** under `data/` so consumers get a stable URL.

### Stage 3: enrichment — NOT STARTED

The exported snapshot gets enriched with data we don't currently track in editorial files:

- **Knowledge cutoff** (proposed in `llms.types.next.ts:217` but never implemented; should be added to `ModelDescription_schema` as a follow-up).
- **MMLU / HumanEval / SWE-bench / GPQA / MATH** scores (currently only `cbaElo`; richer benchmarks belong in a separate block).
- **Throughput / latency** numbers (per-vendor, possibly per-region).
- **Modalities matrix** (input image, input audio, input video, input PDF, output image, output audio).
- **Weights availability** (closed / open / restricted), license.

Sources for enrichment: HuggingFace cards, vendor docs, Artificial Analysis, LLM-Stats, official benchmarks. Some can be scraped on a cadence; some need editorial review.

### Stage 4: website consumption — NOT STARTED

The website (`~/dev/website`) consumes the snapshot to render:

- **Timeline plot**: `pubDate` (x-axis) vs `cbaElo` (y-axis), grouped by vendor - shows the frontier and rate of progress.
- **Cost-per-quality plot**: `chatPrice.output` vs `cbaElo` - "best model per dollar".
- **Decision helpers**: filter by capability (`interfaces`), context window, pricing tier, vendor.
- **Comparison cards**: side-by-side specs.
- **Lifecycle alerts**: deprecation warnings for retiring models.
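
As an illustration, the timeline plot's data need is tiny: only rows that carry both a well-formed `pubDate` and a `cbaElo` (hypothetical snapshot row shape, assumed names):

```typescript
// Hypothetical: select (date, elo) points per vendor from snapshot rows.
interface SnapshotRow { vendor: string; pubDate?: string; benchmark?: { cbaElo?: number } }

function timelinePoints(rows: SnapshotRow[]): { vendor: string; date: Date; elo: number }[] {
  return rows.flatMap(r => {
    const elo = r.benchmark?.cbaElo;
    if (!r.pubDate || !/^\d{8}$/.test(r.pubDate) || elo === undefined) return [];
    // 'YYYYMMDD' -> local-midnight Date (months are 0-indexed in JS)
    const date = new Date(+r.pubDate.slice(0, 4), +r.pubDate.slice(4, 6) - 1, +r.pubDate.slice(6));
    return [{ vendor: r.vendor, date, elo }];
  });
}
```

Rows without `pubDate` (the dynamic-only vendors) simply drop out of the plot rather than rendering at a fabricated date.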

## Open questions

1. **Where does enrichment data live?** A separate `data/models-enrichment.json` (joined by id at build time) keeps editorial files clean but introduces a join surface. Alternative: extend `ModelDescription_schema` with optional enrichment fields and treat editorial files as the only source. Recommend the separate-file approach - editorial files stay focused on vendor-API integration; enrichment evolves on a different cadence.
2. **How fresh does the website need to be?** If daily, build the snapshot in CI on push and publish to a static URL. If real-time, consume tRPC directly - more work but fewer freshness gaps.
3. **Do we expose `pubDate` and other editorial metadata via tRPC publicly, or only via the snapshot?** The current tRPC routes require auth; the website should consume the snapshot, not live tRPC.
4. **Schema versioning** - if `ModelDescription_schema` evolves, snapshot consumers need to be tolerant. Include a `schemaVersion` field in the snapshot envelope.
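
The envelope implied by question 4 could be as simple as (names are assumptions, not implemented):

```typescript
// Hypothetical snapshot envelope with explicit schema versioning.
interface CatalogEnvelope<Row> {
  schemaVersion: number; // bump on breaking ModelDescription_schema changes
  generatedAt: string;   // ISO timestamp of the export run
  models: Row[];
}

// Consumers tolerate newer-than-known versions by degrading, not crashing.
function readModels<R>(env: CatalogEnvelope<R>, maxKnownVersion: number): R[] | null {
  return env.schemaVersion <= maxKnownVersion ? env.models : null;
}
```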

## Future extensions to `ModelDescription_schema`

Beyond `pubDate`, the natural follow-ups (in priority order):

1. **`knowledgeCutoff?: string`** (`'YYYY-MM'` or `'YYYY-MM-DD'`) - already proposed in `llms.types.next.ts`. Useful for the timeline plot and for context-aware prompts.
2. **`deprecationDate?: string`** - currently exists informally as `deprecated?: string` on `_knownGeminiModels`; should be promoted to the schema.
3. **`license?: string`** - especially important for open-weights models (apache-2.0, mit, llama-community, custom).
4. **`weights?: 'closed' | 'open' | 'restricted'`** - a quick filter for "can I run this myself?".
5. **`benchmarks?: { mmlu?: number, humaneval?: number, gpqa?: number, ... }`** - richer than the current `cbaElo`-only block.
6. **`modalities?: { in: string[], out: string[] }`** - more precise than `interfaces` for input/output capability matrices.
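
Taken together, the follow-ups would extend the schema roughly like this (speculative; none of these fields exist in `ModelDescription_schema` today):

```typescript
// Speculative future additions to ModelDescription_schema - not implemented.
interface ModelDescriptionFutureFields {
  knowledgeCutoff?: string;                    // 'YYYY-MM' or 'YYYY-MM-DD'
  deprecationDate?: string;                    // promotes Gemini's informal `deprecated`
  license?: string;                            // e.g. 'apache-2.0', 'mit'
  weights?: 'closed' | 'open' | 'restricted';
  benchmarks?: { mmlu?: number; humaneval?: number; gpqa?: number };
  modalities?: { in: string[]; out: string[] };
}
```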

@@ -25,6 +25,7 @@ export interface DLLM {
  label: string;
  created: number | 0;
  updated?: number | 0;
  pubDate?: string; // official release date in 'YYYYMMDD'
  description: string;
  hidden: boolean;

@@ -137,6 +138,20 @@ export function getLLMMaxOutputTokens(llm: DLLM | null): DLLMMaxOutputTokens | u
  return llm.userMaxOutputTokens ?? llm.maxOutputTokens;
}

/**
 * Parse the model's editorial `pubDate` ('YYYYMMDD') into a Date, or null if missing/malformed.
 * Date is constructed at local midnight - pubDate is day-precision, no time component.
 */
export function getLLMPubDate(llm: DLLM | null | undefined): Date | null {
  const p = llm?.pubDate;
  if (!p || !/^\d{8}$/.test(p)) return null;
  const y = parseInt(p.slice(0, 4), 10);
  const m = parseInt(p.slice(4, 6), 10) - 1; // JS Date months are 0-indexed
  const d = parseInt(p.slice(6, 8), 10);
  const date = new Date(y, m, d);
  return Number.isFinite(date.getTime()) ? date : null;
}

/// Interfaces ///

// do not change anything below! those will be persisted in data
@@ -107,6 +107,7 @@ function _createDLLMFromModelDescription(d: ModelDescriptionSchema, service: DMo
  label: d.label,
  created: d.created || 0,
  updated: d.updated || 0,
  ...(d.pubDate && { pubDate: d.pubDate }),
  description: d.description,
  hidden: !!d.hidden,

@@ -15,7 +15,7 @@ import WarningRoundedIcon from '@mui/icons-material/WarningRounded';

import { type DPricingChatGenerate, isLLMChatFree_cached, llmChatPricing_adjusted } from '~/common/stores/llms/llms.pricing';
import type { ModelOptionsContext } from '~/common/layout/optima/store-layout-optima';
import { DLLMId, DModelInterfaceV1, getLLMContextTokens, getLLMLabel, getLLMMaxOutputTokens, isLLMVisible, LLM_IF_HOTFIX_NoStream, LLM_IF_HOTFIX_NoTemperature, LLM_IF_OAI_Reasoning } from '~/common/stores/llms/llms.types';
import { DLLMId, DModelInterfaceV1, getLLMContextTokens, getLLMLabel, getLLMMaxOutputTokens, getLLMPubDate, isLLMVisible, LLM_IF_HOTFIX_NoStream, LLM_IF_HOTFIX_NoTemperature, LLM_IF_OAI_Reasoning } from '~/common/stores/llms/llms.types';
import { FormLabelStart } from '~/common/components/forms/FormLabelStart';
import { GoodModal } from '~/common/components/modals/GoodModal';
import { LLMImplicitParametersRuntimeFallback } from '~/common/stores/llms/llms.parameters';

@@ -280,6 +280,7 @@ export function LLMOptionsModal(props: { id: DLLMId, context?: ModelOptionsConte

  // cache
  const adjChatPricing = llmChatPricing_adjusted(llm);
  const pubDate = getLLMPubDate(llm);

  return (

@@ -502,7 +503,8 @@ export function LLMOptionsModal(props: { id: DLLMId, context?: ModelOptionsConte
  id: {llm.id}<br />
  context: <b>{getLLMContextTokens(llm)?.toLocaleString() ?? 'not provided'}</b> tokens{` · `}
  max output: <b>{getLLMMaxOutputTokens(llm)?.toLocaleString() ?? 'not provided'}</b><br />
  {!!llm.created && <>created: <TimeAgo date={new Date(llm.created * 1000)} /><br /></>}
  {!!pubDate && <>published: <b>{pubDate.toLocaleDateString(undefined, { year: 'numeric', month: 'short', day: 'numeric' })}</b> · <TimeAgo date={pubDate} /><br /></>}
  {!!llm.created && <>indexed: <TimeAgo date={new Date(llm.created * 1000)} /><br /></>}
  {/*· tags: {llm.tags.join(', ')}*/}
  {!!adjChatPricing && prettyPricingComponent(adjChatPricing)}
  {/*{!!llm.benchmark && <>benchmark: <b>{llm.benchmark.cbaElo?.toLocaleString() || '(unk) '}</b> CBA Elo<br /></>}*/}
@@ -9,7 +9,7 @@ import VisibilityOutlinedIcon from '@mui/icons-material/VisibilityOutlined';

import type { DModelsServiceId } from '~/common/stores/llms/llms.service.types';
import { isLLMChatFree_cached } from '~/common/stores/llms/llms.pricing';
import { DLLM, DLLMId, getLLMContextTokens, getLLMLabel, getLLMMaxOutputTokens, isLLMCustomUserParameters, isLLMHidden, LLM_IF_ANT_PromptCaching, LLM_IF_GEM_CodeExecution, LLM_IF_OAI_Fn, LLM_IF_OAI_Json, LLM_IF_OAI_PromptCaching, LLM_IF_OAI_Reasoning, LLM_IF_OAI_Vision, LLM_IF_Outputs_Audio, LLM_IF_Outputs_Image, LLM_IF_Tools_WebSearch } from '~/common/stores/llms/llms.types';
import { DLLM, DLLMId, getLLMContextTokens, getLLMLabel, getLLMMaxOutputTokens, getLLMPubDate, isLLMCustomUserParameters, isLLMHidden, LLM_IF_ANT_PromptCaching, LLM_IF_GEM_CodeExecution, LLM_IF_OAI_Fn, LLM_IF_OAI_Json, LLM_IF_OAI_PromptCaching, LLM_IF_OAI_Reasoning, LLM_IF_OAI_Vision, LLM_IF_Outputs_Audio, LLM_IF_Outputs_Image, LLM_IF_Tools_WebSearch } from '~/common/stores/llms/llms.types';
import { GoodTooltip } from '~/common/components/GoodTooltip';
import { PhGearSixIcon } from '~/common/components/icons/phosphor/PhGearSixIcon';
import { STAR_EMOJI, StarredToggle, starredToggleStyle } from '~/common/components/StarIcons';

@@ -99,6 +99,10 @@ export const ModelItem = React.memo(function ModelItem(props: {
  const isNotSymlink = !llm.label.startsWith('🔗'); // getLLMLabel exception: need access to the base
  const llmLabel = getLLMLabel(llm);

  // "new" badge: shown only when pubDate is set AND within the last 30 days
  const pubDate = getLLMPubDate(llm);
  const isRecentlyPublished = pubDate ? (Date.now() - pubDate.getTime()) < 30 * 24 * 60 * 60 * 1000 : false;

  const handleLLMConfigure = React.useCallback((event: React.MouseEvent) => {
    event.stopPropagation();

@@ -227,6 +231,7 @@ export const ModelItem = React.memo(function ModelItem(props: {
  </>}

  {/* Features Chips - sync with `useLLMSelect.tsx` */}
  {isRecentlyPublished && isNotSymlink && pubDate && <GoodTooltip title={`Released ${pubDate.toLocaleDateString(undefined, { year: 'numeric', month: 'short', day: 'numeric' })}`}><Chip size='sm' variant='solid' sx={isHidden ? styles.chipDisabled : { bgcolor: '#d4ff3a', color: 'black', fontWeight: 'lg' }}>new</Chip></GoodTooltip>}
  {featuresChipMemo}
  {seemsFree && isNotSymlink && <Chip size='sm' color='success' variant='plain' sx={isHidden ? styles.chipDisabled : styles.chipFree}>free</Chip>}

@@ -6,7 +6,7 @@ import { Release } from '~/common/app.release';

import type { ModelDescriptionSchema, OrtVendorLookupResult } from '../llm.server.types';
import { createVariantInjector, ModelVariantMap } from '../llm.server.variants';
import { llmDevCheckModels_DEV } from '../models.mappings';
import { formatPubDate, llmDevCheckModels_DEV } from '../models.mappings';

// Note: these model definitions are shared across Anthropic API, OpenRouter, and AWS Bedrock.
@@ -214,12 +214,13 @@ export function llmsAntInjectVariants(acc: ModelDescriptionSchema[], model: Mode
}

export const hardcodedAnthropicModels: (ModelDescriptionSchema & { isLegacy?: boolean })[] = [
export const hardcodedAnthropicModels: (ModelDescriptionSchema & { isLegacy?: boolean, pubDate: string /* make it required for the defs */ })[] = [

  // Claude 4.7 models
  {
    id: 'claude-opus-4-7', // Active - 2026-04-16
    label: 'Claude Opus 4.7',
    pubDate: '20260416',
    description: 'Most capable generally available model for complex reasoning and agentic coding',
    contextWindow: 1_000_000, // 1M GA at standard pricing (no opt-in required)
    maxCompletionTokens: 128000,

@@ -239,6 +240,7 @@ export const hardcodedAnthropicModels: (ModelDescriptionSchema & { isLegacy?: bo
  {
    id: 'claude-opus-4-6', // Active
    label: 'Claude Opus 4.6',
    pubDate: '20260205',
    description: 'Previous most intelligent model for complex agents and coding, with adaptive thinking',
    contextWindow: 1_000_000, // 1M GA at standard pricing since 2026-03-13 (no opt-in required)
    maxCompletionTokens: 128000,

@@ -255,6 +257,7 @@ export const hardcodedAnthropicModels: (ModelDescriptionSchema & { isLegacy?: bo
  {
    id: 'claude-sonnet-4-6', // Active
    label: 'Claude Sonnet 4.6',
    pubDate: '20260217',
    description: 'Best combination of speed and intelligence for everyday tasks',
    contextWindow: 1_000_000, // 1M GA at standard pricing since 2026-03-13 (no opt-in required)
    maxCompletionTokens: 128000, // docs say 64000, API reports 128000

@@ -272,6 +275,7 @@ export const hardcodedAnthropicModels: (ModelDescriptionSchema & { isLegacy?: bo
  {
    id: 'claude-opus-4-5-20251101', // Active
    label: 'Claude Opus 4.5',
    pubDate: '20251124',
    description: 'Previous most intelligent model with advanced reasoning for complex agentic workflows',
    contextWindow: 200000,
    maxCompletionTokens: 64000,

@@ -286,6 +290,7 @@ export const hardcodedAnthropicModels: (ModelDescriptionSchema & { isLegacy?: bo
  {
    id: 'claude-sonnet-4-5-20250929', // Active
    label: 'Claude Sonnet 4.5',
    pubDate: '20250929',
    description: 'Previous best combination of speed and intelligence for complex agents and coding',
    contextWindow: 200000,
    maxCompletionTokens: 64000,

@@ -311,6 +316,7 @@ export const hardcodedAnthropicModels: (ModelDescriptionSchema & { isLegacy?: bo
  {
    id: 'claude-haiku-4-5-20251001', // Active
    label: 'Claude Haiku 4.5',
    pubDate: '20251015',
    description: 'Fastest model with exceptional speed and performance',
    contextWindow: 200000,
    maxCompletionTokens: 64000,

@@ -324,6 +330,7 @@ export const hardcodedAnthropicModels: (ModelDescriptionSchema & { isLegacy?: bo
  {
    id: 'claude-opus-4-1-20250805', // Active
    label: 'Claude Opus 4.1',
    pubDate: '20250805',
    description: 'Exceptional model for specialized complex tasks requiring advanced reasoning',
    contextWindow: 200000,
    maxCompletionTokens: 32000,

@@ -338,6 +345,7 @@ export const hardcodedAnthropicModels: (ModelDescriptionSchema & { isLegacy?: bo
    hidden: true, // Deprecated: April 14, 2026 | Retiring: June 15, 2026 | Replacement: claude-opus-4-7
    id: 'claude-opus-4-20250514', // Deprecated
    label: 'Claude Opus 4 [Deprecated]',
    pubDate: '20250522',
    description: 'Previous flagship model. Deprecated April 14, 2026, retiring June 15, 2026.',
    contextWindow: 200000,
    maxCompletionTokens: 32000,

@@ -351,6 +359,7 @@ export const hardcodedAnthropicModels: (ModelDescriptionSchema & { isLegacy?: bo
    hidden: true, // Deprecated: April 14, 2026 | Retiring: June 15, 2026 | Replacement: claude-sonnet-4-6
    id: 'claude-sonnet-4-20250514', // Deprecated
    label: 'Claude Sonnet 4 [Deprecated]',
    pubDate: '20250522',
    description: 'High-performance model. Deprecated April 14, 2026, retiring June 15, 2026.',
    contextWindow: 200000,
    maxCompletionTokens: 64000,

@@ -379,6 +388,7 @@ export const hardcodedAnthropicModels: (ModelDescriptionSchema & { isLegacy?: bo
  {
    id: 'claude-3-7-sonnet-20250219', // Retired | Deprecated: October 28, 2025 | Retired: February 19, 2026 | Replacement: claude-opus-4-6
    label: 'Claude Sonnet 3.7 [Retired]',
    pubDate: '20250224',
    description: 'High-performance model with early extended thinking. Retired February 19, 2026.',
    contextWindow: 200000,
    maxCompletionTokens: 64000,

@@ -396,6 +406,7 @@ export const hardcodedAnthropicModels: (ModelDescriptionSchema & { isLegacy?: bo
  {
    id: 'claude-3-5-haiku-20241022', // Retired | Deprecated: December 19, 2025 | Retired: February 19, 2026
    label: 'Claude Haiku 3.5 [Retired]',
    pubDate: '20241104',
    description: 'Intelligence at blazing speeds. Retired February 19, 2026.',
    contextWindow: 200000,
    maxCompletionTokens: 8192,

@@ -413,6 +424,7 @@ export const hardcodedAnthropicModels: (ModelDescriptionSchema & { isLegacy?: bo
    hidden: true, // deprecated
    id: 'claude-3-haiku-20240307', // Deprecated | Deprecated: February 19, 2026 | Retiring: April 20, 2026 | Replacement: claude-haiku-4-5-20251001
    label: 'Claude Haiku 3 [Deprecated]',
    pubDate: '20240313',
    description: 'Fast and compact model for near-instant responsiveness. Deprecated February 19, 2026, retiring April 20, 2026.',
    contextWindow: 200000,
    maxCompletionTokens: 4096,
@@ -595,11 +607,13 @@ export function llmsAntCreatePlaceholderModel(model: AnthropicWire_API_Models_Li
  parameterSpecs.push(...ANT_TOOLS);

  const maxInputTokens = model.max_input_tokens;
  const createdAt = model.created_at ? new Date(model.created_at) : undefined;
  return {
    id: model.id,
    idVariant: '::placeholder',
    label: model.display_name,
    created: Math.round(new Date(model.created_at).getTime() / 1000),
    created: createdAt ? Math.round(createdAt.getTime() / 1000) : undefined,
    pubDate: formatPubDate(createdAt), // 0-day: use Anthropic API's created_at, or today if unset
    description: 'Newest model, description not available yet.',
    contextWindow: maxInputTokens ?? 200_000, // report API value as-is (no cap for unknown models)
    maxCompletionTokens: model.max_tokens || 32768,

@@ -755,5 +769,5 @@ export function llmOrtAntLookup_ThinkingVariants(orModelName: string): OrtVendor
    .map((spec) => ({ ...spec }));

  // initialTemperature: not set - Anthropic models use the global fallback (0.5)
  return { interfaces, parameterSpecs };
  return { pubDate: model.pubDate, interfaces, parameterSpecs };
}
@@ -6,7 +6,7 @@ import { Release } from '~/common/app.release';

 import type { ModelDescriptionSchema, OrtVendorLookupResult } from '../llm.server.types';
 import { createVariantInjector, ModelVariantMap } from '../llm.server.variants';
-import { llmDevCheckModels_DEV } from '../models.mappings';
+import { formatPubDate, llmDevCheckModels_DEV } from '../models.mappings';


 // dev options
@@ -186,7 +186,7 @@ const _knownGeminiModels: ({
 symLink?: string,
 deprecated?: string, // Gemini may provide deprecation dates
 // _delete removed - models are now physically removed from the list instead of marked for deletion
-} & Pick<ModelDescriptionSchema, 'interfaces' | 'parameterSpecs' | 'chatPrice' | 'hidden' | 'benchmark'>)[] = [
+} & Pick<ModelDescriptionSchema, 'pubDate' | 'interfaces' | 'parameterSpecs' | 'chatPrice' | 'hidden' | 'benchmark'> & { pubDate: string /* make it required */})[] = [

 /// Generation 3.1

@@ -195,6 +195,7 @@ const _knownGeminiModels: ({
 {
 id: 'models/gemini-3.1-pro-preview',
 labelOverride: 'Gemini 3.1 Pro Preview',
+pubDate: '20260219',
 isPreview: true,
 chatPrice: gemini30ProPricing, // same pricing as 3 Pro
 interfaces: IF_30,
@@ -213,6 +214,7 @@ const _knownGeminiModels: ({
 hidden: true, // specialized variant for custom tool prioritization
 id: 'models/gemini-3.1-pro-preview-customtools',
 labelOverride: 'Gemini 3.1 Pro Preview (Custom Tools)',
+pubDate: '20260219',
 isPreview: true,
 chatPrice: gemini30ProPricing,
 interfaces: IF_30,
@@ -230,6 +232,7 @@ const _knownGeminiModels: ({
 {
 id: 'models/gemini-3.1-flash-image-preview',
 labelOverride: 'Nano Banana 2',
+pubDate: '20260226',
 isPreview: true,
 chatPrice: gemini31FlashImagePricing,
 interfaces: IF_30,
@@ -247,6 +250,7 @@ const _knownGeminiModels: ({
 {
 id: 'models/gemini-3.1-flash-lite-preview',
 labelOverride: 'Gemini 3.1 Flash-Lite Preview',
+pubDate: '20260303',
 isPreview: true,
 chatPrice: gemini31FlashLitePricing,
 interfaces: IF_30,
@@ -268,6 +272,7 @@ const _knownGeminiModels: ({
 hidden: true, // March 9, 2026: API silently routes 'gemini-3-pro-preview' to 'gemini-3.1-pro-preview' - hide to prevent user confusion
 id: 'models/gemini-3-pro-preview',
 labelOverride: 'Gemini 3 Pro Preview',
+pubDate: '20251118',
 isPreview: true,
 deprecated: '2026-03-09',
 chatPrice: gemini30ProPricing,
@@ -286,6 +291,7 @@ const _knownGeminiModels: ({
 {
 id: 'models/gemini-3-pro-image-preview',
 labelOverride: 'Nano Banana Pro', // Marketing name for the technical model ID
+pubDate: '20251120',
 isPreview: true,
 chatPrice: gemini30ProImagePricing,
 interfaces: IF_30,
@@ -301,6 +307,7 @@ const _knownGeminiModels: ({
 {
 id: 'models/nano-banana-pro-preview',
 labelOverride: 'Nano Banana Pro',
+pubDate: '20251120',
 symLink: 'models/gemini-3-pro-image-preview',
 // copied from symlink
 isPreview: true,
@@ -320,6 +327,7 @@ const _knownGeminiModels: ({
 {
 id: 'models/gemini-3-flash-preview',
 labelOverride: 'Gemini 3 Flash Preview',
+pubDate: '20251217',
 isPreview: true,
 chatPrice: gemini30FlashPricing,
 interfaces: IF_30,
@@ -340,6 +348,7 @@ const _knownGeminiModels: ({
 hidden: true, // outperformed by 3.1 Pro (1493) and even 3 Flash (1474) - deprecated in 2 months
 id: 'models/gemini-2.5-pro',
 labelOverride: 'Gemini 2.5 Pro',
+pubDate: '20250617',
 deprecated: '2026-06-17',
 chatPrice: gemini25ProPricing,
 interfaces: IF_25,
@@ -362,6 +371,7 @@ const _knownGeminiModels: ({
 {
 hidden: true, // single-turn-only model - unhide and just send a message to make use of this
 id: 'models/gemini-2.5-pro-preview-tts',
+pubDate: '20250520',
 isPreview: true,
 chatPrice: gemini25ProPreviewTTSPricing,
 interfaces: [
@@ -379,6 +389,7 @@ const _knownGeminiModels: ({
 {
 id: 'models/deep-research-preview-04-2026',
 labelOverride: 'Deep Research Preview (2026-04)',
+pubDate: '20260421',
 isPreview: true,
 chatPrice: gemini25ProPricing, // pricing not explicitly listed; using 2.5 Pro as baseline
 interfaces: [LLM_IF_OAI_Chat, LLM_IF_OAI_Vision, LLM_IF_OAI_Reasoning, LLM_IF_GEM_Interactions],
@@ -391,6 +402,7 @@ const _knownGeminiModels: ({
 {
 id: 'models/deep-research-max-preview-04-2026',
 labelOverride: 'Deep Research Max Preview (2026-04)',
+pubDate: '20260421',
 isPreview: true,
 chatPrice: gemini25ProPricing, // baseline estimate (see note above)
 interfaces: [LLM_IF_OAI_Chat, LLM_IF_OAI_Vision, LLM_IF_OAI_Reasoning, LLM_IF_GEM_Interactions],
@@ -398,11 +410,12 @@ const _knownGeminiModels: ({
 benchmark: undefined, // Deep research model, not benchmarkable on standard tests
 },

-// Deep Research Pro Preview - Released December 12, 2025
+// Deep Research Pro Preview - Released December 11, 2025
 {
 hidden: true, // yield to newer 2026-04 models
 id: 'models/deep-research-pro-preview-12-2025',
 labelOverride: 'Deep Research Pro Preview',
+pubDate: '20251211',
 isPreview: true,
 chatPrice: gemini25ProPricing,
 interfaces: [LLM_IF_OAI_Chat, LLM_IF_OAI_Vision, LLM_IF_OAI_Reasoning, LLM_IF_GEM_Interactions],
@@ -418,6 +431,7 @@ const _knownGeminiModels: ({
 hidden: true, // outperformed by 3 Flash Preview (1474 vs 1411) - deprecated in 2 months
 id: 'models/gemini-2.5-flash',
 labelOverride: 'Gemini 2.5 Flash',
+pubDate: '20250617',
 deprecated: '2026-06-17',
 chatPrice: gemini25FlashPricing,
 interfaces: IF_25,
@@ -445,6 +459,7 @@ const _knownGeminiModels: ({
 {
 id: 'models/gemini-2.5-computer-use-preview-10-2025',
 labelOverride: 'Gemini 2.5 Computer Use Preview 10-2025',
+pubDate: '20251007',
 isPreview: true,
 chatPrice: gemini25ProPricing, // Uses same pricing as 2.5 Pro (pricing page doesn't list separately)
 // NOTE: sweep shows fn=['auto'] only (no 'roundtrip') - partial Fn capability, do not advertise LLM_IF_OAI_Fn
@@ -462,6 +477,7 @@ const _knownGeminiModels: ({
 {
 id: 'models/gemini-robotics-er-1.6-preview',
 labelOverride: 'Gemini Robotics-ER 1.6 Preview',
+pubDate: '20260414',
 isPreview: true,
 chatPrice: geminiRoboticsER16Pricing,
 interfaces: [LLM_IF_OAI_Chat, LLM_IF_OAI_Vision, LLM_IF_OAI_Fn, LLM_IF_OAI_Reasoning],
@@ -474,6 +490,7 @@ const _knownGeminiModels: ({
 hidden: true, // superseded by Robotics-ER 1.6 - shutdown April 30, 2026
 id: 'models/gemini-robotics-er-1.5-preview',
 labelOverride: 'Gemini Robotics-ER 1.5 Preview',
+pubDate: '20250925',
 isPreview: true,
 deprecated: '2026-04-30',
 chatPrice: gemini25FlashPricing, // Uses same pricing as 2.5 Flash per pricing page
@@ -486,6 +503,7 @@ const _knownGeminiModels: ({
 {
 id: 'models/gemini-2.5-flash-image',
 labelOverride: 'Nano Banana',
+pubDate: '20251002',
 deprecated: '2026-10-02',
 chatPrice: { input: 0.30, output: undefined }, // Per pricing page: $0.30 text/image input, $0.039 per image output, but the text output is not stated
 interfaces: [LLM_IF_OAI_Chat, LLM_IF_OAI_Vision],
@@ -506,6 +524,7 @@ const _knownGeminiModels: ({
 hidden: true, // audio outputs are unavailable
 id: 'models/gemini-3.1-flash-tts-preview',
 labelOverride: 'Gemini 3.1 Flash TTS Preview',
+pubDate: '20260415',
 isPreview: true,
 chatPrice: gemini31FlashTTSPricing,
 interfaces: [
@@ -521,6 +540,7 @@ const _knownGeminiModels: ({
 {
 hidden: true, // audio outputs are unavailable as of 2025-05-27
 id: 'models/gemini-2.5-flash-preview-tts',
+pubDate: '20250520',
 isPreview: true,
 chatPrice: gemini25FlashPreviewTTSPricing,
 interfaces: [
@@ -548,6 +568,7 @@ const _knownGeminiModels: ({
 {
 id: 'models/gemini-2.5-flash-lite',
 labelOverride: 'Gemini 2.5 Flash-Lite',
+pubDate: '20250722',
 deprecated: '2026-07-22',
 chatPrice: gemini25FlashLitePricing,
 interfaces: IF_25,
@@ -580,6 +601,7 @@ const _knownGeminiModels: ({
 {
 hidden: true, // outclassed by all Flash models in 2.5/3.x series - shutdown in ~5 weeks
 id: 'models/gemini-2.0-flash-001',
+pubDate: '20250205',
 deprecated: '2026-06-01',
 chatPrice: gemini20FlashPricing,
 interfaces: [LLM_IF_OAI_Chat, LLM_IF_OAI_Vision, LLM_IF_OAI_Fn, LLM_IF_GEM_CodeExecution],
@@ -588,6 +610,7 @@ const _knownGeminiModels: ({
 {
 hidden: true, // outclassed by all Flash models in 2.5/3.x series - shutdown in ~5 weeks
 id: 'models/gemini-2.0-flash',
+pubDate: '20250205',
 symLink: 'models/gemini-2.0-flash-001',
 deprecated: '2026-06-01',
 // copied from symlink
@@ -600,6 +623,7 @@ const _knownGeminiModels: ({
 {
 hidden: true, // outclassed by 2.5/3.1 Flash-Lite - shutdown in ~5 weeks
 id: 'models/gemini-2.0-flash-lite',
+pubDate: '20250225',
 chatPrice: gemini20FlashLitePricing,
 symLink: 'models/gemini-2.0-flash-lite-001',
 deprecated: '2026-06-01',
@@ -609,6 +633,7 @@ const _knownGeminiModels: ({
 {
 hidden: true, // outclassed by 2.5/3.1 Flash-Lite - shutdown in ~5 weeks
 id: 'models/gemini-2.0-flash-lite-001',
+pubDate: '20250225',
 chatPrice: gemini20FlashLitePricing,
 deprecated: '2026-06-01',
 interfaces: [LLM_IF_OAI_Chat, LLM_IF_OAI_Vision, LLM_IF_OAI_Fn],
@@ -648,6 +673,7 @@ const _knownGeminiModels: ({
 // Gemma 4 Models - Released April 2, 2026
 {
 id: 'models/gemma-4-31b-it',
+pubDate: '20260402',
 isPreview: true,
 interfaces: [LLM_IF_OAI_Chat, LLM_IF_OAI_Fn, LLM_IF_HOTFIX_StripImages, LLM_IF_HOTFIX_Sys0ToUsr0],
 parameterSpecs: [{ paramId: 'llmVndGemEffort', enumValues: ['minimal', 'high'] }],
@@ -657,6 +683,7 @@ const _knownGeminiModels: ({
 {
 hidden: true, // smaller MoE variant
 id: 'models/gemma-4-26b-a4b-it',
+pubDate: '20260402',
 isPreview: true,
 interfaces: [LLM_IF_OAI_Chat, LLM_IF_OAI_Fn, LLM_IF_HOTFIX_StripImages, LLM_IF_HOTFIX_Sys0ToUsr0],
 parameterSpecs: [{ paramId: 'llmVndGemEffort', enumValues: ['minimal', 'high'] }],
@@ -667,6 +694,7 @@ const _knownGeminiModels: ({
 // Gemma 3n Model (newer than 3, first seen on the May 2025 update)
 {
 id: 'models/gemma-3n-e4b-it',
+pubDate: '20250626',
 isPreview: true,
 interfaces: [LLM_IF_OAI_Chat, LLM_IF_HOTFIX_StripImages, LLM_IF_HOTFIX_Sys0ToUsr0],
 chatPrice: geminiExpFree, // Free tier only according to pricing page
@@ -674,6 +702,7 @@ const _knownGeminiModels: ({
 },
 {
 id: 'models/gemma-3n-e2b-it',
+pubDate: '20250626',
 isPreview: true,
 interfaces: [LLM_IF_OAI_Chat, LLM_IF_HOTFIX_StripImages, LLM_IF_HOTFIX_Sys0ToUsr0],
 chatPrice: geminiExpFree, // Free tier only according to pricing page
@@ -685,6 +714,7 @@ const _knownGeminiModels: ({
 // - LLM_IF_HOTFIX_Sys0ToUsr0, because: "Developer instruction is not enabled for models/gemma-3-27b-it"
 {
 id: 'models/gemma-3-27b-it',
+pubDate: '20250312',
 isPreview: true,
 interfaces: [LLM_IF_OAI_Chat, LLM_IF_HOTFIX_StripImages, LLM_IF_HOTFIX_Sys0ToUsr0],
 chatPrice: geminiExpFree, // Pricing page indicates free tier only
@@ -694,6 +724,7 @@ const _knownGeminiModels: ({
 {
 hidden: true, // keep larger model
 id: 'models/gemma-3-12b-it',
+pubDate: '20250312',
 isPreview: true,
 interfaces: [LLM_IF_OAI_Chat, LLM_IF_HOTFIX_StripImages, LLM_IF_HOTFIX_Sys0ToUsr0],
 chatPrice: geminiExpFree,
@@ -702,6 +733,7 @@ const _knownGeminiModels: ({
 {
 hidden: true, // keep larger model
 id: 'models/gemma-3-4b-it',
+pubDate: '20250312',
 isPreview: true,
 interfaces: [LLM_IF_OAI_Chat, LLM_IF_HOTFIX_StripImages, LLM_IF_HOTFIX_Sys0ToUsr0],
 chatPrice: geminiExpFree,
@@ -710,6 +742,7 @@ const _knownGeminiModels: ({
 {
 hidden: true, // keep larger model
 id: 'models/gemma-3-1b-it',
+pubDate: '20250312',
 isPreview: true,
 interfaces: [LLM_IF_OAI_Chat, LLM_IF_HOTFIX_StripImages, LLM_IF_HOTFIX_Sys0ToUsr0],
 chatPrice: geminiExpFree,
@@ -948,6 +981,7 @@ export function geminiModelToModelDescription(geminiModel: GeminiWire_API_Models
 label: label,
 // created: ...
 // updated: ...
+pubDate: knownModel?.pubDate ?? formatPubDate(), // 0-day fallback; the editorial entry is the source of truth; today's date is a placeholder until editorial catches up
 description: descriptionLong,
 contextWindow: contextWindow,
 maxCompletionTokens: outputTokenLimit,
@@ -1035,5 +1069,5 @@ export function llmOrtGemLookup(orModelName: string): OrtVendorLookupResult | un
 ?.filter(spec => _ORT_GEM_PARAM_ALLOWLIST.has(spec.paramId))
 .map(spec => ({ ...spec }));

-return { interfaces, parameterSpecs, initialTemperature: GEMINI_DEFAULT_TEMPERATURE };
+return { pubDate: knownModel.pubDate, interfaces, parameterSpecs, initialTemperature: GEMINI_DEFAULT_TEMPERATURE };
 }

@@ -137,6 +137,7 @@ export const ModelDescription_schema = z.object({
 label: z.string(),
 created: z.int().optional(),
 updated: z.int().optional(),
+pubDate: z.string().regex(/^\d{8}$/).optional(), // editorial: model's official public release date 'YYYYMMDD'. Required for editorial entries (KnownModelEditorial) and for 0-day-fillable paths (Anthropic placeholder, Gemini unknown-model fallback). Omitted for dynamic-only vendors and unknown variants where we have no reliable signal.
 description: z.string(),
 contextWindow: z.int().nullable(),
 interfaces: z.array(z.enum(LLMS_ALL_INTERFACES).or(z.string())), // backward compatibility: to not Break client-side interface parsing on newer server
@@ -155,6 +156,7 @@
 // Each vendor's lookup filters to only what works through OpenRouter's OAI-compatible API.
 // OpenRouter merges these with its own auto-detected interfaces and params.
 export type OrtVendorLookupResult = {
+pubDate?: ModelDescriptionSchema['pubDate'];
 interfaces?: ModelDescriptionSchema['interfaces'];
 parameterSpecs?: ModelDescriptionSchema['parameterSpecs'];
 initialTemperature?: number; // vendor-specific default (e.g. Gemini 1.0); undefined = use global fallback (0.5)

@@ -111,6 +111,28 @@ export function llmDevValidateParameterSpecs_DEV(model: ModelDescriptionSchema):
 }


+// -- pubDate helpers --
+
+/**
+ * Format an epoch / Date / nothing as 'YYYYMMDD'.
+ * Accepts either a Unix epoch (seconds), a Date, or undefined (-> today).
+ */
+export function formatPubDate(input?: number | Date): string {
+  let date: Date;
+  if (input instanceof Date && Number.isFinite(input.getTime()))
+    date = input;
+  else if (typeof input === 'number' && Number.isFinite(input) && input > 0) {
+    const candidate = new Date(input * 1000);
+    date = Number.isFinite(candidate.getTime()) ? candidate : new Date();
+  } else
+    date = new Date();
+  const y = date.getUTCFullYear();
+  const m = String(date.getUTCMonth() + 1).padStart(2, '0');
+  const d = String(date.getUTCDate()).padStart(2, '0');
+  return `${y}${m}${d}`;
+}
+
+
 // -- Manual model mappings: types and helper --

 export type ManualMappings = (KnownModel | KnownLink)[];
@@ -224,6 +246,7 @@ export function fromManualMapping(mappings: (KnownModel | KnownLink)[], upstream
 };

 // apply optional fields
+if (m.pubDate) md.pubDate = m.pubDate;
 if (m.parameterSpecs) md.parameterSpecs = m.parameterSpecs;
 if (m.maxCompletionTokens) md.maxCompletionTokens = m.maxCompletionTokens;
 if (m.benchmark) md.benchmark = m.benchmark;

@@ -20,6 +20,7 @@ const _knownDeepseekChatModels: ManualMappings = [
 {
 idPrefix: 'deepseek-v4-pro',
 label: 'DeepSeek V4 Pro',
+pubDate: '20260424',
 description: 'Premium reasoning model with 1M context. Supports extended thinking modes, JSON output, and function calling.',
 contextWindow: 1_048_576, // 1M
 interfaces: [...IF_4, LLM_IF_OAI_Reasoning],
@@ -33,6 +34,7 @@
 {
 idPrefix: 'deepseek-v4-flash',
 label: 'DeepSeek V4 Flash',
+pubDate: '20260424',
 description: 'Fast general-purpose model with 1M context. Supports extended thinking modes, JSON output, and function calling.',
 contextWindow: 1_048_576, // 1M
 interfaces: [...IF_4, LLM_IF_OAI_Reasoning],

@@ -23,6 +23,7 @@ const _knownGroqModels: ManualMappings = [
 isPreview: true,
 idPrefix: 'meta-llama/llama-4-scout-17b-16e-instruct',
 label: 'Llama 4 Scout · 17B × 16E (Preview)',
+pubDate: '20250405',
 description: 'Llama 4 Scout 17B MoE with 16 experts (109B total params), native multimodal with vision support. 131K context, 8K max output. ~750 t/s on Groq.',
 contextWindow: 131072,
 maxCompletionTokens: 8192,
@@ -33,6 +34,7 @@ const _knownGroqModels: ManualMappings = [
 isPreview: true,
 idPrefix: 'qwen/qwen3-32b',
 label: 'Qwen 3 · 32B (Preview)',
+pubDate: '20250428',
 description: 'Qwen3 32B by Alibaba Cloud. Supports thinking/non-thinking modes, 100+ languages. 131K context, 40K max output. ~400 t/s on Groq.',
 contextWindow: 131072,
 maxCompletionTokens: 40960,
@@ -43,6 +45,7 @@ const _knownGroqModels: ManualMappings = [
 isPreview: true,
 idPrefix: 'moonshotai/kimi-k2-instruct-0905',
 label: 'Kimi K2 Instruct 0905 (Preview)',
+pubDate: '20250905',
 description: 'Kimi K2 1T MoE model (32B active, 384 experts). Advanced agentic coding. 262K context, 16K max output. ~200 t/s on Groq.',
 contextWindow: 262144,
 maxCompletionTokens: 16384,
@@ -53,6 +56,7 @@ const _knownGroqModels: ManualMappings = [
 {
 idPrefix: 'moonshotai/kimi-k2-instruct',
 label: 'Kimi K2 Instruct (Deprecated)',
+pubDate: '20250711',
 symLink: 'moonshotai/kimi-k2-instruct-0905',
 contextWindow: 131072, // API returns 131K (vs 262K for the 0905 version)
 maxCompletionTokens: 16384,
@@ -69,6 +73,7 @@ const _knownGroqModels: ManualMappings = [
 {
 idPrefix: 'groq/compound',
 label: 'Compound (Agentic System)',
+pubDate: '20250904',
 description: 'Groq agentic AI with web search, code execution, browser automation. Uses GPT-OSS 120B, Llama 4 Scout, Llama 3.3 70B. Pricing based on underlying model usage.',
 contextWindow: 131072,
 maxCompletionTokens: 8192,
@@ -78,6 +83,7 @@ const _knownGroqModels: ManualMappings = [
 {
 idPrefix: 'groq/compound-mini',
 label: 'Compound Mini (Agentic System)',
+pubDate: '20250904',
 description: 'Lighter Groq agentic AI with web search, code execution. Pricing based on underlying model usage.',
 contextWindow: 131072,
 maxCompletionTokens: 8192,
@@ -89,6 +95,7 @@ const _knownGroqModels: ManualMappings = [
 {
 idPrefix: 'openai/gpt-oss-120b',
 label: 'GPT OSS 120B',
+pubDate: '20250805',
 description: 'OpenAI flagship open-weight MoE (120B total, 5.1B active). Reasoning, browser search, code execution. 131K context, 65K max output. ~500 t/s on Groq.',
 contextWindow: 131072,
 maxCompletionTokens: 65536,
@@ -99,6 +106,7 @@ const _knownGroqModels: ManualMappings = [
 isPreview: true,
 idPrefix: 'openai/gpt-oss-safeguard-20b',
 label: 'GPT OSS Safeguard 20B (Preview)',
+pubDate: '20251029',
 description: 'OpenAI safety classification model (20B MoE). Purpose-built for content moderation with Harmony response format. 131K context, 65K max output. ~1000 t/s on Groq.',
 contextWindow: 131072,
 maxCompletionTokens: 65536,
@@ -108,6 +116,7 @@ const _knownGroqModels: ManualMappings = [
 {
 idPrefix: 'openai/gpt-oss-20b',
 label: 'GPT OSS 20B',
+pubDate: '20250805',
 description: 'OpenAI efficient open-weight MoE (20B total, 3.6B active). Tool use, browser search, code execution. 131K context, 65K max output. ~1000 t/s on Groq.',
 contextWindow: 131072,
 maxCompletionTokens: 65536,
@@ -120,6 +129,7 @@ const _knownGroqModels: ManualMappings = [
 {
 idPrefix: 'llama-3.3-70b-versatile',
 label: 'Llama 3.3 · 70B Versatile',
+pubDate: '20241206',
 description: 'Meta Llama 3.3 (70B params) with GQA. Strong reasoning, coding, multilingual. 131K context, 32K max output. ~280 t/s on Groq.',
 contextWindow: 131072,
 maxCompletionTokens: 32768,
@@ -129,6 +139,7 @@ const _knownGroqModels: ManualMappings = [
 {
 idPrefix: 'llama-3.1-8b-instant',
 label: 'Llama 3.1 · 8B Instant',
+pubDate: '20240723',
 description: 'Meta Llama 3.1 (8B params). Fast, cost-effective for high-volume tasks. 131K context and max output. ~560 t/s on Groq.',
 contextWindow: 131072,
 maxCompletionTokens: 131072,

@@ -22,6 +22,7 @@ const _knownMiniMaxModels: ModelDescriptionSchema[] = [
 {
 id: 'MiniMax-M2.7',
 label: 'MiniMax M2.7',
+pubDate: '20260318',
 description: 'Latest flagship with recursive self-improvement and agentic capabilities. 200K context, 131K max output. ~60 t/s.',
 contextWindow: 204800,
 maxCompletionTokens: 131072,
@@ -31,6 +32,7 @@ const _knownMiniMaxModels: ModelDescriptionSchema[] = [
 {
 id: 'MiniMax-M2.7-highspeed',
 label: 'MiniMax M2.7 (Highspeed)',
+pubDate: '20260318',
 description: 'Faster M2.7 variant at ~100 t/s. 200K context, 131K max output.',
 contextWindow: 204800,
 maxCompletionTokens: 131072,
@@ -42,6 +44,7 @@ const _knownMiniMaxModels: ModelDescriptionSchema[] = [
 {
 id: 'MiniMax-M2.5',
 label: 'MiniMax M2.5',
+pubDate: '20260212',
 description: 'Strong coding and reasoning, best value. 200K context, 65K max output.',
 contextWindow: 204800,
 maxCompletionTokens: 65536,
@@ -51,6 +54,7 @@ const _knownMiniMaxModels: ModelDescriptionSchema[] = [
 {
 id: 'MiniMax-M2.5-highspeed',
 label: 'MiniMax M2.5 (Highspeed)',
+pubDate: '20260212',
 description: 'Faster M2.5 variant at ~100 t/s. 200K context, 65K max output.',
 contextWindow: 204800,
 maxCompletionTokens: 65536,
@@ -62,6 +66,7 @@ const _knownMiniMaxModels: ModelDescriptionSchema[] = [
 {
 id: 'MiniMax-M2-her',
 label: 'MiniMax M2-her',
+pubDate: '20260127',
 description: 'Dialogue-first model for immersive roleplay, character-driven chat, and expressive multi-turn conversations. 64K context.',
 contextWindow: 65536,
 maxCompletionTokens: 2048,
@@ -73,6 +78,7 @@ const _knownMiniMaxModels: ModelDescriptionSchema[] = [
 {
 id: 'MiniMax-M2.1',
 label: 'MiniMax M2.1',
+pubDate: '20251223',
 description: '230B params (10B active), multilingual coding. 200K context, 65K max output.',
 contextWindow: 204800,
 maxCompletionTokens: 65536,
@@ -83,6 +89,7 @@ const _knownMiniMaxModels: ModelDescriptionSchema[] = [
 {
 id: 'MiniMax-M2.1-highspeed',
 label: 'MiniMax M2.1 (Highspeed)',
+pubDate: '20251223',
 description: 'Faster M2.1 variant. 200K context, 65K max output.',
 contextWindow: 204800,
 maxCompletionTokens: 65536,
@@ -95,6 +102,7 @@ const _knownMiniMaxModels: ModelDescriptionSchema[] = [
 {
 id: 'MiniMax-M2',
 label: 'MiniMax M2',
+pubDate: '20251027',
 description: '230B params (10B active), agentic and reasoning. 200K context, 128K max output.',
 contextWindow: 204800,
 maxCompletionTokens: 128000,
@@ -107,6 +115,7 @@ const _knownMiniMaxModels: ModelDescriptionSchema[] = [
 {
 id: 'MiniMax-M1',
 label: 'MiniMax M1',
+pubDate: '20250616',
 description: '456B total / 45.9B active MoE with lightning attention. 1M context, 40K max output.',
 contextWindow: 1000000,
 maxCompletionTokens: 40000,
@@ -119,6 +128,7 @@ const _knownMiniMaxModels: ModelDescriptionSchema[] = [
 {
 id: 'MiniMax-01',
 label: 'MiniMax 01',
+pubDate: '20250114',
 description: 'Legacy flagship. 1M context.',
 contextWindow: 1000192,
 interfaces: [LLM_IF_OAI_Chat, LLM_IF_OAI_Fn],

@@ -19,80 +19,81 @@ const DEV_DEBUG_MISTRAL_MODELS = Release.IsNodeDevBuild; // not in staging to re

 const _knownMistralModelDetails: Record<string, {
 label?: string; // override the API-provided name
+pubDate?: string; // YYYYMMDD - earliest public availability (announcement / La Plateforme / HF upload)
 chatPrice?: { input: number; output: number };
 benchmark?: { cbaElo: number };
 hidden?: boolean;
 }> = {

 // Premier models - Mistral 3 (Dec 2025)
-'mistral-large-2512': { chatPrice: { input: 0.5, output: 1.5 }, benchmark: { cbaElo: 1415 } }, // Mistral Large 3 - MoE 41B active / 675B total
-'mistral-large-2411': { chatPrice: { input: 2, output: 6 }, benchmark: { cbaElo: 1305 }, hidden: true }, // older version
-'mistral-large-latest': { chatPrice: { input: 0.5, output: 1.5 }, hidden: true }, // → 2512
+'mistral-large-2512': { pubDate: '20251202', chatPrice: { input: 0.5, output: 1.5 }, benchmark: { cbaElo: 1415 } }, // Mistral Large 3 - MoE 41B active / 675B total
+'mistral-large-2411': { pubDate: '20241118', chatPrice: { input: 2, output: 6 }, benchmark: { cbaElo: 1305 }, hidden: true }, // older version
+'mistral-large-latest': { pubDate: '20251202', chatPrice: { input: 0.5, output: 1.5 }, hidden: true }, // → 2512

-'mistral-medium-2508': { chatPrice: { input: 0.4, output: 2 }, benchmark: { cbaElo: 1410 } }, // Mistral Medium 3
-'mistral-medium-2505': { chatPrice: { input: 0.4, output: 2 }, benchmark: { cbaElo: 1387 }, hidden: true }, // older version
-'mistral-medium-latest': { chatPrice: { input: 0.4, output: 2 }, hidden: true }, // → 2508
-'mistral-medium': { chatPrice: { input: 0.4, output: 2 }, hidden: true }, // symlink
+'mistral-medium-2508': { pubDate: '20250812', chatPrice: { input: 0.4, output: 2 }, benchmark: { cbaElo: 1410 } }, // Mistral Medium 3.1
+'mistral-medium-2505': { pubDate: '20250507', chatPrice: { input: 0.4, output: 2 }, benchmark: { cbaElo: 1387 }, hidden: true }, // Mistral Medium 3
+'mistral-medium-latest': { pubDate: '20250812', chatPrice: { input: 0.4, output: 2 }, hidden: true }, // → 2508
+'mistral-medium': { pubDate: '20231211', chatPrice: { input: 0.4, output: 2 }, hidden: true }, // symlink (legacy: original Mistral Medium prototype on La Plateforme beta)

-'magistral-medium-2509': { chatPrice: { input: 2, output: 5 }, benchmark: { cbaElo: 1304 } }, // reasoning (leaderboard: magistral-medium-2506 = 1304)
-'magistral-medium-latest': { chatPrice: { input: 2, output: 5 }, hidden: true }, // symlink
+'magistral-medium-2509': { pubDate: '20250917', chatPrice: { input: 2, output: 5 }, benchmark: { cbaElo: 1304 } }, // reasoning (leaderboard: magistral-medium-2506 = 1304)
+'magistral-medium-latest': { pubDate: '20250917', chatPrice: { input: 2, output: 5 }, hidden: true }, // symlink

-'devstral-2512': { label: 'Devstral 2 (2512)', chatPrice: { input: 0.4, output: 2 } }, // Devstral 2 - 123B coding agents (API returns "Mistral Vibe Cli")
-'devstral-latest': { label: 'Devstral 2 (latest)', chatPrice: { input: 0.4, output: 2 }, hidden: true }, // symlink
-'devstral-medium-latest': { label: 'Devstral 2 (latest)', chatPrice: { input: 0.4, output: 2 }, hidden: true }, // symlink
-'mistral-vibe-cli-latest': { label: 'Devstral 2 (latest)', chatPrice: { input: 0.4, output: 2 }, hidden: true }, // alternate ID for devstral-latest
-'devstral-medium-2507': { chatPrice: { input: 0.4, output: 2 }, hidden: true }, // older version
+'devstral-2512': { label: 'Devstral 2 (2512)', pubDate: '20251209', chatPrice: { input: 0.4, output: 2 } }, // Devstral 2 - 123B coding agents (API returns "Mistral Vibe Cli")
+'devstral-latest': { label: 'Devstral 2 (latest)', pubDate: '20251209', chatPrice: { input: 0.4, output: 2 }, hidden: true }, // symlink
+'devstral-medium-latest': { label: 'Devstral 2 (latest)', pubDate: '20251209', chatPrice: { input: 0.4, output: 2 }, hidden: true }, // symlink
+'mistral-vibe-cli-latest': { label: 'Devstral 2 (latest)', pubDate: '20251209', chatPrice: { input: 0.4, output: 2 }, hidden: true }, // alternate ID for devstral-latest
+'devstral-medium-2507': { pubDate: '20250710', chatPrice: { input: 0.4, output: 2 }, hidden: true }, // older version

-'mistral-large-pixtral-2411': { chatPrice: { input: 2, output: 6 } }, // Pixtral Large (alternate ID)
-'pixtral-large-2411': { chatPrice: { input: 2, output: 6 }, hidden: true }, // symlink
-'pixtral-large-latest': { chatPrice: { input: 2, output: 6 }, hidden: true }, // symlink
+'mistral-large-pixtral-2411': { pubDate: '20241118', chatPrice: { input: 2, output: 6 } }, // Pixtral Large (alternate ID)
+'pixtral-large-2411': { pubDate: '20241118', chatPrice: { input: 2, output: 6 }, hidden: true }, // symlink
+'pixtral-large-latest': { pubDate: '20241118', chatPrice: { input: 2, output: 6 }, hidden: true }, // symlink

-'codestral-2508': { chatPrice: { input: 0.3, output: 0.9 } }, // code generation
-'codestral-latest': { chatPrice: { input: 0.3, output: 0.9 }, hidden: true }, // symlink
+'codestral-2508': { pubDate: '20250730', chatPrice: { input: 0.3, output: 0.9 } }, // code generation (Codestral 25.08)
+'codestral-latest': { pubDate: '20250730', chatPrice: { input: 0.3, output: 0.9 }, hidden: true }, // symlink

-'voxtral-small-2507': { chatPrice: { input: 0.1, output: 0.3 } }, // voice (text tokens)
-'voxtral-small-latest': { chatPrice: { input: 0.1, output: 0.3 }, hidden: true }, // symlink
+'voxtral-small-2507': { pubDate: '20250715', chatPrice: { input: 0.1, output: 0.3 } }, // voice (text tokens)
+'voxtral-small-latest': { pubDate: '20250715', chatPrice: { input: 0.1, output: 0.3 }, hidden: true }, // symlink

-'voxtral-mini-2507': { chatPrice: { input: 0.04, output: 0.04 } }, // voice (text tokens)
-'voxtral-mini-latest': { chatPrice: { input: 0.04, output: 0.04 }, hidden: true }, // symlink
+'voxtral-mini-2507': { pubDate: '20250715', chatPrice: { input: 0.04, output: 0.04 } }, // voice (text tokens)
+'voxtral-mini-latest': { pubDate: '20250715', chatPrice: { input: 0.04, output: 0.04 }, hidden: true }, // symlink

 // Ministral 3 family (Dec 2025) - multimodal, multilingual, Apache 2.0
-'ministral-14b-2512': { chatPrice: { input: 0.2, output: 0.2 } }, // Ministral 3 14B
-'ministral-14b-latest': { chatPrice: { input: 0.2, output: 0.2 }, hidden: true }, // symlink
+'ministral-14b-2512': { pubDate: '20251202', chatPrice: { input: 0.2, output: 0.2 } }, // Ministral 3 14B
+'ministral-14b-latest': { pubDate: '20251202', chatPrice: { input: 0.2, output: 0.2 }, hidden: true }, // symlink

-'ministral-8b-2512': { chatPrice: { input: 0.15, output: 0.15 } }, // Ministral 3 8B
-'ministral-8b-2410': { chatPrice: { input: 0.1, output: 0.1 }, benchmark: { cbaElo: 1237 }, hidden: true }, // older version
-'ministral-8b-latest': { chatPrice: { input: 0.15, output: 0.15 }, hidden: true }, // symlink
+'ministral-8b-2512': { pubDate: '20251202', chatPrice: { input: 0.15, output: 0.15 } }, // Ministral 3 8B
+'ministral-8b-2410': { pubDate: '20241016', chatPrice: { input: 0.1, output: 0.1 }, benchmark: { cbaElo: 1237 }, hidden: true }, // older version
+'ministral-8b-latest': { pubDate: '20251202', chatPrice: { input: 0.15, output: 0.15 }, hidden: true }, // symlink

-'ministral-3b-2512': { chatPrice: { input: 0.1, output: 0.1 } }, // Ministral 3 3B
-'ministral-3b-2410': { chatPrice: { input: 0.04, output: 0.04 }, hidden: true }, // older version
-'ministral-3b-latest': { chatPrice: { input: 0.1, output: 0.1 }, hidden: true }, // symlink
+'ministral-3b-2512': { pubDate: '20251202', chatPrice: { input: 0.1, output: 0.1 } }, // Ministral 3 3B
+'ministral-3b-2410': { pubDate: '20241016', chatPrice: { input: 0.04, output: 0.04 }, hidden: true }, // older version
+'ministral-3b-latest': { pubDate: '20251202', chatPrice: { input: 0.1, output: 0.1 }, hidden: true }, // symlink
|
||||
|
||||
// Open models
|
||||
'mistral-small-2603': { chatPrice: { input: 0.15, output: 0.6 } }, // Mistral Small 4 - 119B hybrid (instruct+reasoning+coding), 256k ctx
|
||||
'mistral-small-2506': { chatPrice: { input: 0.1, output: 0.3 }, benchmark: { cbaElo: 1357 }, hidden: true }, // Mistral Small 3.2
|
||||
'mistral-small-latest': { chatPrice: { input: 0.15, output: 0.6 }, hidden: true }, // → 2603
|
||||
'mistral-small-2603': { pubDate: '20260316', chatPrice: { input: 0.15, output: 0.6 } }, // Mistral Small 4 - 119B hybrid (instruct+reasoning+coding), 256k ctx
|
||||
'mistral-small-2506': { pubDate: '20250620', chatPrice: { input: 0.1, output: 0.3 }, benchmark: { cbaElo: 1357 }, hidden: true }, // Mistral Small 3.2
|
||||
'mistral-small-latest': { pubDate: '20260316', chatPrice: { input: 0.15, output: 0.6 }, hidden: true }, // → 2603
|
||||
|
||||
'labs-mistral-small-creative': { label: 'Mistral Small Creative', chatPrice: { input: 0.1, output: 0.3 } }, // creative writing, roleplay (Labs)
|
||||
'labs-mistral-small-creative': { label: 'Mistral Small Creative', pubDate: '20251211', chatPrice: { input: 0.1, output: 0.3 } }, // creative writing, roleplay (Labs)
|
||||
|
||||
'labs-leanstral-2603': { label: 'Leanstral (2603)', chatPrice: { input: 0, output: 0 } }, // Lean 4 formal proof engineering (Labs, free for limited period)
|
||||
'labs-leanstral-2603': { label: 'Leanstral (2603)', pubDate: '20260316', chatPrice: { input: 0, output: 0 } }, // Lean 4 formal proof engineering (Labs, free for limited period)
|
||||
|
||||
'magistral-small-2509': { chatPrice: { input: 0.5, output: 1.5 } }, // reasoning
|
||||
'magistral-small-latest': { chatPrice: { input: 0.5, output: 1.5 }, hidden: true }, // symlink
|
||||
'magistral-small-2509': { pubDate: '20250917', chatPrice: { input: 0.5, output: 1.5 } }, // reasoning
|
||||
'magistral-small-latest': { pubDate: '20250917', chatPrice: { input: 0.5, output: 1.5 }, hidden: true }, // symlink
|
||||
|
||||
'labs-devstral-small-2512': { label: 'Devstral Small 2 (2512)', chatPrice: { input: 0.1, output: 0.3 } }, // Devstral Small 2 - 24B coding agents (Labs)
|
||||
'devstral-small-2507': { chatPrice: { input: 0.1, output: 0.3 }, hidden: true }, // older version
|
||||
'devstral-small-latest': { label: 'Devstral Small 2 (latest)', chatPrice: { input: 0.1, output: 0.3 }, hidden: true }, // symlink
|
||||
'labs-devstral-small-2512': { label: 'Devstral Small 2 (2512)', pubDate: '20251209', chatPrice: { input: 0.1, output: 0.3 } }, // Devstral Small 2 - 24B coding agents (Labs)
|
||||
'devstral-small-2507': { pubDate: '20250710', chatPrice: { input: 0.1, output: 0.3 }, hidden: true }, // older version (Devstral Small 1.1)
|
||||
'devstral-small-latest': { label: 'Devstral Small 2 (latest)', pubDate: '20251209', chatPrice: { input: 0.1, output: 0.3 }, hidden: true }, // symlink
|
||||
|
||||
'pixtral-12b-2409': { chatPrice: { input: 0.15, output: 0.15 } }, // vision
|
||||
'pixtral-12b-latest': { chatPrice: { input: 0.15, output: 0.15 }, hidden: true }, // symlink
|
||||
'pixtral-12b': { chatPrice: { input: 0.15, output: 0.15 }, hidden: true }, // symlink
|
||||
'pixtral-12b-2409': { pubDate: '20240911', chatPrice: { input: 0.15, output: 0.15 } }, // vision
|
||||
'pixtral-12b-latest': { pubDate: '20240911', chatPrice: { input: 0.15, output: 0.15 }, hidden: true }, // symlink
|
||||
'pixtral-12b': { pubDate: '20240911', chatPrice: { input: 0.15, output: 0.15 }, hidden: true }, // symlink
|
||||
|
||||
'open-mistral-nemo-2407': { chatPrice: { input: 0.15, output: 0.15 } }, // NeMo
|
||||
'open-mistral-nemo': { chatPrice: { input: 0.15, output: 0.15 }, hidden: true }, // symlink
|
||||
'open-mistral-nemo-2407': { pubDate: '20240718', chatPrice: { input: 0.15, output: 0.15 } }, // NeMo
|
||||
'open-mistral-nemo': { pubDate: '20240718', chatPrice: { input: 0.15, output: 0.15 }, hidden: true }, // symlink
|
||||
|
||||
// Legacy (kept for reference, no longer in API)
|
||||
'open-mistral-7b': { chatPrice: { input: 0.25, output: 0.25 }, hidden: true },
|
||||
'open-mistral-7b': { pubDate: '20230927', chatPrice: { input: 0.25, output: 0.25 }, hidden: true },
|
||||
};
|
||||
|
||||
|
||||
|
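The `pubDate` values threaded through the mappings above all use the compact `'YYYYMMDD'` wire format validated by `/^\d{8}$/`. A minimal sketch of validating and parsing such a value; `parsePubDate` is an illustrative helper, not an existing function in the codebase:

```typescript
// Hypothetical helper: validate and parse a wire-format pubDate ('YYYYMMDD').
// Returns null for anything that does not match the schema's /^\d{8}$/ check.
function parsePubDate(pubDate: string): Date | null {
  if (!/^\d{8}$/.test(pubDate)) return null;
  const year = Number(pubDate.slice(0, 4));
  const month = Number(pubDate.slice(4, 6)); // 1-12 on the wire
  const day = Number(pubDate.slice(6, 8));
  return new Date(Date.UTC(year, month - 1, day)); // Date.UTC months are zero-based
}

// e.g. the Codestral 25.08 entry above: pubDate '20250730' → 2025-07-30 UTC
const codestralDate = parsePubDate('20250730');
console.log(codestralDate?.toISOString().slice(0, 10)); // → 2025-07-30
```

Note that the optionality of `pubDate` in the schema means callers must tolerate `null`/missing values rather than assume every model carries one.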
||||
@@ -28,7 +28,8 @@ const _PS_Reasoning: ModelDescriptionSchema['parameterSpecs'] = [
  * Moonshot AI (Kimi) models.
  * - models list and pricing: https://platform.kimi.ai/docs/pricing/chat (was platform.moonshot.ai - now 301 redirect)
  * - API docs: https://platform.kimi.ai/docs/api/chat
- * - updated: 2026-04-20
+ * - updated: 2026-05-04
+ * - NOTE: K2 series (non-2.5/2.6) is scheduled for discontinuation on 2026-05-25 per Moonshot docs.
  */
 const _knownMoonshotModels: ManualMappings = [

@@ -36,6 +37,7 @@ const _knownMoonshotModels: ManualMappings = [
   {
     idPrefix: 'kimi-k2.6',
     label: 'Kimi K2.6',
+    pubDate: '20260420',
     description: 'Native multimodal flagship (text, image, video inputs) with thinking and non-thinking modes. Stronger long-form coding, improved instruction compliance and self-correction. 256K context.',
     contextWindow: 262144,
     maxCompletionTokens: 32768,
@@ -49,6 +51,7 @@ const _knownMoonshotModels: ManualMappings = [
   {
     idPrefix: 'kimi-k2.5',
     label: 'Kimi K2.5',
+    pubDate: '20260127',
     description: 'Supports vision (images/videos), thinking mode, and Agent tasks. 256K context.',
     contextWindow: 262144,
     maxCompletionTokens: 32768,
@@ -58,12 +61,13 @@ const _knownMoonshotModels: ManualMappings = [
     benchmark: { cbaElo: 1451 }, // kimi-k2.5-thinking
   },

-  // Kimi K2 Series - Latest Models
+  // Kimi K2 Series - scheduled for discontinuation on 2026-05-25

   // Fast, Thinking
   {
     idPrefix: 'kimi-k2-thinking-turbo',
     label: 'Kimi K2 Thinking Turbo',
+    pubDate: '20251106',
     description: 'High-speed reasoning model with advanced thinking and tool calling capabilities. Faster inference (~50 tok/s) with optimized performance. 256K context. Temperature 1.0 recommended.',
     contextWindow: 262144,
     maxCompletionTokens: 65536,
@@ -76,6 +80,7 @@ const _knownMoonshotModels: ManualMappings = [
   {
     idPrefix: 'kimi-k2-thinking',
     label: 'Kimi K2 Thinking',
+    pubDate: '20251106',
     description: 'Advanced reasoning model with multi-step thinking and autonomous tool calling (200-300 sequential calls). Interleaves chain-of-thought with tool use. 256K context. Temperature 1.0 recommended.',
     contextWindow: 262144,
     maxCompletionTokens: 65536,
@@ -89,6 +94,7 @@ const _knownMoonshotModels: ManualMappings = [
   {
     idPrefix: 'kimi-k2-0905-preview',
     label: 'Kimi K2 0905 (Preview)',
+    pubDate: '20250905',
     description: 'State-of-the-art MoE model (1T total, 32B active) with extended 256K context. Enhanced agentic coding intelligence and improved instruction following.',
     contextWindow: 262144,
     maxCompletionTokens: 32768,
@@ -102,6 +108,7 @@ const _knownMoonshotModels: ManualMappings = [
     hidden: true,
     idPrefix: 'kimi-k2-0711-preview',
     label: 'Kimi K2 0711 (Preview)',
+    pubDate: '20250711',
     description: 'Earlier preview variant with 128K context. Superseded by 0905 version.',
     contextWindow: 131072,
     maxCompletionTokens: 16384,
@@ -114,6 +121,7 @@ const _knownMoonshotModels: ManualMappings = [
   {
     idPrefix: 'kimi-k2-turbo-preview',
     label: 'Kimi K2 Turbo (Preview)',
+    pubDate: '20250801',
     description: 'High-speed variant with 60-100 tokens/second output. 256K context. Optimized for real-time applications and agentic tasks.',
     contextWindow: 262144,
     maxCompletionTokens: 32768,
@@ -127,6 +135,7 @@ const _knownMoonshotModels: ManualMappings = [
   {
     idPrefix: 'moonshot-v1-128k',
     label: 'V1 128K',
+    pubDate: '20240206',
     description: 'Legacy V1 model with 128K context. Deprecated - use Kimi K2 Instruct instead.',
     contextWindow: 131072,
     interfaces: [LLM_IF_OAI_Chat, LLM_IF_OAI_Fn],
@@ -136,6 +145,7 @@ const _knownMoonshotModels: ManualMappings = [
   {
     idPrefix: 'moonshot-v1-32k',
     label: 'V1 32K',
+    pubDate: '20240206',
     description: 'Legacy V1 model with 32K context. Deprecated - use Kimi K2 Instruct instead.',
     contextWindow: 32768,
     interfaces: [LLM_IF_OAI_Chat, LLM_IF_OAI_Fn],
@@ -145,6 +155,7 @@ const _knownMoonshotModels: ManualMappings = [
   {
     idPrefix: 'moonshot-v1-8k',
     label: 'V1 8K',
+    pubDate: '20240206',
     description: 'Legacy V1 model with 8K context. Deprecated - use Kimi K2 Instruct instead.',
     contextWindow: 8192,
     interfaces: [LLM_IF_OAI_Chat, LLM_IF_OAI_Fn],
@@ -157,6 +168,7 @@ const _knownMoonshotModels: ManualMappings = [
     // hidden: false, not hidden - only non-hidden vision for now
     idPrefix: 'moonshot-v1-128k-vision-preview',
     label: 'V1 128K Vision (Preview)',
+    pubDate: '20250115',
     description: 'Legacy vision model with 128K context. Preview variant - use moonshot-v1-vision for production.',
     contextWindow: 131072,
     interfaces: [LLM_IF_OAI_Chat, LLM_IF_OAI_Vision],
@@ -166,6 +178,7 @@ const _knownMoonshotModels: ManualMappings = [
   {
     idPrefix: 'moonshot-v1-32k-vision-preview',
     label: 'V1 32K Vision (Preview)',
+    pubDate: '20250115',
     description: 'Legacy vision model with 32K context. Preview variant - use moonshot-v1-vision for production.',
     contextWindow: 32768,
     interfaces: [LLM_IF_OAI_Chat, LLM_IF_OAI_Vision],
@@ -176,6 +189,7 @@ const _knownMoonshotModels: ManualMappings = [
   {
     idPrefix: 'moonshot-v1-8k-vision-preview',
     label: 'V1 8K Vision (Preview)',
+    pubDate: '20250115',
     description: 'Legacy vision model with 8K context. Preview variant - use moonshot-v1-vision for production.',
     contextWindow: 8192,
     interfaces: [LLM_IF_OAI_Chat, LLM_IF_OAI_Vision],
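Because the `'YYYYMMDD'` strings above are zero-padded and fixed-width, plain string comparison orders them chronologically, so a catalog can be sorted newest-first without parsing dates. A sketch under that assumption; the `ModelLike` shape and `sortByPubDate` helper are illustrative, not part of the codebase:

```typescript
// Illustrative shape: just the fields needed for ordering.
interface ModelLike { id: string; pubDate?: string; }

// Sort newest-first; entries missing the optional pubDate sink to the end,
// since '' compares lower than any 'YYYYMMDD' string.
function sortByPubDate(models: ModelLike[]): ModelLike[] {
  return [...models].sort((a, b) => (b.pubDate ?? '').localeCompare(a.pubDate ?? ''));
}

const ordered = sortByPubDate([
  { id: 'kimi-k2-0905-preview', pubDate: '20250905' },
  { id: 'kimi-k2-thinking', pubDate: '20251106' },
  { id: 'moonshot-v1-8k' }, // no pubDate → last
]);
console.log(ordered.map(m => m.id));
// → ['kimi-k2-thinking', 'kimi-k2-0905-preview', 'moonshot-v1-8k']
```

The same fixed-width property is what lets symlink IDs (e.g. `-latest` aliases) simply copy the `pubDate` of their target without any normalization step.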
@@ -111,6 +111,7 @@ export const _knownOpenAIChatModels: ManualMappings = [
   {
     idPrefix: 'gpt-5.5-2026-04-23',
     label: 'GPT-5.5 (2026-04-23)',
+    pubDate: '20260423',
     description: 'New baseline for complex production workflows. Stronger task execution, more precise tool use, more efficient reasoning with fewer tokens. 1M token context.',
     contextWindow: 1050000,
     maxCompletionTokens: 128000,
@@ -136,6 +137,7 @@ export const _knownOpenAIChatModels: ManualMappings = [
   {
     idPrefix: 'gpt-5.5-pro-2026-04-23',
     label: 'GPT-5.5 Pro (2026-04-23)',
+    pubDate: '20260423',
     description: 'Most capable model for complex tasks. Uses more compute for smarter, more precise responses on the hardest problems.',
     contextWindow: 1050000,
     maxCompletionTokens: 272000,
@@ -163,6 +165,7 @@ export const _knownOpenAIChatModels: ManualMappings = [
   {
     idPrefix: 'gpt-5.4-2026-03-05',
     label: 'GPT-5.4 (2026-03-05)',
+    pubDate: '20260305',
     description: 'Most capable and efficient frontier model for professional work. Native computer use, improved reasoning, coding, and agentic workflows with 1M token context.',
     contextWindow: 1050000,
     maxCompletionTokens: 128000,
@@ -188,6 +191,7 @@ export const _knownOpenAIChatModels: ManualMappings = [
   {
     idPrefix: 'gpt-5.4-pro-2026-03-05',
     label: 'GPT-5.4 Pro (2026-03-05)',
+    pubDate: '20260305',
     description: 'Most capable model for complex tasks. Uses more compute for smarter, more precise responses on difficult problems.',
     contextWindow: 1050000,
     maxCompletionTokens: 272000,
@@ -212,6 +216,7 @@ export const _knownOpenAIChatModels: ManualMappings = [
   {
     idPrefix: 'gpt-5.4-mini-2026-03-17',
     label: 'GPT-5.4 Mini (2026-03-17)',
+    pubDate: '20260317',
     description: 'Strongest mini model for coding, computer use, and subagents. GPT-5.4-class intelligence at lower cost and latency.',
     contextWindow: 400000,
     maxCompletionTokens: 128000,
@@ -237,6 +242,7 @@ export const _knownOpenAIChatModels: ManualMappings = [
   {
     idPrefix: 'gpt-5.4-nano-2026-03-17',
     label: 'GPT-5.4 Nano (2026-03-17)',
+    pubDate: '20260317',
     description: 'Cheapest GPT-5.4-class model for simple high-volume tasks like classification and data extraction.',
     contextWindow: 400000,
     maxCompletionTokens: 128000,
@@ -265,6 +271,7 @@ export const _knownOpenAIChatModels: ManualMappings = [
   {
     idPrefix: 'gpt-5.3-codex',
     label: 'GPT-5.3 Codex',
+    pubDate: '20260205',
     description: 'Most capable agentic coding model. Combines frontier coding performance of GPT-5.2-Codex with reasoning and professional knowledge of GPT-5.2. ~25% faster.',
     contextWindow: 400000,
     maxCompletionTokens: 128000,
@@ -285,6 +292,7 @@ export const _knownOpenAIChatModels: ManualMappings = [
     hidden: true, // Research preview, ChatGPT Pro only - API access limited to design partners
     idPrefix: 'gpt-5.3-codex-spark',
     label: 'GPT-5.3 Codex Spark',
+    pubDate: '20260212',
     description: 'Text-only research preview optimized for real-time coding iteration. Delivers 1000+ tokens/sec on low-latency hardware.',
     contextWindow: 128000,
     maxCompletionTokens: 16384,
@@ -297,10 +305,11 @@ export const _knownOpenAIChatModels: ManualMappings = [
     // benchmark: TBD
   },

-  // GPT-5.3 Chat Latest - Released March 4, 2026
+  // GPT-5.3 Chat Latest - Released March 3, 2026
   {
     idPrefix: 'gpt-5.3-chat-latest',
     label: 'GPT-5.3 Instant',
+    pubDate: '20260303',
     description: 'GPT-5.3 model powering ChatGPT. Points to the GPT-5.3 Instant snapshot currently used in ChatGPT.',
     contextWindow: 128000,
     maxCompletionTokens: 16384,
@@ -322,6 +331,7 @@ export const _knownOpenAIChatModels: ManualMappings = [
     hidden: true, // superseded by GPT-5.4/5.5
     idPrefix: 'gpt-5.2-2025-12-11',
     label: 'GPT-5.2 (2025-12-11)',
+    pubDate: '20251211',
     description: 'Most capable model for professional work and long-running agents. Improvements in general intelligence, long-context, agentic tool-calling, and vision.',
     contextWindow: 400000,
     maxCompletionTokens: 128000,
@@ -349,6 +359,7 @@ export const _knownOpenAIChatModels: ManualMappings = [
     hidden: true, // superseded by GPT-5.3 Codex
     idPrefix: 'gpt-5.2-codex',
     label: 'GPT-5.2 Codex',
+    pubDate: '20251211',
     description: 'GPT-5.2 optimized for long-horizon, agentic coding tasks in Codex or similar environments. Supports low, medium, high, and xhigh reasoning effort settings.',
     contextWindow: 400000,
     maxCompletionTokens: 128000,
@@ -368,6 +379,7 @@ export const _knownOpenAIChatModels: ManualMappings = [
     hidden: true, // superseded by GPT-5.3 Instant
     idPrefix: 'gpt-5.2-chat-latest',
     label: 'GPT-5.2 Instant',
+    pubDate: '20251211',
     description: 'GPT-5.2 model powering ChatGPT. Fast, capable for everyday work with clear improvements in info-seeking, how-tos, technical writing.',
     contextWindow: 128000,
     maxCompletionTokens: 16384,
@@ -387,6 +399,7 @@ export const _knownOpenAIChatModels: ManualMappings = [
     hidden: true, // superseded by GPT-5.4/5.5 Pro
     idPrefix: 'gpt-5.2-pro-2025-12-11',
     label: 'GPT-5.2 Pro (2025-12-11)',
+    pubDate: '20251211',
     description: 'Smartest and most trustworthy option for difficult questions. Uses more compute for harder thinking on complex domains like programming.',
     contextWindow: 400000,
     maxCompletionTokens: 272000,
@@ -416,6 +429,7 @@ export const _knownOpenAIChatModels: ManualMappings = [
     hidden: true, // superseded by GPT-5.4/5.5
     idPrefix: 'gpt-5.1-2025-11-13',
     label: 'GPT-5.1 (2025-11-13)',
+    pubDate: '20251113',
     description: 'The best model for coding and agentic tasks with configurable reasoning effort.',
     contextWindow: 400000,
     maxCompletionTokens: 128000,
@@ -442,6 +456,7 @@ export const _knownOpenAIChatModels: ManualMappings = [
     hidden: true, // superseded by GPT-5.3 Instant
     idPrefix: 'gpt-5.1-chat-latest',
     label: 'GPT-5.1 Instant',
+    pubDate: '20251112',
     description: 'GPT-5.1 Instant with adaptive reasoning. More conversational with improved instruction following.',
     contextWindow: 128000,
     maxCompletionTokens: 16384,
@@ -462,6 +477,7 @@ export const _knownOpenAIChatModels: ManualMappings = [
     hidden: true, // superseded by GPT-5.3 Codex
     idPrefix: 'gpt-5.1-codex-max',
     label: 'GPT-5.1 Codex Max',
+    pubDate: '20251119',
     description: 'Our most intelligent coding model optimized for long-horizon, agentic coding tasks.',
     contextWindow: 400000,
     maxCompletionTokens: 128000,
@@ -480,6 +496,7 @@ export const _knownOpenAIChatModels: ManualMappings = [
     hidden: true, // superseded by GPT-5.3 Codex
     idPrefix: 'gpt-5.1-codex',
     label: 'GPT-5.1 Codex',
+    pubDate: '20251113',
     description: 'A version of GPT-5.1 optimized for agentic coding tasks in Codex or similar environments.',
     contextWindow: 400000,
     maxCompletionTokens: 128000,
@@ -498,6 +515,7 @@ export const _knownOpenAIChatModels: ManualMappings = [
     hidden: true, // superseded by GPT-5.3 Codex
     idPrefix: 'gpt-5.1-codex-mini',
     label: 'GPT-5.1 Codex Mini',
+    pubDate: '20251113',
     description: 'Smaller, faster version of GPT-5.1 Codex for efficient coding tasks.',
     contextWindow: 400000,
     maxCompletionTokens: 128000,
@@ -520,6 +538,7 @@ export const _knownOpenAIChatModels: ManualMappings = [
     hidden: true, // superseded by GPT-5.4/5.5
     idPrefix: 'gpt-5-2025-08-07',
     label: 'GPT-5 (2025-08-07)',
+    pubDate: '20250807',
     description: 'The best model for coding and agentic tasks across domains.',
     contextWindow: 400000,
     maxCompletionTokens: 128000,
@@ -546,6 +565,7 @@ export const _knownOpenAIChatModels: ManualMappings = [
     hidden: true, // superseded by GPT-5.4/5.5 Pro
     idPrefix: 'gpt-5-pro-2025-10-06',
     label: 'GPT-5 Pro (2025-10-06)',
+    pubDate: '20251006',
     description: 'Version of GPT-5 that uses more compute to produce smarter and more precise responses. Designed for tough problems.',
     contextWindow: 400000,
     maxCompletionTokens: 272000,
@@ -566,6 +586,7 @@ export const _knownOpenAIChatModels: ManualMappings = [
     hidden: true, // deprecated per OpenAI docs (2026-04)
     idPrefix: 'gpt-5-chat-latest',
     label: 'GPT-5 ChatGPT (Non-Thinking)',
+    pubDate: '20250807',
     description: 'GPT-5 model used in ChatGPT. Points to the GPT-5 snapshot currently used in ChatGPT.',
     contextWindow: 128000,
     maxCompletionTokens: 16384,
@@ -580,6 +601,7 @@ export const _knownOpenAIChatModels: ManualMappings = [
     hidden: true, // deprecated per OpenAI docs (2026-04), superseded by gpt-5.1-codex/gpt-5.3-codex
     idPrefix: 'gpt-5-codex',
     label: 'GPT-5 Codex',
+    pubDate: '20250915',
     description: 'A version of GPT-5 optimized for agentic coding in Codex.',
     contextWindow: 400000,
     maxCompletionTokens: 128000,
@@ -599,6 +621,7 @@ export const _knownOpenAIChatModels: ManualMappings = [
     hidden: true, // poor quality - use llmVndOaiWebSearchContext on regular models instead
     idPrefix: 'gpt-5-search-api-2025-10-14',
     label: 'GPT-5 Search API (2025-10-14)',
+    pubDate: '20251014',
     description: 'Updated web search model in Chat Completions API. 60% cheaper with domain filtering support.',
     contextWindow: 400000,
     maxCompletionTokens: 100000,
@@ -619,6 +642,7 @@ export const _knownOpenAIChatModels: ManualMappings = [
     hidden: true, // superseded by GPT-5.4 Mini
     idPrefix: 'gpt-5-mini-2025-08-07',
     label: 'GPT-5 Mini (2025-08-07)',
+    pubDate: '20250807',
     description: 'A faster, more cost-efficient version of GPT-5 for well-defined tasks.',
     contextWindow: 400000,
     maxCompletionTokens: 128000,
@@ -639,6 +663,7 @@ export const _knownOpenAIChatModels: ManualMappings = [
     hidden: true, // superseded by GPT-5.4 Nano
     idPrefix: 'gpt-5-nano-2025-08-07',
     label: 'GPT-5 Nano (2025-08-07)',
+    pubDate: '20250807',
     description: 'Fastest, most cost-efficient version of GPT-5 for summarization and classification tasks.',
     contextWindow: 400000,
     maxCompletionTokens: 128000,
@@ -679,6 +704,7 @@ export const _knownOpenAIChatModels: ManualMappings = [
     hidden: true, // UNSUPPORTED YET
     idPrefix: 'computer-use-preview-2025-03-11',
     label: 'Computer Use Preview (2025-03-11)',
+    pubDate: '20250311',
     description: 'Specialized model for computer use tool. Optimized for computer interaction capabilities.',
     contextWindow: 8192,
     maxCompletionTokens: 1024,
@@ -700,6 +726,7 @@ export const _knownOpenAIChatModels: ManualMappings = [
   {
     idPrefix: 'o4-mini-deep-research-2025-06-26',
     label: 'o4 Mini Deep Research [Deprecated]',
+    pubDate: '20250626',
     isLegacy: true,
     description: 'Faster, more affordable deep research model for complex, multi-step research tasks. [Shutdown: 2026-07-23 - migrate to GPT-5.5 with web search.]',
     contextWindow: 200000,
@@ -718,6 +745,7 @@ export const _knownOpenAIChatModels: ManualMappings = [
   {
     idPrefix: 'o4-mini-2025-04-16',
     label: 'o4 Mini [Deprecated]',
+    pubDate: '20250416',
     isLegacy: true,
     description: 'Latest o4-mini model. Optimized for fast, effective reasoning with exceptionally efficient performance in coding and visual tasks. [Shutdown: 2026-10-23 - migrate to GPT-5.4 Mini.]',
     contextWindow: 200000,
@@ -737,6 +765,7 @@ export const _knownOpenAIChatModels: ManualMappings = [
   {
     idPrefix: 'o3-deep-research-2025-06-26',
     label: 'o3 Deep Research [Deprecated]',
+    pubDate: '20250626',
     isLegacy: true,
     description: 'Our most powerful deep research model for complex, multi-step research tasks. [Shutdown: 2026-07-23 - migrate to GPT-5.5 Pro with web search.]',
     contextWindow: 200000,
@@ -755,6 +784,7 @@ export const _knownOpenAIChatModels: ManualMappings = [
   {
     idPrefix: 'o3-pro-2025-06-10',
     label: 'o3 Pro (2025-06-10)',
+    pubDate: '20250610',
     description: 'Version of o3 with more compute for better responses. Provides consistently better answers for complex tasks.',
     contextWindow: 200000,
     maxCompletionTokens: 100000,
@@ -773,6 +803,7 @@ export const _knownOpenAIChatModels: ManualMappings = [
   {
     idPrefix: 'o3-2025-04-16',
     label: 'o3 (2025-04-16)',
+    pubDate: '20250416',
     description: 'A well-rounded and powerful model across domains. Sets a new standard for math, science, coding, and visual reasoning tasks.',
     contextWindow: 200000,
     maxCompletionTokens: 100000,
@@ -791,6 +822,7 @@ export const _knownOpenAIChatModels: ManualMappings = [
   {
     idPrefix: 'o3-mini-2025-01-31',
     label: 'o3 Mini [Deprecated]',
+    pubDate: '20250131',
     isLegacy: true,
     description: 'Latest o3-mini model snapshot. High intelligence at the same cost and latency targets of o1-mini. Excels at science, math, and coding tasks. [Shutdown: 2026-10-23 - migrate to GPT-5.4 Mini.]',
     contextWindow: 200000,
@@ -811,6 +843,7 @@ export const _knownOpenAIChatModels: ManualMappings = [
     hidden: true,
     idPrefix: 'o1-pro-2025-03-19',
     label: 'o1 Pro (2025-03-19)',
+    pubDate: '20250319',
     description: 'A version of o1 with more compute for better responses. Provides consistently better answers for complex tasks.',
     contextWindow: 200000,
     maxCompletionTokens: 100000,
@@ -829,6 +862,7 @@ export const _knownOpenAIChatModels: ManualMappings = [
   {
     idPrefix: 'o1-2024-12-17',
     label: 'o1 [Deprecated]',
+    pubDate: '20241217',
     isLegacy: true,
     description: 'Previous full o-series reasoning model. [Shutdown: 2026-10-23 - migrate to GPT-5.5 or o3.]',
     contextWindow: 200000,
@@ -851,6 +885,7 @@ export const _knownOpenAIChatModels: ManualMappings = [
   {
     idPrefix: 'gpt-4.1-2025-04-14',
     label: 'GPT-4.1 (2025-04-14)',
+    pubDate: '20250414',
     description: 'Flagship GPT model for complex tasks. Major improvements on coding, instruction following, and long context with 1M token context window.',
     contextWindow: 1047576,
     maxCompletionTokens: 32768,
@@ -868,6 +903,7 @@ export const _knownOpenAIChatModels: ManualMappings = [
   {
     idPrefix: 'gpt-4.1-mini-2025-04-14',
     label: 'GPT-4.1 Mini (2025-04-14)',
+    pubDate: '20250414',
     description: 'Balanced for intelligence, speed, and cost. Matches or exceeds GPT-4o in intelligence while reducing latency by nearly half and cost by 83%.',
     contextWindow: 1047576,
     maxCompletionTokens: 32768,
@@ -885,6 +921,7 @@ export const _knownOpenAIChatModels: ManualMappings = [
   {
     idPrefix: 'gpt-4.1-nano-2025-04-14',
     label: 'GPT-4.1 Nano [Deprecated]',
+    pubDate: '20250414',
     isLegacy: true,
     description: 'Fastest, most cost-effective GPT 4.1 model. Delivers exceptional performance with low latency, ideal for tasks like classification or autocompletion. [Shutdown: 2026-10-23 - migrate to GPT-5.4 Nano.]',
     contextWindow: 1047576,
@@ -906,6 +943,7 @@ export const _knownOpenAIChatModels: ManualMappings = [
   {
     idPrefix: 'gpt-audio-1.5',
     label: 'GPT Audio 1.5',
+    pubDate: '20260224',
     description: 'Best voice model for audio in, audio out with Chat Completions. Accepts audio inputs and outputs.',
     contextWindow: 128000,
     maxCompletionTokens: 16384,
@@ -919,6 +957,7 @@ export const _knownOpenAIChatModels: ManualMappings = [
     hidden: true, // superseded by GPT Audio 1.5
     idPrefix: 'gpt-audio-2025-08-28',
     label: 'GPT Audio (2025-08-28)',
+    pubDate: '20250828',
     description: 'First generally available audio model. Accepts audio inputs and outputs, and can be used in the Chat Completions REST API.',
     contextWindow: 128000,
     maxCompletionTokens: 16384,
@@ -935,6 +974,7 @@ export const _knownOpenAIChatModels: ManualMappings = [
   {
     idPrefix: 'gpt-audio-mini-2025-12-15',
     label: 'GPT Audio Mini (2025-12-15)',
+    pubDate: '20251215',
     description: 'Cost-efficient audio model. Accepts audio inputs and outputs via Chat Completions REST API.',
     contextWindow: 128000,
     maxCompletionTokens: 16384,
@@ -944,6 +984,7 @@ export const _knownOpenAIChatModels: ManualMappings = [
   {
     idPrefix: 'gpt-audio-mini-2025-10-06',
     label: 'GPT Audio Mini (2025-10-06)',
+    pubDate: '20251006',
     hidden: true, // previous version
     description: 'Cost-efficient audio model. Accepts audio inputs and outputs via Chat Completions REST API.',
     contextWindow: 128000,
@@ -966,6 +1007,7 @@ export const _knownOpenAIChatModels: ManualMappings = [
   {
     idPrefix: 'gpt-4o-2024-11-20',
     label: 'GPT-4o (2024-11-20)',
+    pubDate: '20241120',
     description: 'Snapshot of gpt-4o from November 20th, 2024.',
     contextWindow: 128000,
     maxCompletionTokens: 16384,
@@ -976,6 +1018,7 @@ export const _knownOpenAIChatModels: ManualMappings = [
   {
     idPrefix: 'gpt-4o-2024-08-06',
     label: 'GPT-4o (2024-08-06)',
+    pubDate: '20240806',
     hidden: true, // previous version
     description: 'Snapshot that supports Structured Outputs. gpt-4o currently points to this version.',
     contextWindow: 128000,
@@ -987,6 +1030,7 @@ export const _knownOpenAIChatModels: ManualMappings = [
   {
     idPrefix: 'gpt-4o-2024-05-13',
     label: 'GPT-4o (2024-05-13)',
+    pubDate: '20240513',
     hidden: true, // previous version
     description: 'Original gpt-4o snapshot from May 13, 2024.',
     contextWindow: 128000,
@@ -1007,6 +1051,7 @@ export const _knownOpenAIChatModels: ManualMappings = [
     hidden: true, // old
     idPrefix: 'gpt-4o-search-preview-2025-03-11',
     label: 'GPT-4o Search Preview (2025-03-11)',
+    pubDate: '20250311',
     description: 'Latest snapshot of the GPT-4o model optimized for web search capabilities.',
     contextWindow: 128000,
     maxCompletionTokens: 16384,
@@ -1027,6 +1072,7 @@ export const _knownOpenAIChatModels: ManualMappings = [
     hidden: true, // old
     idPrefix: 'gpt-4o-audio-preview-2025-06-03',
     label: 'GPT-4o Audio Preview (2025-06-03)',
+    pubDate: '20250603',
     description: 'Latest snapshot for the Audio API model.',
     contextWindow: 128000,
     maxCompletionTokens: 16384,
@@ -1039,6 +1085,7 @@ export const _knownOpenAIChatModels: ManualMappings = [
     hidden: true, // old
     idPrefix: 'gpt-4o-audio-preview-2024-12-17',
     label: 'GPT-4o Audio Preview (2024-12-17)',
+    pubDate: '20241217',
     description: 'Snapshot for the Audio API model.',
     contextWindow: 128000,
     maxCompletionTokens: 16384,
@@ -1057,6 +1104,7 @@ export const _knownOpenAIChatModels: ManualMappings = [
   {
     idPrefix: 'gpt-4o-mini-2024-07-18',
     label: 'GPT-4o Mini (2024-07-18)',
+    pubDate: '20240718',
     description: 'Affordable model for fast, lightweight tasks. GPT-4o Mini is cheaper and more capable than GPT-3.5 Turbo.',
     contextWindow: 128000,
     maxCompletionTokens: 16384,
@@ -1073,6 +1121,7 @@ export const _knownOpenAIChatModels: ManualMappings = [
     hidden: true, // UNSUPPORTED yet (audio output model)
     idPrefix: 'gpt-4o-mini-audio-preview-2024-12-17',
     label: 'GPT-4o Mini Audio Preview (2024-12-17)',
+    pubDate: '20241217',
     description: 'Snapshot for the Audio API model.',
     contextWindow: 128000,
     maxCompletionTokens: 16384,
@@ -1091,6 +1140,7 @@ export const _knownOpenAIChatModels: ManualMappings = [
     hidden: true, // old
     idPrefix: 'gpt-4o-mini-search-preview-2025-03-11',
     label: 'GPT-4o Mini Search Preview (2025-03-11)',
+    pubDate: '20250311',
     description: 'Latest snapshot of the GPT-4o Mini model optimized for web search capabilities.',
     contextWindow: 128000,
     maxCompletionTokens: 16384,
@@ -1110,6 +1160,7 @@ export const _knownOpenAIChatModels: ManualMappings = [
   {
     idPrefix: 'gpt-4-turbo-2024-04-09',
     label: 'GPT-4 Turbo (2024-04-09)',
+    pubDate: '20240409',
     hidden: true, // OLD
     description: 'GPT-4 Turbo with Vision model. Vision requests can now use JSON mode and function calling. gpt-4-turbo currently points to this version.',
     contextWindow: 128000,
@@ -1126,6 +1177,7 @@ export const _knownOpenAIChatModels: ManualMappings = [
   {
     idPrefix: 'gpt-4-0125-preview',
     label: 'GPT-4 Turbo (0125)',
+    pubDate: '20240125',
     hidden: true, // OLD
     description: 'GPT-4 Turbo preview model intended to reduce cases of "laziness" where the model doesn\'t complete a task.',
|
||||
contextWindow: 128000,
|
||||
@@ -1137,6 +1189,7 @@ export const _knownOpenAIChatModels: ManualMappings = [
|
||||
{
|
||||
idPrefix: 'gpt-4-1106-preview', // GPT-4 Turbo preview model
|
||||
label: 'GPT-4 Turbo (1106)',
|
||||
pubDate: '20231106',
|
||||
hidden: true, // OLD
|
||||
description: 'GPT-4 Turbo preview model featuring improved instruction following, JSON mode, reproducible outputs, parallel function calling, and more.',
|
||||
contextWindow: 128000,
|
||||
@@ -1156,6 +1209,7 @@ export const _knownOpenAIChatModels: ManualMappings = [
|
||||
{
|
||||
idPrefix: 'gpt-4-0613',
|
||||
label: 'GPT-4 (0613)',
|
||||
pubDate: '20230613',
|
||||
hidden: true, // OLD
|
||||
description: 'Snapshot of gpt-4 from June 13th 2023 with improved function calling support. Data up to Sep 2021.',
|
||||
contextWindow: 8192,
|
||||
@@ -1167,6 +1221,7 @@ export const _knownOpenAIChatModels: ManualMappings = [
|
||||
{
|
||||
idPrefix: 'gpt-4-0314',
|
||||
label: 'GPT-4 (0314)',
|
||||
pubDate: '20230314',
|
||||
hidden: true, // OLD
|
||||
description: 'Snapshot of gpt-4 from March 14th 2023 with function calling data. Data up to Sep 2021.',
|
||||
contextWindow: 8192,
|
||||
@@ -1189,6 +1244,7 @@ export const _knownOpenAIChatModels: ManualMappings = [
|
||||
{
|
||||
idPrefix: 'gpt-3.5-turbo-0125',
|
||||
label: '3.5-Turbo (2024-01-25)',
|
||||
pubDate: '20240125',
|
||||
hidden: true, // OLD
|
||||
description: 'The latest GPT-3.5 Turbo model with higher accuracy at responding in requested formats and a fix for a bug which caused a text encoding issue for non-English language function calls.',
|
||||
contextWindow: 16385,
|
||||
@@ -1200,6 +1256,7 @@ export const _knownOpenAIChatModels: ManualMappings = [
|
||||
{
|
||||
idPrefix: 'gpt-3.5-turbo-1106',
|
||||
label: '3.5-Turbo (1106)',
|
||||
pubDate: '20231106',
|
||||
hidden: true, // OLD
|
||||
description: 'GPT-3.5 Turbo model with improved instruction following, JSON mode, reproducible outputs, parallel function calling, and more.',
|
||||
contextWindow: 16385,
|
||||
@@ -1559,5 +1616,5 @@ export function llmOrtOaiLookup(orModelName: string): OrtVendorLookupResult | un
|
||||
|
||||
// initialTemperature: not set - OpenAI models use the global fallback (0.5);
|
||||
// NoTemperature models are handled client-side via LLM_IF_HOTFIX_NoTemperature (not propagated to OR)
|
||||
return { interfaces, parameterSpecs };
|
||||
return { interfaces, parameterSpecs, pubDate: entry.pubDate };
|
||||
}
|
||||
|
||||
@@ -12,6 +12,7 @@ const _knownOpenPipeChatModels: ModelDescriptionSchema[] = [
   {
     id: 'gpt-4.1-2025-04-14',
     label: '💾➜ GPT-4.1 (2025-04-14)',
+    pubDate: '20250414',
     description: 'Flagship GPT model for complex tasks. Major improvements on coding, instruction following, and long context with 1M token context window.',
    contextWindow: 1047576,
     maxCompletionTokens: 32768,
@@ -22,6 +23,7 @@ const _knownOpenPipeChatModels: ModelDescriptionSchema[] = [
   {
     id: 'gpt-4.1-mini-2025-04-14',
     label: '💾➜ GPT-4.1 Mini (2025-04-14)',
+    pubDate: '20250414',
     description: 'Balanced for intelligence, speed, and cost. Matches or exceeds GPT-4o in intelligence while reducing latency and cost.',
     contextWindow: 1047576,
     maxCompletionTokens: 32768,
@@ -32,6 +34,7 @@ const _knownOpenPipeChatModels: ModelDescriptionSchema[] = [
   {
     id: 'gpt-4o-mini-2024-07-18',
     label: '💾➜ GPT-4o Mini (2024-07-18)',
+    pubDate: '20240718',
     description: 'Affordable model for fast, lightweight tasks. GPT-4o mini is cheaper and more capable than GPT-3.5 Turbo.',
     contextWindow: 128000,
     maxCompletionTokens: 16384,
@@ -41,6 +44,7 @@ const _knownOpenPipeChatModels: ModelDescriptionSchema[] = [
   {
     id: 'gpt-4o-2024-08-06',
     label: '💾➜ GPT-4o (2024-08-06)',
+    pubDate: '20240806',
     description: 'Advanced, multimodal flagship model that\'s cheaper and faster than GPT-4 Turbo.',
     contextWindow: 128000,
     maxCompletionTokens: 16384,
@@ -51,6 +55,7 @@ const _knownOpenPipeChatModels: ModelDescriptionSchema[] = [
   {
     id: 'gpt-3.5-turbo-0125',
     label: '💾➜ GPT-3.5 Turbo (0125)',
+    pubDate: '20240125',
     description: 'The latest GPT-3.5 Turbo model with higher accuracy at responding in requested formats',
     contextWindow: 16385,
     maxCompletionTokens: 4096,
@@ -63,6 +68,7 @@ const _knownOpenPipeChatModels: ModelDescriptionSchema[] = [
   {
     id: 'gemini-1.0-pro-001',
     label: '💾➜ Gemini 1.0 Pro',
+    pubDate: '20240215',
     description: 'Google\'s Gemini 1.0 Pro model',
     contextWindow: 32768,
     interfaces: [LLM_IF_OAI_Chat, LLM_IF_OAI_Fn],
@@ -70,6 +76,7 @@ const _knownOpenPipeChatModels: ModelDescriptionSchema[] = [
   {
     id: 'gemini-1.5-flash-001',
     label: '💾➜ Gemini 1.5 Flash',
+    pubDate: '20240514',
     description: 'Google\'s Gemini 1.5 Flash model - fast and efficient',
     contextWindow: 1000000,
     interfaces: [LLM_IF_OAI_Chat, LLM_IF_OAI_Vision, LLM_IF_OAI_Fn],
@@ -79,6 +86,7 @@ const _knownOpenPipeChatModels: ModelDescriptionSchema[] = [
   {
     id: 'meta-llama/Meta-Llama-3.1-8B-Instruct',
     label: '💾 Llama 3.1 · 8B Instruct',
+    pubDate: '20240723',
     description: 'Meta Llama 3.1 8B Instruct - hosted inference with per-token pricing',
     contextWindow: 128000,
     interfaces: [LLM_IF_OAI_Chat, LLM_IF_OAI_Fn, LLM_IF_OAI_Json],
@@ -87,6 +95,7 @@ const _knownOpenPipeChatModels: ModelDescriptionSchema[] = [
   {
     id: 'meta-llama/Meta-Llama-3.1-70B-Instruct',
     label: '💾 Llama 3.1 · 70B Instruct',
+    pubDate: '20240723',
     description: 'Meta Llama 3.1 70B Instruct - hosted inference with per-token pricing',
     contextWindow: 128000,
     interfaces: [LLM_IF_OAI_Chat, LLM_IF_OAI_Fn, LLM_IF_OAI_Json],
@@ -95,6 +104,7 @@ const _knownOpenPipeChatModels: ModelDescriptionSchema[] = [
   {
     id: 'meta-llama/Llama-3.1-8B',
     label: '💾 Llama 3.1 · 8B Base',
+    pubDate: '20240723',
     description: 'Meta Llama 3.1 8B base model for fine-tuning',
     contextWindow: 128000,
     interfaces: [LLM_IF_OAI_Chat],
@@ -102,6 +112,7 @@ const _knownOpenPipeChatModels: ModelDescriptionSchema[] = [
   {
     id: 'meta-llama/Llama-3.1-70B',
     label: '💾 Llama 3.1 · 70B Base',
+    pubDate: '20240723',
     description: 'Meta Llama 3.1 70B base model for fine-tuning',
     contextWindow: 128000,
     interfaces: [LLM_IF_OAI_Chat],
@@ -111,6 +122,7 @@ const _knownOpenPipeChatModels: ModelDescriptionSchema[] = [
   {
     id: 'meta-llama/Llama-3.2-1B-Instruct',
     label: '💾 Llama 3.2 · 1B Instruct',
+    pubDate: '20240925',
     description: 'Meta Llama 3.2 1B Instruct - lightweight model for edge and mobile deployment',
     contextWindow: 128000,
     interfaces: [LLM_IF_OAI_Chat, LLM_IF_OAI_Fn, LLM_IF_OAI_Json],
@@ -118,6 +130,7 @@ const _knownOpenPipeChatModels: ModelDescriptionSchema[] = [
   {
     id: 'meta-llama/Llama-3.2-3B-Instruct',
     label: '💾 Llama 3.2 · 3B Instruct',
+    pubDate: '20240925',
     description: 'Meta Llama 3.2 3B Instruct - efficient model for edge deployment',
     contextWindow: 128000,
     interfaces: [LLM_IF_OAI_Chat, LLM_IF_OAI_Fn, LLM_IF_OAI_Json],
@@ -127,6 +140,7 @@ const _knownOpenPipeChatModels: ModelDescriptionSchema[] = [
   {
     id: 'meta-llama/Llama-3.3-70B-Instruct',
     label: '💾 Llama 3.3 · 70B Instruct',
+    pubDate: '20241206',
     description: 'Meta Llama 3.3 70B Instruct - latest 70B model with performance comparable to Llama 3.1 405B',
     contextWindow: 128000,
     interfaces: [LLM_IF_OAI_Chat, LLM_IF_OAI_Fn, LLM_IF_OAI_Json],
@@ -136,6 +150,7 @@ const _knownOpenPipeChatModels: ModelDescriptionSchema[] = [
   {
     id: 'Qwen/Qwen2-VL-7B-Instruct',
     label: '💾 Qwen 2 · VL 7B Instruct',
+    pubDate: '20240830',
     description: 'Alibaba Qwen 2 Vision-Language 7B Instruct - multimodal model for text and image understanding',
     contextWindow: 32768,
     interfaces: [LLM_IF_OAI_Chat, LLM_IF_OAI_Vision],
@@ -145,6 +160,7 @@ const _knownOpenPipeChatModels: ModelDescriptionSchema[] = [
   {
     id: 'Qwen/Qwen2.5-1.5B-Instruct',
     label: '💾 Qwen 2.5 · 1.5B Instruct',
+    pubDate: '20240919',
     description: 'Alibaba Qwen 2.5 1.5B Instruct - efficient small model',
     contextWindow: 131072,
     interfaces: [LLM_IF_OAI_Chat, LLM_IF_OAI_Fn, LLM_IF_OAI_Json],
@@ -152,6 +168,7 @@ const _knownOpenPipeChatModels: ModelDescriptionSchema[] = [
   {
     id: 'Qwen/Qwen2.5-7B-Instruct',
     label: '💾 Qwen 2.5 · 7B Instruct',
+    pubDate: '20240919',
     description: 'Alibaba Qwen 2.5 7B Instruct - balanced performance and efficiency',
     contextWindow: 131072,
     interfaces: [LLM_IF_OAI_Chat, LLM_IF_OAI_Fn, LLM_IF_OAI_Json],
@@ -159,6 +176,7 @@ const _knownOpenPipeChatModels: ModelDescriptionSchema[] = [
   {
     id: 'Qwen/Qwen2.5-14B-Instruct',
     label: '💾 Qwen 2.5 · 14B Instruct',
+    pubDate: '20240919',
     description: 'Alibaba Qwen 2.5 14B Instruct - hosted inference (hourly compute unit pricing)',
     contextWindow: 131072,
     interfaces: [LLM_IF_OAI_Chat, LLM_IF_OAI_Fn, LLM_IF_OAI_Json],
@@ -166,6 +184,7 @@ const _knownOpenPipeChatModels: ModelDescriptionSchema[] = [
   {
     id: 'Qwen/Qwen2.5-72B-Instruct',
     label: '💾 Qwen 2.5 · 72B Instruct',
+    pubDate: '20240919',
     description: 'Alibaba Qwen 2.5 72B Instruct - flagship model with performance comparable to Llama 3.1 405B',
     contextWindow: 131072,
     interfaces: [LLM_IF_OAI_Chat, LLM_IF_OAI_Fn, LLM_IF_OAI_Json],
@@ -173,6 +192,7 @@ const _knownOpenPipeChatModels: ModelDescriptionSchema[] = [
   {
     id: 'Qwen/Qwen2.5-Coder-7B-Instruct',
     label: '💾 Qwen 2.5 · Coder 7B Instruct',
+    pubDate: '20241112',
     description: 'Alibaba Qwen 2.5 Coder 7B Instruct - specialized for code generation and understanding',
     contextWindow: 131072,
     interfaces: [LLM_IF_OAI_Chat, LLM_IF_OAI_Fn, LLM_IF_OAI_Json],
@@ -180,6 +200,7 @@ const _knownOpenPipeChatModels: ModelDescriptionSchema[] = [
   {
     id: 'Qwen/Qwen2.5-Coder-32B-Instruct',
     label: '💾 Qwen 2.5 · Coder 32B Instruct',
+    pubDate: '20241112',
     description: 'Alibaba Qwen 2.5 Coder 32B Instruct - specialized for code generation and understanding',
     contextWindow: 131072,
     interfaces: [LLM_IF_OAI_Chat, LLM_IF_OAI_Fn, LLM_IF_OAI_Json],
@@ -189,6 +210,7 @@ const _knownOpenPipeChatModels: ModelDescriptionSchema[] = [
   {
     id: 'Qwen/Qwen3-8B',
     label: '💾 Qwen 3 · 8B Base',
+    pubDate: '20250429',
     description: 'Alibaba Qwen 3 8B base model for fine-tuning - supports thinking and non-thinking modes',
     contextWindow: 32768,
     interfaces: [LLM_IF_OAI_Chat],
@@ -196,6 +218,7 @@ const _knownOpenPipeChatModels: ModelDescriptionSchema[] = [
   {
     id: 'Qwen/Qwen3-14B',
     label: '💾 Qwen 3 · 14B Base',
+    pubDate: '20250429',
     description: 'Alibaba Qwen 3 14B base model for fine-tuning - supports thinking and non-thinking modes',
     contextWindow: 32768,
     interfaces: [LLM_IF_OAI_Chat],
@@ -205,6 +228,7 @@ const _knownOpenPipeChatModels: ModelDescriptionSchema[] = [
   {
     id: 'google/gemma-3-1b-it',
     label: '💾 Gemma 3 · 1B IT',
+    pubDate: '20250312',
     description: 'Google Gemma 3 1B instruction-tuned - lightweight text-only model',
     contextWindow: 32768,
     interfaces: [LLM_IF_OAI_Chat],
@@ -212,6 +236,7 @@ const _knownOpenPipeChatModels: ModelDescriptionSchema[] = [
   {
     id: 'google/gemma-3-4b-it',
     label: '💾 Gemma 3 · 4B IT',
+    pubDate: '20250312',
     description: 'Google Gemma 3 4B instruction-tuned - efficient multimodal model with 128K context',
     contextWindow: 128000,
     interfaces: [LLM_IF_OAI_Chat, LLM_IF_OAI_Vision],
@@ -219,6 +244,7 @@ const _knownOpenPipeChatModels: ModelDescriptionSchema[] = [
   {
     id: 'google/gemma-3-12b-it',
     label: '💾 Gemma 3 · 12B IT',
+    pubDate: '20250312',
     description: 'Google Gemma 3 12B instruction-tuned - balanced multimodal model with 128K context',
     contextWindow: 128000,
     interfaces: [LLM_IF_OAI_Chat, LLM_IF_OAI_Vision],
@@ -226,6 +252,7 @@ const _knownOpenPipeChatModels: ModelDescriptionSchema[] = [
   {
     id: 'google/gemma-3-27b-it',
     label: '💾 Gemma 3 · 27B IT',
+    pubDate: '20250312',
     description: 'Google Gemma 3 27B instruction-tuned - largest Gemma 3 multimodal model with 128K context',
     contextWindow: 128000,
     interfaces: [LLM_IF_OAI_Chat, LLM_IF_OAI_Vision],
@@ -235,6 +262,7 @@ const _knownOpenPipeChatModels: ModelDescriptionSchema[] = [
   {
     id: 'mistralai/Mistral-Nemo-Base-2407',
     label: '💾 Mistral Nemo · Base',
+    pubDate: '20240718',
     description: 'Mistral Nemo 12B base model (July 2024) for fine-tuning',
     contextWindow: 128000,
     interfaces: [LLM_IF_OAI_Chat],
@@ -242,6 +270,7 @@ const _knownOpenPipeChatModels: ModelDescriptionSchema[] = [
   {
     id: 'mistralai/Mistral-Small-24B-Base-2501',
     label: '💾 Mistral Small · 24B Base',
+    pubDate: '20250130',
     description: 'Mistral Small 24B base model (Jan 2025) - competitive with larger models while faster',
     contextWindow: 32768,
     interfaces: [LLM_IF_OAI_Chat],

@@ -162,8 +162,11 @@ export function openRouterModelToModelDescription(wireModel: object): ModelDescr
   // -- Vendor parameter & interface inheritance --
   const llmRef = model.id.replace(/^[^/]+\//, '');
   let initialTemperature: number | undefined;
+  let pubDate: string | undefined;
 
   const _mergeLookup = (lookup: OrtVendorLookupResult | undefined) => {
+    if (lookup?.pubDate !== undefined)
+      pubDate = lookup.pubDate;
     if (lookup?.interfaces)
       for (const iface of lookup.interfaces)
        if (!interfaces.includes(iface))
@@ -270,6 +273,7 @@ export function openRouterModelToModelDescription(wireModel: object): ModelDescr
     idPrefix: model.id,
     // latest: ...
     label,
+    ...(pubDate !== undefined && { pubDate }),
     description: model.description?.length > 280 ? model.description.slice(0, 277) + '...' : model.description,
     contextWindow,
     maxCompletionTokens,

@@ -39,6 +39,7 @@ const _knownPerplexityChatModels: ModelDescriptionSchema[] = [
   {
     id: 'sonar-deep-research',
     label: 'Sonar Deep Research',
+    pubDate: '20250214',
     description: 'Expert-level research model for exhaustive searches and comprehensive reports. 128k context.',
     contextWindow: 128000,
     interfaces: [LLM_IF_OAI_Chat, LLM_IF_OAI_Reasoning],
@@ -59,6 +60,7 @@ const _knownPerplexityChatModels: ModelDescriptionSchema[] = [
   {
     id: 'sonar-reasoning-pro',
     label: 'Sonar Reasoning Pro',
+    pubDate: '20250218',
     description: 'Premier reasoning model (DeepSeek R1) with Chain of Thought. 128k context.',
     contextWindow: 128000,
     interfaces: [LLM_IF_OAI_Chat, LLM_IF_OAI_Reasoning],
@@ -78,6 +80,7 @@ const _knownPerplexityChatModels: ModelDescriptionSchema[] = [
   {
     id: 'sonar-pro',
     label: 'Sonar Pro',
+    pubDate: '20250121',
     description: 'Advanced search model for complex queries and deep content understanding. 200k context.',
     contextWindow: 200000,
     maxCompletionTokens: 8000,
@@ -96,6 +99,7 @@ const _knownPerplexityChatModels: ModelDescriptionSchema[] = [
   {
     id: 'sonar',
     label: 'Sonar',
+    pubDate: '20250121',
     description: 'Lightweight, cost-effective search model for quick, grounded answers. 128k context.',
     contextWindow: 128000,
     interfaces: [LLM_IF_OAI_Chat],

@@ -93,6 +93,7 @@ const _knownXAIChatModels: ManualMappings = [
   {
     idPrefix: 'grok-4.3',
     label: 'Grok 4.3',
+    pubDate: '20260417',
     description: 'xAI\'s latest flagship model with always-on reasoning and a 1M token context window. Supports text, image, and video inputs with improved agentic performance at lower cost.',
     contextWindow: 1000000,
     maxCompletionTokens: undefined,
@@ -107,6 +108,7 @@ const _knownXAIChatModels: ManualMappings = [
     hidden: true, // yield to 4.3
     idPrefix: 'grok-4.20-0309-reasoning',
     label: 'Grok 4.20 Reasoning',
+    pubDate: '20260309',
     description: 'xAI\'s previous flagship reasoning model with a 2M token context window. Deep reasoning and problem-solving capabilities with text and image inputs.',
     contextWindow: 2000000,
     maxCompletionTokens: undefined,
@@ -119,6 +121,7 @@ const _knownXAIChatModels: ManualMappings = [
     hidden: true, // yield to 4.3
     idPrefix: 'grok-4.20-0309-non-reasoning',
     label: 'Grok 4.20',
+    pubDate: '20260309',
     description: 'xAI\'s previous flagship model with a 2M token context window. Non-reasoning variant for fast, high-quality responses with text and image inputs.',
     contextWindow: 2000000,
     maxCompletionTokens: undefined,
@@ -130,6 +133,7 @@ const _knownXAIChatModels: ManualMappings = [
   {
     idPrefix: 'grok-4.20-multi-agent-0309',
     label: 'Grok 4.20 Multi-Agent',
+    pubDate: '20260309',
     description: 'Multi-agent reasoning model that runs 4 specialized agents in parallel (coordinator, fact-checker, analyst, challenger) for collaborative verification with reduced hallucination.',
     contextWindow: 2000000,
     maxCompletionTokens: undefined,
@@ -147,6 +151,7 @@ const _knownXAIChatModels: ManualMappings = [
   {
     idPrefix: 'grok-4-1-fast-reasoning',
     label: 'Grok 4.1 Fast Reasoning',
+    pubDate: '20251119',
     description: 'Next generation frontier multimodal model optimized for high-performance agentic tool calling with a 2M token context window. Trained specifically for real-world enterprise use cases with exceptional performance on agentic workflows.',
     contextWindow: 2000000,
     maxCompletionTokens: undefined,
@@ -158,6 +163,7 @@ const _knownXAIChatModels: ManualMappings = [
   {
     idPrefix: 'grok-4-1-fast-non-reasoning',
     label: 'Grok 4.1 Fast', // 'Grok 4.1 Fast Non-Reasoning'
+    pubDate: '20251119',
     description: 'Next generation frontier multimodal model optimized for high-performance agentic tool calling with a 2M token context window. Non-reasoning variant for instant responses.',
     contextWindow: 2000000,
     maxCompletionTokens: undefined,
@@ -172,6 +178,7 @@ const _knownXAIChatModels: ManualMappings = [
     hidden: true, // yield to 4.1
     idPrefix: 'grok-4-fast-reasoning',
     label: 'Grok 4 Fast Reasoning',
+    pubDate: '20250919',
     description: 'Cost-efficient reasoning model with a 2M token context window. Optimized for fast reasoning in agentic workflows. 98% cost reduction vs Grok 4 with comparable performance.',
     contextWindow: 2000000,
     maxCompletionTokens: undefined,
@@ -184,6 +191,7 @@ const _knownXAIChatModels: ManualMappings = [
     hidden: true, // yield to 4.1
     idPrefix: 'grok-4-fast-non-reasoning',
     label: 'Grok 4 Fast', // 'Grok 4 Fast Non-Reasoning'
+    pubDate: '20250919',
     description: 'Cost-efficient non-reasoning model with a 2M token context window. Same weights as grok-4-fast-reasoning but constrained by non-reasoning system prompt for quick responses.',
     contextWindow: 2000000,
     maxCompletionTokens: undefined,
@@ -196,6 +204,7 @@ const _knownXAIChatModels: ManualMappings = [
     hidden: true, // yield to 4.20
     idPrefix: 'grok-4-0709',
     label: 'Grok 4 (0709)',
+    pubDate: '20250709',
     description: 'xAI\'s most advanced model, offering state-of-the-art reasoning and problem-solving capabilities over a massive 256k context window. Supports text and image inputs.',
     contextWindow: 256000,
     maxCompletionTokens: undefined,
@@ -209,6 +218,7 @@ const _knownXAIChatModels: ManualMappings = [
   {
     idPrefix: 'grok-3',
     label: 'Grok 3',
+    pubDate: '20250217',
     description: 'xAI flagship model that excels at enterprise use cases like data extraction, coding, and text summarization. Possesses deep domain knowledge in finance, healthcare, law, and science.',
     contextWindow: 131072,
     maxCompletionTokens: undefined,
@@ -220,6 +230,7 @@ const _knownXAIChatModels: ManualMappings = [
   {
     idPrefix: 'grok-3-mini',
     label: 'Grok 3 Mini',
+    pubDate: '20250217',
     description: 'A lightweight model that is fast and smart for logic-based tasks. Supports function calling and structured outputs.',
     contextWindow: 131072,
     maxCompletionTokens: undefined,
@@ -236,6 +247,7 @@ const _knownXAIChatModels: ManualMappings = [
   {
     idPrefix: 'grok-code-fast-1',
     label: 'Grok Code Fast 1',
+    pubDate: '20250828',
     description: 'Specialized reasoning model for agentic coding workflows. Fast, economical, and optimized for code generation, debugging, and software development tasks.',
     contextWindow: 256000,
     maxCompletionTokens: undefined,
@@ -249,6 +261,7 @@ const _knownXAIChatModels: ManualMappings = [
   {
     idPrefix: 'grok-2-vision-1212',
     label: 'Grok 2 Vision (1212)',
+    pubDate: '20241212',
     description: 'xAI model grok-2-vision-1212 with image and text input capabilities. Supports text generation with a 32,768 token context window.',
     contextWindow: 32768,
     maxCompletionTokens: undefined,

@@ -32,6 +32,7 @@ const _knownZAIModels: ManualMappings = [
   {
     idPrefix: 'glm-5',
     label: 'GLM-5',
+    pubDate: '20260211',
     description: 'Z.ai flagship foundation model (744B MoE, 40B activated). Designed for Agentic Engineering with SOTA coding and agent capabilities. 200K context, thinking mode.',
     contextWindow: 204800, // 200K
     interfaces: _IF_Reasoning,
@@ -43,6 +44,7 @@ const _knownZAIModels: ManualMappings = [
   {
     idPrefix: 'glm-5-code',
     label: 'GLM-5 Code',
+    // pubDate: UNCONFIRMED - 'glm-5-code' not in Z.ai pricing table or release-notes; Z.ai's coding plan documents GLM-5.1 / GLM-5-Turbo / GLM-4.7 / GLM-4.5-Air, no 'glm-5-code'
     description: 'GLM-5 optimized for coding tasks. Uses the dedicated Coding endpoint. 200K context, thinking mode.',
     contextWindow: 204800, // 200K
     interfaces: _IF_Reasoning,
@@ -58,6 +60,7 @@ const _knownZAIModels: ManualMappings = [
   {
     idPrefix: 'glm-4.7',
     label: 'GLM-4.7',
+    pubDate: '20251222',
     description: 'Latest-gen GLM model with 128K context. Thinking mode activated by default.',
     contextWindow: 131072, // 128K
     interfaces: _IF_Reasoning,
@@ -69,6 +72,7 @@ const _knownZAIModels: ManualMappings = [
   {
     idPrefix: 'glm-4.7-flashx',
     label: 'GLM-4.7 FlashX', // fast, low cost
+    pubDate: '20260119',
     description: 'Fast GLM-4.7 variant with priority routing and higher concurrency. Same model as Flash, better infrastructure.',
     contextWindow: 131072,
     interfaces: _IF_Reasoning,
@@ -80,6 +84,7 @@ const _knownZAIModels: ManualMappings = [
   {
     idPrefix: 'glm-4.7-flash',
     label: 'GLM-4.7 Flash (Free)',
+    pubDate: '20260119',
     description: 'Free GLM-4.7 variant. Same model as FlashX but with limited concurrency (1 concurrent request) and lower priority.',
     contextWindow: 131072,
     interfaces: _IF_Reasoning,
@@ -94,6 +99,7 @@ const _knownZAIModels: ManualMappings = [
   {
     idPrefix: 'glm-4.6v-flashx',
     label: 'GLM-4.6 V FlashX',
+    pubDate: '20251208',
     description: 'Fast vision GLM-4.6 with priority routing and higher concurrency. Image/video/file inputs, 32K output.',
     contextWindow: 131072,
     interfaces: _IF_Vision_Reasoning,
@@ -106,6 +112,7 @@ const _knownZAIModels: ManualMappings = [
   {
     idPrefix: 'glm-4.6v-flash',
     label: 'GLM-4.6 V Flash (Free)',
+    pubDate: '20251208',
     description: 'Free vision GLM-4.6. Same model as FlashX but with limited concurrency (1 concurrent request). Image/video/file inputs, 32K output.',
     contextWindow: 131072,
     interfaces: _IF_Vision_Reasoning,
@@ -117,6 +124,7 @@ const _knownZAIModels: ManualMappings = [
   {
     idPrefix: 'glm-4.6v',
     label: 'GLM-4.6 V',
+    pubDate: '20251208',
     description: 'Vision-enabled GLM-4.6 model. Supports image/video/file inputs, 32K output, hybrid thinking.',
     contextWindow: 131072,
     interfaces: _IF_Vision_Reasoning,
@@ -131,6 +139,7 @@ const _knownZAIModels: ManualMappings = [
   {
     idPrefix: 'glm-4.6',
     label: 'GLM-4.6',
+    pubDate: '20250930',
     description: 'GLM-4.6 model with 128K context/output. Hybrid thinking: auto-determines whether to engage deep reasoning.',
     contextWindow: 131072,
     interfaces: _IF_Reasoning,
@@ -144,6 +153,7 @@ const _knownZAIModels: ManualMappings = [
   {
     idPrefix: 'glm-ocr',
     label: 'GLM-OCR (Vision, OCR)',
+    pubDate: '20260203',
     description: 'Specialized OCR model for text extraction from images and documents.',
     contextWindow: 131072,
     interfaces: [LLM_IF_OAI_Chat, LLM_IF_OAI_Vision, LLM_IF_HOTFIX_NoWebP],
@@ -158,6 +168,7 @@ const _knownZAIModels: ManualMappings = [
   {
     idPrefix: 'glm-4.5v',
     label: 'GLM-4.5 V',
+    pubDate: '20250811',
     description: 'Vision-enabled GLM-4.5 model. 96K context, 16K output, interleaved thinking.',
     contextWindow: 98304, // 96K
     interfaces: _IF_Vision_Reasoning,
@@ -173,6 +184,7 @@ const _knownZAIModels: ManualMappings = [
   {
     idPrefix: 'glm-4.5-flash',
     label: 'GLM-4.5 Flash (Free)',
+    pubDate: '20250728',
     description: 'Free GLM-4.5 variant with limited concurrency. Prior-gen, superseded by GLM-4.7 Flash.',
     contextWindow: 98304,
     interfaces: _IF_Reasoning,
@@ -185,6 +197,7 @@ const _knownZAIModels: ManualMappings = [
   {
     idPrefix: 'glm-4.5-airx',
     label: 'GLM-4.5 AirX',
+    pubDate: '20250728',
     description: 'Extended lightweight GLM-4.5 variant. Interleaved thinking.',
     contextWindow: 98304,
     interfaces: _IF_Reasoning,
@@ -197,6 +210,7 @@ const _knownZAIModels: ManualMappings = [
   {
     idPrefix: 'glm-4.5-air',
     label: 'GLM-4.5 Air',
+    pubDate: '20250728',
     description: 'Lightweight GLM-4.5 variant. Interleaved thinking.',
     contextWindow: 98304,
     interfaces: _IF_Reasoning,
@@ -209,6 +223,7 @@ const _knownZAIModels: ManualMappings = [
   {
     idPrefix: 'glm-4.5-x',
     label: 'GLM-4.5 X',
+    pubDate: '20250728',
     description: 'Extended GLM-4.5 model. Interleaved thinking.',
     contextWindow: 98304,
     interfaces: _IF_Reasoning,
@@ -221,6 +236,7 @@ const _knownZAIModels: ManualMappings = [
   {
     idPrefix: 'glm-4.5',
     label: 'GLM-4.5',
+    pubDate: '20250728',
     description: 'Prior-gen GLM-4.5 model with 96K context/output. Interleaved thinking.',
     contextWindow: 98304,
     interfaces: _IF_Reasoning,
@@ -234,6 +250,7 @@ const _knownZAIModels: ManualMappings = [
   {
     idPrefix: 'glm-4-32b-0414-128k',
     label: 'GLM-4 32B (0414) 128K',
+    pubDate: '20250414',
     description: 'GLM-4 32B model with 128K context, 16K output.',
     contextWindow: 131072,
     interfaces: _IF_Chat,

Regular → Executable (+4 -1)
@@ -6,4 +6,7 @@ cd "$(dirname "$0")/../../.."
 
 # Run with npx tsx (will download on-demand if needed)
 # Uses npx cache, lightweight and no local install required
-exec npx -y tsx tools/data/llms/llm-registry-sync.ts "$@"
+npx -y tsx tools/data/llms/llm-registry-sync.ts "$@"
+
+# Then dump a fresh JSON snapshot next to the DB.
+exec npx -y tsx tools/data/llms/llm-registry-sync.ts --export-db tools/data/llms/llm-registry.json

```diff
@@ -41,6 +41,7 @@ interface CliOptions {
   discordWebhook?: string;
   notifyFilters?: string;
   validate?: boolean;
+  exportDbPath?: string; // --export-db <path>: read-only DB dump (no API calls, no sync)
 }
 
 interface StoredModel {
@@ -53,6 +54,7 @@ interface StoredModel {
   deleted_at: string | null;
   created: number | null;
   updated: number | null;
+  pub_date: string | null;
   context_window: number | null;
   max_completion_tokens: number | null;
   interfaces: string | null;
@@ -90,6 +92,13 @@ function extractSimplePrice(price: any): number | null {
   return null;
 }
 
+/** Idempotent schema migration: adds a column if it doesn't already exist. Safe to call on every run. */
+function ensureColumn(db: DatabaseSync, table: string, column: string, columnDef: string): void {
+  const cols = db.prepare(`PRAGMA table_info(${table})`).all() as Array<{ name: string }>;
+  if (!cols.some((c) => c.name === column))
+    db.exec(`ALTER TABLE ${table} ADD COLUMN ${column} ${columnDef}`);
+}
+
 function initDatabase(): DatabaseSync {
   const db = new DatabaseSync(DB_PATH);
 
@@ -105,6 +114,7 @@ function initDatabase(): DatabaseSync {
       deleted_at TEXT,
       created INTEGER,
       updated INTEGER,
+      pub_date TEXT,
       context_window INTEGER,
       max_completion_tokens INTEGER,
       interfaces TEXT,
@@ -131,6 +141,9 @@ function initDatabase(): DatabaseSync {
     )
   `);
 
+  // Migrations for existing DBs (safe no-ops on fresh DBs that already have the column from CREATE TABLE).
+  ensureColumn(db, 'models', 'pub_date', 'TEXT');
+
   return db;
 }
 
```
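The idempotence that makes `ensureColumn` safe to call on every run can be shown without a real database — `FakeTable` below is a hypothetical stand-in for `PRAGMA table_info` / `ALTER TABLE`, used only to demonstrate the guard-before-mutate pattern:

```typescript
// Dependency-free sketch of the ensureColumn idempotence property: the add is
// guarded by a lookup, so running the migration on every start is safe.
// FakeTable is illustrative; the real code queries SQLite's table_info.
class FakeTable {
  constructor(public columns: string[]) {}
}

function ensureColumn(table: FakeTable, column: string): void {
  if (!table.columns.includes(column))
    table.columns.push(column); // stands in for ALTER TABLE ... ADD COLUMN
}

const models = new FakeTable(['id', 'vendor', 'label']);
ensureColumn(models, 'pub_date');
ensureColumn(models, 'pub_date'); // second run is a no-op
console.log(models.columns.length); // 4
```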
```diff
@@ -157,15 +170,16 @@ function saveChanges(
 ): void {
   if (changes.new.length > 0) {
     const stmt = db.prepare(`
-      INSERT INTO models (id, vendor, service, label, first_seen, last_seen, created, updated,
+      INSERT INTO models (id, vendor, service, label, first_seen, last_seen, created, updated, pub_date,
                           context_window, max_completion_tokens, interfaces, description,
                           benchmark_elo, benchmark_mmlu, price_input, price_output, original_json, deleted_at)
-      VALUES (?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, NULL)
+      VALUES (?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, NULL)
       ON CONFLICT (id, vendor, service) DO UPDATE SET
         label = excluded.label,
         last_seen = excluded.last_seen,
         created = excluded.created,
         updated = excluded.updated,
+        pub_date = excluded.pub_date,
         context_window = excluded.context_window,
         max_completion_tokens = excluded.max_completion_tokens,
         interfaces = excluded.interfaces,
@@ -188,6 +202,7 @@ function saveChanges(
         timestamp,
         model.created ?? null,
         model.updated ?? null,
+        model.pubDate ?? null,
         model.contextWindow ?? null,
         model.maxCompletionTokens ?? null,
         model.interfaces ? JSON.stringify(model.interfaces) : null,
@@ -208,6 +223,7 @@ function saveChanges(
         last_seen = ?,
         created = ?,
         updated = ?,
+        pub_date = ?,
         context_window = ?,
         max_completion_tokens = ?,
         interfaces = ?,
@@ -229,6 +245,7 @@ function saveChanges(
         timestamp,
         model.created ?? null,
         model.updated ?? null,
+        model.pubDate ?? null,
         model.contextWindow ?? null,
         model.maxCompletionTokens ?? null,
         model.interfaces ? JSON.stringify(model.interfaces) : null,
@@ -247,11 +264,13 @@ function saveChanges(
 
   if (changes.unchanged.length > 0) {
     const stmt = db.prepare(`
-      INSERT INTO models (id, vendor, service, label, first_seen, last_seen, created, updated,
+      INSERT INTO models (id, vendor, service, label, first_seen, last_seen, created, updated, pub_date,
                           context_window, max_completion_tokens, interfaces, description,
                           benchmark_elo, benchmark_mmlu, price_input, price_output, original_json, deleted_at)
-      VALUES (?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, NULL)
-      ON CONFLICT (id, vendor, service) DO UPDATE SET last_seen = excluded.last_seen
+      VALUES (?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, NULL)
+      ON CONFLICT (id, vendor, service) DO UPDATE SET
+        last_seen = excluded.last_seen,
+        pub_date = excluded.pub_date
     `);
 
    for (const model of changes.unchanged) {
@@ -264,6 +283,7 @@ function saveChanges(
         timestamp,
         model.created ?? null,
         model.updated ?? null,
+        model.pubDate ?? null,
         model.contextWindow ?? null,
         model.maxCompletionTokens ?? null,
         model.interfaces ? JSON.stringify(model.interfaces) : null,
```
```diff
@@ -310,6 +330,114 @@ function saveSyncHistory(
   );
 }
 
+// ============================================================================
+// Snapshot Export
+// ============================================================================
+
+interface CatalogModel {
+  id: string;
+  vendor: string;
+  service: string;
+  label: string;
+  pubDate: string | null;
+  firstSeen: string;
+  lastSeen: string;
+  deletedAt: string | null;
+  created: number | null;
+  updated: number | null;
+  contextWindow: number | null;
+  maxCompletionTokens: number | null;
+  interfaces: string[] | null;
+  description: string | null;
+  benchmarkElo: number | null;
+  priceInput: number | null;
+  priceOutput: number | null;
+}
+
+interface CatalogSnapshot {
+  schemaVersion: number;
+  exportedAt: string;
+  totalCount: number;
+  activeCount: number;
+  deletedCount: number;
+  byVendor: Record<string, number>;
+  models: CatalogModel[];
+}
+
+/** Dump the entire registry (active + soft-deleted) to a JSON file. Read-only on the DB. */
+function exportSnapshot(db: DatabaseSync, outPath: string): void {
+  const rows = db.prepare(`
+    SELECT id, vendor, service, label, pub_date, first_seen, last_seen, deleted_at,
+           created, updated, context_window, max_completion_tokens, interfaces, description,
+           benchmark_elo, price_input, price_output
+    FROM models
+    ORDER BY vendor, service, id
+  `).all() as unknown as Array<StoredModel & { interfaces: string | null }>;
+
+  const byVendor: Record<string, number> = {};
+  let activeCount = 0;
+  let deletedCount = 0;
+
+  const models: CatalogModel[] = rows.map((r) => {
+    byVendor[r.vendor] = (byVendor[r.vendor] || 0) + 1;
+    if (r.deleted_at) deletedCount++;
+    else activeCount++;
+
+    let parsedInterfaces: string[] | null = null;
+    if (r.interfaces) {
+      try {
+        const parsed = JSON.parse(r.interfaces);
+        if (Array.isArray(parsed)) parsedInterfaces = parsed;
+      } catch {
+        // leave null on parse failure
+      }
+    }
+
+    return {
+      id: r.id,
+      vendor: r.vendor,
+      service: r.service,
+      label: r.label,
+      pubDate: r.pub_date,
+      firstSeen: r.first_seen,
+      lastSeen: r.last_seen,
+      deletedAt: r.deleted_at,
+      created: r.created,
+      updated: r.updated,
+      contextWindow: r.context_window,
+      maxCompletionTokens: r.max_completion_tokens,
+      interfaces: parsedInterfaces,
+      description: r.description,
+      benchmarkElo: r.benchmark_elo,
+      priceInput: r.price_input,
+      priceOutput: r.price_output,
+    };
+  });
+
+  const snapshot: CatalogSnapshot = {
+    schemaVersion: 1,
+    exportedAt: new Date().toISOString(),
+    totalCount: rows.length,
+    activeCount,
+    deletedCount,
+    byVendor,
+    models,
+  };
+
+  // Write atomically: write to temp, then rename. Avoids partial reads if a consumer is watching.
+  const dir = path.dirname(path.resolve(outPath));
+  if (!fs.existsSync(dir)) fs.mkdirSync(dir, { recursive: true });
+  const tmpPath = `${outPath}.tmp`;
+  fs.writeFileSync(tmpPath, JSON.stringify(snapshot, null, 2));
+  fs.renameSync(tmpPath, outPath);
+
+  console.log(
+    `${COLORS.green}✓ Exported${COLORS.reset} ${rows.length} models ` +
+    `(${activeCount} active, ${deletedCount} deleted) ` +
+    `${COLORS.dim}-> ${path.resolve(outPath)}${COLORS.reset}`,
+  );
+}
+
 // ============================================================================
 // Change Detection
 // ============================================================================
```
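The temp-then-rename write at the end of `exportSnapshot` can be exercised standalone — the `writeJsonAtomic` helper name and the demo path are illustrative, but the pattern is the same one the function uses:

```typescript
import * as fs from 'node:fs';
import * as os from 'node:os';
import * as path from 'node:path';

// Standalone sketch of the atomic-write pattern used by exportSnapshot: write
// the full payload to a sibling temp file, then rename() it into place.
// rename() is atomic on the same filesystem, so a watching consumer never
// observes a half-written snapshot. The helper name is illustrative.
function writeJsonAtomic(outPath: string, data: unknown): void {
  const dir = path.dirname(path.resolve(outPath));
  if (!fs.existsSync(dir)) fs.mkdirSync(dir, { recursive: true });
  const tmpPath = `${outPath}.tmp`;
  fs.writeFileSync(tmpPath, JSON.stringify(data, null, 2));
  fs.renameSync(tmpPath, outPath); // atomic replace; no partial file is ever visible
}

const target = path.join(os.tmpdir(), 'llm-registry-demo.json');
writeJsonAtomic(target, { schemaVersion: 1, models: [] });
console.log(JSON.parse(fs.readFileSync(target, 'utf8')).schemaVersion); // 1
```

The rename step is also why the temp file sits next to the target rather than in the system temp directory: `rename()` is only atomic within one filesystem.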
```diff
@@ -353,6 +481,9 @@ function detectChanges(
       existingModel.context_window !== (model.contextWindow ?? null) ||
       existingModel.max_completion_tokens !== (model.maxCompletionTokens ?? null) ||
       existingModel.interfaces !== modelInterfaces;
+    // NOTE: pub_date intentionally EXCLUDED from change detection. On first run after upgrade,
+    // all rows go from NULL -> editorial value, which would fire ~hundreds of spurious "updated"
+    // notifications. The unchanged-touch path below silently backfills pub_date instead.
 
     if (hasChanged) {
       changes.updated.push(model);
```
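The exclusion in the NOTE above can be sketched in isolation — the two-field `Row` record and `hasMeaningfulChange` are illustrative simplifications of the script's actual comparison:

```typescript
// Illustrative sketch of the rule above: editorial-only fields (pub_date) are
// excluded from change detection, so a NULL -> '20250414' backfill does not
// count as an "updated" model and fires no notification.
interface Row {
  label: string;
  context_window: number | null;
  pub_date: string | null; // editorial; backfilled silently by the unchanged-touch path
}

function hasMeaningfulChange(existing: Row, incoming: Row): boolean {
  return existing.label !== incoming.label ||
    existing.context_window !== incoming.context_window;
  // pub_date intentionally not compared
}

const stored: Row = { label: 'GLM-4 32B', context_window: 131072, pub_date: null };
const synced: Row = { label: 'GLM-4 32B', context_window: 131072, pub_date: '20250414' };
console.log(hasMeaningfulChange(stored, synced)); // backfill alone is not a change
```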
```diff
@@ -542,6 +673,10 @@ function parseArgs(): CliOptions {
       case '--validate':
         options.validate = true;
         break;
+      case '--export-db':
+        options.exportDbPath = nextArg;
+        i++;
+        break;
     }
   }
 
@@ -566,6 +701,7 @@ ${COLORS.bright}Options:${COLORS.reset}
   --posthog-key <key>      PostHog API key for analytics
   --discord-webhook <url>  Discord webhook URL
   --notify-filters <list>  Comma-separated vendor list (e.g., openai,anthropic)
+  --export-db <path>       Read-only DB dump to JSON (no API calls, no sync). Run separately from sync.
   --help                   Show this help
 
 ${COLORS.bright}Examples:${COLORS.reset}
```
```diff
@@ -961,6 +1097,17 @@ async function main() {
   try {
     const options = parseArgs();
 
+    // --export-db: read-only DB dump. No config, no sync, no API calls.
+    if (options.exportDbPath) {
+      const db = initDatabase();
+      try {
+        exportSnapshot(db, options.exportDbPath);
+      } finally {
+        db.close();
+      }
+      return;
+    }
+
     let servicesConfig: Record<string, AixAPI_Access>;
 
     if (options.config) {
```
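A downstream consumer of the exported snapshot (shape per the `CatalogSnapshot` interface in the diff above) can recompute the summary fields from the `models` array as an integrity check after download — the `summarize` helper is hypothetical:

```typescript
// Hypothetical consumer-side sketch: recompute a CatalogSnapshot's summary
// fields (totalCount, activeCount, deletedCount, byVendor) from its models
// array. Field shapes mirror the interfaces introduced in the diff above.
interface SnapshotModel { vendor: string; deletedAt: string | null; }

function summarize(models: SnapshotModel[]) {
  const byVendor: Record<string, number> = {};
  let activeCount = 0;
  let deletedCount = 0;
  for (const m of models) {
    byVendor[m.vendor] = (byVendor[m.vendor] || 0) + 1;
    if (m.deletedAt) deletedCount++;
    else activeCount++;
  }
  return { totalCount: models.length, activeCount, deletedCount, byVendor };
}

const demo: SnapshotModel[] = [
  { vendor: 'openai', deletedAt: null },
  { vendor: 'openai', deletedAt: '2025-01-01' },
  { vendor: 'anthropic', deletedAt: null },
];
console.log(summarize(demo)); // recomputed counts should match the snapshot's own summary fields
```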