Compare commits

...

70 Commits

Author SHA1 Message Date
Enrico Ros 55bde68a4d Roll AIX 2026-05-05 04:17:39 -07:00
Enrico Ros 26ae3545a7 BlockOpUpstreamResume: full recovery. Fixes #1088 2026-05-05 04:14:00 -07:00
Enrico Ros 0001f7392b AIX: Gemini Interactions: relax 2026-05-05 03:32:13 -07:00
Enrico Ros d7e83e578b BlockOpUpstreamResume: remove cancel - unused? 2026-05-05 03:25:27 -07:00
Enrico Ros 901d93b5f0 LLMs/AIX: Gemini: Agentic models: recovery mode (non-streaming). Fixes #1088 2026-05-05 03:23:35 -07:00
Enrico Ros 6858b0b94a KB: LLMs: Gemini Interactions takeaways 2026-05-05 03:12:13 -07:00
Enrico Ros 9d88bf9b82 LLMs/AIX: Gemini: Agentic models: add option to disable visualizations. Fixes #1095 2026-05-05 03:06:30 -07:00
Enrico Ros 1bf1b744b9 llm-registry-sync: export models 2026-05-05 01:33:06 -07:00
Enrico Ros ee2d7114c7 llm-registry-sync: record/sync pub date
the next update won't have the spam (pub date not used for change detection)
2026-05-05 01:33:06 -07:00
Enrico Ros 3b1b54b3a3 KB: +llm-editorial 2026-05-05 01:33:06 -07:00
Enrico Ros 524029a882 Models List: show new (<30 days) models 2026-05-05 00:54:34 -07:00
Enrico Ros 69161d29a7 LLMs: Gemini typo 2026-05-05 00:29:13 -07:00
Enrico Ros 8a542c1af4 LLMs: display the pubDate 2026-05-05 00:16:01 -07:00
Enrico Ros fe16970624 LLMs: PubDates 2026-05-05 00:01:06 -07:00
Enrico Ros e21abdef45 LLMs: pubDate support 2026-05-04 13:48:29 -07:00
Enrico Ros acdbb2fbaf AIX: ContentReassembler: verbose post termination issues 2026-05-03 22:32:58 -07:00
Enrico Ros 14be134ef2 AIX: xAI: always request reasoning summaries. Fixes #1091 2026-05-03 14:40:48 -07:00
Enrico Ros f56f6eb3cd CLAUDE.md: branching hints 2026-05-03 14:27:59 -07:00
Enrico Ros d3a7b75d1c LLMs: Grok 4.3 support 2026-05-03 14:27:59 -07:00
Enrico Ros d5d7cf5a21 ContentFragments: do not display for empty 'ma' summaries or text. #1091 2026-05-03 14:27:59 -07:00
Enrico Ros 13b928d68b AIX: OpenAI Responses: non-fatal error if sealed
OpenAI sometimes emits a trailing 'error' event (e.g. rate-limit/TPM
advisory) AFTER 'response.completed'. The blanket error handler treated
it as fatal, calling setDialectTerminatingIssue which:
  - injected a red [Openai Issue] fragment into the finished message
  - overrode the prior setDialectEnded('done-dialect') with 'issue-dialect'
  - flipped the AIX outcome to 'failed', turning the Beam ray red

Track a #responseSealed flag set by the three terminal events
(response.completed/failed/incomplete) and short-circuit trailing 'error'
events with a server-log only - keeping mid-stream errors fatal as before.
2026-05-03 13:15:43 -07:00
Enrico Ros 31948a62f9 ChatDrawer: scroll active chat into view when filters clear 2026-05-03 13:15:43 -07:00
Enrico Ros bf2d00a936 AppChat: filter by open beams 2026-05-03 13:15:43 -07:00
Enrico Ros ed4edd7c0b AIX: Anthropic: disable sticky execution continuity from simple prior container presence. #1087 2026-04-28 19:25:08 -07:00
Enrico Ros e5de61d682 AIX: Anthropic: do not turn on code execution just for dynamic filtering. #1087 2026-04-28 18:24:00 -07:00
Enrico Ros ac69c62020 Sort LLM Categories by names 2026-04-28 17:49:00 -07:00
Enrico Ros a43b6a2cf5 AIX: Part xAI vs. OpenAI encrypted reasoning 2026-04-28 09:22:31 -07:00
Enrico Ros e8e3366fe2 AIX: XAI: enable encrypted reasoning (if disabled breaks subsequent turns) 2026-04-27 18:05:28 -07:00
Enrico Ros d813810a28 Anthropic: downgraded a throw to warn 2026-04-27 16:57:43 -07:00
Enrico Ros c400aa7543 Chat: hide expires while pending in BlockOpUpstreamResume 2026-04-27 01:13:13 -07:00
Enrico Ros 9fc0b39730 AIX: Transmit token stop errors, if provided 2026-04-24 17:08:40 -07:00
Enrico Ros 194bfe23a1 AIX: OpenAI: mark the need for roundtrip of hosted tool pairs 2026-04-24 17:08:40 -07:00
Enrico Ros 35110480ef Beam: Fix ghost columns. Fixes #1073 2026-04-24 16:04:29 -07:00
Enrico Ros 959595e33a Merge: smaller copy update 2026-04-24 16:04:29 -07:00
Enrico Ros a960424dfb Merge: copy update. Fixes #1083 2026-04-24 15:56:13 -07:00
Enrico Ros 0df6c7d08b Merge: copy. Fixes #1083 2026-04-24 15:48:56 -07:00
Enrico Ros 65c841e7a7 Roll AIX 2026-04-24 15:23:30 -07:00
Enrico Ros b21b8cc982 AIX: Anthropic: show refusal details, if present, as inline text 2026-04-24 15:20:10 -07:00
Enrico Ros aa2c4f06b7 AI Inspector: compress intermediate large string fields 2026-04-24 15:19:35 -07:00
Enrico Ros b8d7b4ec10 AIX: OpenAI: fix svs on !ma for NS 2026-04-24 15:19:35 -07:00
Enrico Ros c48520255a AIX: OpenAI: fix tool reparsing for NS 2026-04-24 15:19:34 -07:00
Enrico Ros 0790da989d Don't truncate the Beam Title on Edit. Fix #1085 part 1. 2026-04-24 15:19:34 -07:00
Enrico Ros 506d24d2fd AIX: OpenAI Response: fix reparse of tools 2026-04-24 15:19:34 -07:00
Enrico Ros 1348dbf493 AIX: update _upstreams 2026-04-24 15:19:33 -07:00
Enrico Ros ce677f3cd9 LLMs: OpenAI: GPT 5.5 2026-04-24 15:19:33 -07:00
Enrico Ros 39203d78e3 LLMs: OpenAI: hide lots of older models, so by default the latest are shown 2026-04-24 15:19:33 -07:00
Enrico Ros 2ef7daf369 LLMs: Gemini: hide 3.0 Pro (silently remapped to 3.1 by Gemini). Fixes #1082 2026-04-24 15:19:33 -07:00
Enrico Ros cff3d90613 AIX: DeepSeek V4: fix function calling 2026-04-24 05:45:53 -07:00
Enrico Ros 9f89243d7f AIX: DeepSeek V4: fix swallowing of tool parts 2026-04-24 05:45:53 -07:00
Enrico Ros 784ee9a4da AIX: DeepSeek V4: wires and parser NS 2026-04-24 05:45:53 -07:00
Enrico Ros 678e6b8ba1 AIX: Gemini Interactions: terminate on error 2026-04-24 05:45:53 -07:00
Enrico Ros 30e301c496 BlockOpUpstreamResume: Stop/Cancel 2026-04-24 03:59:50 -07:00
Enrico Ros b22904f6bb AIX: Gemini Interactions: Cancel + Delete
Also see: googleapis/python-genai#1971
2026-04-24 03:40:34 -07:00
Enrico Ros 3f0de7ddca CH: Auto-Title beam chats when done. Fixes #1078 2026-04-24 03:32:04 -07:00
Enrico Ros 9a6f0f9202 AppChat: never re-open an opened beam. Fixes #1079 2026-04-24 03:24:56 -07:00
Enrico Ros 4f0bae5657 AppChat: do not re-beam or regenerate while beam is open. Fixes #1079 2026-04-24 03:19:17 -07:00
Enrico Ros 2101f06195 Roll AIX 2026-04-24 03:04:09 -07:00
Enrico Ros 6d54b5594c Autotitle: Use natural capitalization. Fixes #1077 2026-04-24 02:48:28 -07:00
Enrico Ros 36b8e5b1df Chat: show Stop/Cancel on streaming upstream runs 2026-04-24 02:47:17 -07:00
Enrico Ros 8252d671c7 LLMs: Gemini: Deep Research models support images 2026-04-24 02:47:13 -07:00
Enrico Ros 30d97c94aa LLMs: DeepSeek: bits (note: vision is still not available) 2026-04-24 02:47:13 -07:00
Enrico Ros 82654a00d4 AIX: Streaming (hinting) review and Gemini Interactions API fix 2026-04-24 02:47:09 -07:00
Enrico Ros 9595f14ddc LLM: DeepSeek V4 (flash, pro) + thinking/reasoning_effort fix 2026-04-23 23:59:09 -07:00
Enrico Ros 8c496074b2 LLMs: DeepSeek: add V4 models 2026-04-23 23:30:41 -07:00
Enrico Ros 4d097d7136 LLMs: DeepSeek: add V4 support infra 2026-04-23 23:30:34 -07:00
Enrico Ros 178619d275 AI Settings: match the defaults description. Fixes #1076 2026-04-23 23:29:20 -07:00
Enrico Ros 59c8b2538d Merge pull request #1074 from tredondo/patch-1
chore: fix Zod 4 type-strictness issue (#1072)
2026-04-23 22:57:01 -07:00
Enrico Ros 443b72c52a AIX: OpenAI Responses: fix Zod 4 build error in tools .catch()
Bare `return;` produced `void`, which Zod 4 rejects for a
`.catch()` on `z.array(...).optional()` expecting `Tool[] | undefined`.
Return `undefined` explicitly, matching the existing pattern at
line 1204.

Fixes #1072
2026-04-23 22:56:19 -07:00
Enrico Ros ae13abef45 Nobody can tell @fredliubojin what to resume 2026-04-23 22:22:16 -07:00
Ted Robertson 83ae02ef9b chore: fix Zod 4 type-strictness issue (#1072) 2026-04-23 19:51:49 -07:00
85 changed files with 1941 additions and 434 deletions
+10
@@ -32,6 +32,16 @@ The `gh` command is available to interact with GitHub from the terminal, but **N
- **Always use `git mv` instead of `mv`** when renaming or moving files - preserves git history tracking
- **NEVER run `git stash`** - it causes work loss
**Branch contents:**
- `main` is the open-source build: local-first, BYO-keys, full AIX and provider coverage
- `dev` extends `main` with the hosted/cloud layer: auth, Zync sync, Cloud Fabric, Stripe, multi-tenant, admin pages; it is the recommended build for users, with the best user experience of any multi-model chat application
- Cloud/auth/sync code stays on `dev`; non-cloud improvements (UX, AIX, model support, bug fixes) can land on either branch
**Branch workflow:**
- `dev` is rebased on top of `main` (never merged) - `main` changes flow into `dev` on the next rebase, no manual forward-port needed
- Never `git merge` between the two branches - breaks the linear topology
- Backporting `dev` -> `main` is a re-implementation, never a cherry-pick - keep `main`-side edits minimal/additive so the existing `dev` version lands cleanly on rebase; split into small commits when natural
### Core Directory Structure
You are started from the root of the repository (i.e. where the git folder is or scripts should be run from).
+7
@@ -17,6 +17,13 @@ Architecture and system documentation is available in the `/kb/` knowledge base,
#### CSF - Client-Side Fetch
- **[CSF.md](systems/client-side-fetch.md)** - Direct browser-to-API communication for LLM requests
#### LLM - Language Model Metadata
- **[LLM-editorial-pubdate.md](modules/LLM-editorial-pubdate.md)** - Where we have editorial control over per-model metadata vs dynamic discovery; `pubDate` field semantics, propagation chain, resolution rules, per-vendor matrix
- **[LLM-models-catalog-pipeline.md](modules/LLM-models-catalog-pipeline.md)** - Forward-looking pipeline: extraction script, snapshot artifact, website consumption, future schema extensions
#### LLM - Vendor APIs
- **[LLM-gemini-interactions.md](modules/LLM-gemini-interactions.md)** - Gemini Interactions API (Deep Research): endpoints, status taxonomy, two retrieval paths (SSE replay vs JSON GET), known failure modes (10-min cuts, zombies), UI surface
### Systems Documentation
#### Core Platform Systems
+106
@@ -0,0 +1,106 @@
# LLM Editorial Control Surface
This document maps where Big-AGI has editorial control over per-model metadata (and therefore can guarantee fields like `pubDate`, curated `description`, `chatPrice`, `benchmark`, `parameterSpecs`, etc.) versus where it must rely on the vendor API's dynamic discovery (and therefore cannot guarantee them).
For the forward-looking pipeline (extraction script, snapshot, website consumption, future schema extensions), see [LLM-models-catalog-pipeline.md](LLM-models-catalog-pipeline.md).
## The `pubDate` field
`pubDate?: string` (validated as `/^\d{8}$/`, e.g. `'20250929'`) is **optional** in the wire schema and on `DLLM`. It was added to:
- `ModelDescription_schema` in `src/modules/llms/server/llm.server.types.ts` - the canonical wire type
- `OrtVendorLookupResult` in the same file - so OpenRouter inherits it via `llmOrt*Lookup`
- `DLLM` in `src/common/stores/llms/llms.types.ts` - the persisted client model
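For reference, a minimal sketch of the wire-schema shape; the `pubDate` regex and optionality are from this doc, while the surrounding field names are illustrative (the real schema lives in `src/modules/llms/server/llm.server.types.ts`):
```ts
import { z } from 'zod';

// Sketch only: the actual ModelDescription_schema has many more fields
const ModelDescription_schema = z.object({
  id: z.string(),
  label: z.string(),
  // ...other fields elided...
  pubDate: z.string().regex(/^\d{8}$/).optional(), // e.g. '20250929'
});

type ModelDescription = z.infer<typeof ModelDescription_schema>;
```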
### Where `pubDate` is guaranteed (always emitted)
- **Editorial entries** in 12 hybrid/editorial vendors (282 models). Hand-curated, externally corroborated. Future entries in these arrays are expected to include `pubDate`.
- **Anthropic 0-day placeholder** (`llmsAntCreatePlaceholderModel`): when the API surfaces an Anthropic model not in the editorial list, the placeholder uses the API's `created_at` ISO date, falling back to today via `formatPubDate()`.
- **Gemini 0-day fallback** (`geminiModelToModelDescription`): when the API returns a Gemini model not in `_knownGeminiModels`, the converter falls back to today via `formatPubDate()` (Gemini API does not expose a creation timestamp).
### Where `pubDate` is omitted (optional)
- **Symlink entries** (`KnownLink`) - inherit the target's `pubDate` via the merge logic in `fromManualMapping`.
- **Unknown variants resolved through `super`/`fallback`** in `fromManualMapping` for non-Anthropic/non-Gemini vendors - the field is left undefined rather than fabricated.
- **Dynamic-only vendors** (OpenRouter, TogetherAI, Novita, ChutesAI, FireworksAI, TLUS, Azure, LM Studio, LocalAI, FastAPI, ArceeAI, LLMAPI) - no editorial knob; pubDate flows in only when the underlying lookup or upstream API populates it.
The rationale: today's date is a defensible 0-day proxy only when we know we're seeing a brand-new model the vendor just announced (Anthropic and Gemini's "discovery via official model list" paths). For arbitrary dynamic vendors, fabricating today would mark old/well-known models as new - misleading. Better to omit.
### Propagation chain
- `fromManualMapping()` in `src/modules/llms/server/models.mappings.ts` - copies the field for OAI-style vendors when present
- `geminiModelToModelDescription()` in `src/modules/llms/server/gemini/gemini.models.ts` - copies for Gemini, falls back to today for unknowns
- `llmsAntCreatePlaceholderModel()` in `src/modules/llms/server/anthropic/anthropic.models.ts` - emits from API `created_at` (or today)
- `_mergeLookup()` in `src/modules/llms/server/openai/models/openrouter.models.ts` - merges for OpenRouter cross-vendor inheritance
- `_createDLLMFromModelDescription()` in `src/modules/llms/llm.client.ts` - copies onto the persisted DLLM when present
- `formatPubDate()` helper in `src/modules/llms/server/models.mappings.ts` - shared `'YYYYMMDD'` formatter for the 0-day-fillable paths
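A plausible shape for the shared formatter, assuming it simply renders a `Date` (defaulting to now) as `'YYYYMMDD'`; the actual implementation in `models.mappings.ts` may differ:
```ts
// Hedged sketch of the shared 'YYYYMMDD' formatter used by the 0-day-fillable paths
function formatPubDate(date: Date = new Date()): string {
  const y = date.getFullYear();
  const m = String(date.getMonth() + 1).padStart(2, '0');
  const d = String(date.getDate()).padStart(2, '0');
  return `${y}${m}${d}`; // matches /^\d{8}$/, e.g. '20250929'
}

// Example: the Anthropic 0-day placeholder prefers the API's created_at, else today
const pubDate = formatPubDate(/* new Date(api.created_at) when available */);
```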
### Semantics
`pubDate` is the **earliest public availability** of the model - the date on which the vendor first made this specific model usable by external users via any channel (consumer app, web, console, API, partner, open-weights upload).
It is **not**:
- The date Big-AGI added the entry to its catalog (Ollama uses `added` for that)
- The training-data cutoff (proposed but not implemented; see `src/common/stores/llms/llms.types.next.ts:217`)
- The date the model snapshot was built (suffixes like `-1212` may refer to build dates, but `pubDate` tracks public availability)
### Resolution rules (when sources conflict)
1. **Date-suffixed model IDs**: when the suffix matches a documented announcement, the suffix is canonical (vendor convention). xAI, OpenAI, and Mistral all use suffixes that closely track release dates.
2. **Anthropic exception**: Anthropic's date suffixes are typically the **snapshot/training-cutoff date, not the public release date**. For example, `claude-3-7-sonnet-20250219` was released on 2025-02-24, `claude-opus-4-20250514` was released 2025-05-22, and `claude-haiku-4-5-20251001` was released 2025-10-15. Always corroborate against Anthropic's blog/press for the actual release date. Only `claude-sonnet-4-5-20250929` and `claude-opus-4-1-20250805` have suffixes that match.
3. **Closed beta -> public beta -> GA**: use the first date *external* users could access the specific variant.
4. **Family-headline IDs and dated snapshots** (e.g., `claude-opus-4-1` and `claude-opus-4-1-20250805`): typically share a release date.
5. **Hosted on a third party** (Groq hosting Llama, OpenPipe mirroring others, OpenRouter aggregating): use the *underlying* model's original release date by its creator, not when the host added it.
6. **Symlinks** (entries with `symLink:`): inherit the target's date.
7. **Partial dates** (only month known): use the 1st of the month and tag as MEDIUM confidence in the editor's note.
## Editorial control matrix
Three categories:
- **Editorial** - the vendor file contains hand-curated entries; we control descriptions, pricing, benchmarks, interfaces, parameter specs, and `pubDate`.
- **Hybrid** - the API returns the live model list, and editorial entries (keyed by id/idPrefix) merge over the API data via `fromManualMapping`. We control everything except *which models exist*.
- **Dynamic** - the API is the only source of model identity and metadata. Big-AGI cannot reliably populate `pubDate` here (no editorial knob).
| Vendor | Category | File | Array | Entries | `pubDate` populated |
|---|---|---|---|---|---|
| Anthropic | Hybrid | `anthropic/anthropic.models.ts` | `hardcodedAnthropicModels` | 12 | 12/12 HIGH |
| Gemini | Hybrid | `gemini/gemini.models.ts` | `_knownGeminiModels` | 33 | 33/33 HIGH |
| OpenAI | Hybrid | `openai/models/openai.models.ts` | `_knownOpenAIChatModels` | 96 | 95/96 HIGH/MED (`osb-120b` skipped, speculative) |
| xAI | Hybrid | `openai/models/xai.models.ts` | `_knownXAIChatModels` | 13 | 13/13 HIGH (pilot) |
| Mistral | Hybrid | `openai/models/mistral.models.ts` | `_knownMistralModelDetails` | 41 | 41/41 (40 HIGH, 1 MED for legacy `mistral-medium`) |
| Moonshot (Kimi) | Hybrid | `openai/models/moonshot.models.ts` | `_knownMoonshotModels` | 13 | 13/13 (10 HIGH, 3 MED for v1 base models) |
| Perplexity | Editorial | `openai/models/perplexity.models.ts` | `_knownPerplexityChatModels` | 4 | 4/4 HIGH |
| MiniMax | Editorial | `openai/models/minimax.models.ts` | `_knownMiniMaxModels` | 10 | 10/10 HIGH |
| DeepSeek | Hybrid | `openai/models/deepseek.models.ts` | `_knownDeepseekChatModels` | 4 | 4/4 HIGH |
| Groq | Hybrid (host) | `openai/models/groq.models.ts` | `_knownGroqModels` | 11 | 11/11 HIGH (underlying-model date) |
| Z.AI / GLM | Hybrid | `openai/models/zai.models.ts` | `_knownZAIModels` | 17 | 16/17 (`glm-5-code` UNCONFIRMED) |
| OpenPipe | Editorial (mirror) | `openai/models/openpipe.models.ts` | `_knownOpenPipeChatModels` | 30 | 30/30 HIGH (all upstream-mirror, no OpenPipe originals) |
| Bedrock | Reuses Anthropic | `bedrock/bedrock.models.ts` | -> `hardcodedAnthropicModels` | (12) | inherited |
| Ollama | Editorial (catalog) | `ollama/ollama.models.ts` | `OLLAMA_BASE_MODELS` | 209 | **deferred** - see notes |
| Arcee AI | Dynamic | `openai/models/arceeai.models.ts` | `_arceeKnownModels` | 0 | n/a (empty) |
| LLMAPI | Dynamic | `openai/models/llmapi.models.ts` | `_llmapiKnownModels` | 0 | n/a (empty) |
| Alibaba | Dynamic | `openai/models/alibaba.models.ts` | `_knownAlibabaChatModels` | 0 | n/a (empty) |
| OpenRouter | Dynamic + delegated lookup | `openai/models/openrouter.models.ts` | (parser) | -- | inherited via `llmOrt*Lookup` |
| TogetherAI | Dynamic | `openai/models/together.models.ts` | (parser) | -- | no |
| FireworksAI | Dynamic | `openai/models/fireworksai.models.ts` | (parser) | -- | no |
| Novita | Dynamic | `openai/models/novita.models.ts` | (parser) | -- | no |
| ChutesAI | Dynamic | `openai/models/chutesai.models.ts` | (parser) | -- | no |
| TLUS | Dynamic | `openai/models/tlusapi.models.ts` | (parser) | -- | no |
| Azure | Dynamic | `openai/models/azure.models.ts` | (parser) | -- | no |
| LM Studio | Dynamic | `openai/models/lmstudio.models.ts` | (parser) | -- | no |
| LocalAI | Dynamic | `openai/models/localai.models.ts` | (parser) | -- | no |
| FastAPI | Dynamic | `openai/models/fastapi.models.ts` | (parser) | -- | no |
**Totals**: 284 editorial entries across 12 vendors, of which **282** have corroborated `pubDate` and **2** are intentional gaps (`osb-120b` speculative, `glm-5-code` not yet announced). All 12 vendor files type-check clean.
### Notes
- **Hybrid** vendors are still effectively editorial for the models we know about: when an API id matches a hardcoded `idPrefix` (or `id`), `fromManualMapping` injects all the editorial fields. Unknown ids fall through to a default-shaped placeholder where `pubDate` is undefined.
- **OpenRouter** delegates back to Anthropic / Gemini / OpenAI editorial lookups via `llmOrtAntLookup_ThinkingVariants`, `llmOrtGemLookup`, `llmOrtOaiLookup`. `pubDate` flows through these lookups, so OpenRouter-served Claude/Gemini/GPT models get `pubDate` automatically once the underlying editorial entry has it.
- **Bedrock** finds Anthropic editorial via `llmBedrockFindAnthropicModel` and strips unsupported interfaces - `pubDate` inherits from Anthropic.
- **Ollama** is deferred: 209 entries keyed by upstream model family (e.g. `qwen3.6`, `kimi-k2`, `glm-4.6`). Each entry's `pubDate` would need to be the upstream creator's release date (Meta, Alibaba, Moonshot, Z.AI, etc.). This is large-scale upstream research; better handled in a follow-up pass once cross-vendor `pubDate` data is consolidated and reusable.
- **Dynamic-only** vendors get nothing automatic. To add `pubDate` for them we'd have to seed editorial entries (which is what `fromManualMapping`'s mapping mechanism was built for); this is a per-vendor decision and out of scope for the initial rollout.
+88
@@ -0,0 +1,88 @@
# Gemini Interactions API
The Interactions API powers Gemini's agent runs (Deep Research today, more agent types planned). This doc is the source of truth for protocol shape, failure modes, and the recovery model — code comments link here instead of repeating the rationale.
## References
- **GH [#1088](https://github.com/enricoros/big-AGI/issues/1088)** — Auto-resume for Deep Research; Recover button
- **GH [#1095](https://github.com/enricoros/big-AGI/issues/1095)** — Visualizations toggle (`agent_config.visualization`)
- **Google forum [143098](https://discuss.ai.google.dev/t/interactions-api-connection-breaks-at-the-10-minutes-mark/143098)** — 10-min SSE cut
- **Google forum [143099](https://discuss.ai.google.dev/t/streaming-resume-broken-on-interactions-api-deep-research-often-cannot-resume/143099)** — Streaming resume re-cuts
- **Upstream specs** — `_upstream/gemini.interactions.spec.md`, `gemini.interactions.guide.md`, `gemini.deep-research.guide.md`
## Endpoints
| Verb | Path | Purpose |
|--------|-------------------------------------------|-------------------------------------------------------------------|
| POST | `/v1beta/interactions` | Start a run. We always send `stream:true, background:true, store:true` |
| GET | `/v1beta/interactions/{id}?stream=true` | Reattach via SSE replay (full event sequence from start) |
| GET | `/v1beta/interactions/{id}` | Fetch the resource as JSON (one-shot) |
| POST | `/v1beta/interactions/{id}/cancel` | Stop a background run |
| DELETE | `/v1beta/interactions/{id}` | Remove the stored record (does NOT cancel an in-flight run) |
Retention: 1 day free, 55 days paid.
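A minimal sketch of how a client might hit these endpoints; the paths and the `stream`/`background`/`store` flags are from the table above, while the base URL, header handling, and body shape are assumptions:
```ts
const BASE = 'https://generativelanguage.googleapis.com/v1beta';
const headers = { 'x-goog-api-key': process.env.GEMINI_API_KEY!, 'Content-Type': 'application/json' };

// Start a run: always stream + background + store, so the run survives a dropped connection
async function startInteraction(input: object, agentConfig?: object): Promise<Response> {
  return fetch(`${BASE}/interactions`, {
    method: 'POST',
    headers,
    body: JSON.stringify({ input, agent_config: agentConfig, stream: true, background: true, store: true }),
  }); // the response body is the SSE stream
}

// Reattach via SSE replay (full event sequence from the start)
const resumeSSE = (id: string) => fetch(`${BASE}/interactions/${id}?stream=true`, { headers });

// One-shot JSON fetch (the Recover path)
const recoverJSON = (id: string) => fetch(`${BASE}/interactions/${id}`, { headers });

// Stop a background run, then remove the stored record (DELETE alone does NOT cancel)
const cancel = (id: string) => fetch(`${BASE}/interactions/${id}/cancel`, { method: 'POST', headers });
const remove = (id: string) => fetch(`${BASE}/interactions/${id}`, { method: 'DELETE', headers });
```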
## Status taxonomy
| Status | Meaning | Handling |
|-------------------|-----------------------------------------------|-------------------------------------------------------|
| `in_progress` | Live run **or** zombie (see C) | Surface diagnostics; offer Resume/Recover/Stop |
| `completed` | Done with content in `outputs[]` | Emit fragments, `tokenStopReason='ok'` |
| `failed` | Server-side failure | Terminating issue |
| `cancelled` | We or another client cancelled | Close as `cg-issue` |
| `incomplete` | Stopped early (token limit) — partial outputs | Note + `tokenStopReason='out-of-tokens'` |
| `requires_action` | Not expected for Deep Research | Fail loudly so we notice |
## Two retrieval paths
| Path | Endpoint | Parser | Use case |
|-----------------------|-----------------------------------|-------------------------------------------|-----------------------------------|
| SSE replay | `GET ?stream=true` | `createGeminiInteractionsParserSSE` | Canonical resume; live deltas |
| JSON GET (recovery) | `GET` (no `stream`) | `createGeminiInteractionsParserNS` | Recover when SSE is broken |
Both replay from the start — `ContentReassembler` REPLACES content on reattach, so partial replay (`last_event_id`) is intentionally NOT used. The NS parser walks `outputs[]` (thoughts, text, images, audio) and emits the same particles the SSE parser would, in one batch.
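In code the branch boils down to a mode switch; a sketch, reusing `BASE` and `headers` from the Endpoints sketch above and assuming a mode value shaped like the `AixReattachMode` (`'replay' | 'snapshot'`) seen in the client diffs below:
```ts
type AixReattachMode = 'replay' | 'snapshot';

// Sketch: choose the retrieval path for an existing interaction id.
// Both paths rebuild content from the start; ContentReassembler replaces rather
// than appends, which is why partial replay via last_event_id is deliberately unused.
async function reattach(id: string, mode: AixReattachMode): Promise<Response> {
  const url = `${BASE}/interactions/${id}` + (mode === 'replay' ? '?stream=true' : '');
  const res = await fetch(url, { headers });
  // mode === 'replay'  : pipe res.body into createGeminiInteractionsParserSSE (live deltas)
  // mode === 'snapshot': parse res.json() with createGeminiInteractionsParserNS (one batch over outputs[])
  return res;
}
```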
## Failure modes
### A. 10-minute SSE cut (forum 143098)
The SSE connection gets cut at exactly 600 s, regardless of activity. The cut is malformed (JSON error array instead of clean SSE close) and we treat it as stream-closed-early. The run typically **continues** server-side and reaches `completed`. **Recover (JSON GET)** retrieves the full report.
### B. Streaming resume re-cuts (forum 143099)
A fresh SSE replay can re-cut at the same 10-minute boundary on long runs, so Resume alone never reaches `interaction.complete`. **Recover** is the fallback.
### C. Zombie interactions (#1088)
Resource sits in `status: in_progress` for **days** with `outputs: []` — the generator crashed but the status never transitioned. **Not recoverable** (no data was ever produced). The NS parser surfaces `created`, `updated`, output count, and a "stuck for over an hour" hint so the user can decide to delete and retry.
### D. Connection drop mid-run
Network blip; resource is fine. **Resume (SSE replay)** picks up cleanly.
## UI
`BlockOpUpstreamResume` renders up to three buttons:
| Button | Action | Shown when |
|----------|-----------------------------------|---------------------------------------------------------|
| Resume | SSE replay | `onResume` provided |
| Recover | JSON GET (one-shot) | `upstreamHandle.uht` in `_NS_RECOVER_UHTS` |
| Stop | Cancel + delete upstream resource | `onDelete` provided |
The Recover gate is an inline `uht === 'vnd.gem.interactions'` check in `BlockOpUpstreamResume.tsx` — extend when another vendor needs the same fallback. Stop is intentionally NOT gated by Resume/Recover busy state — it's the escape hatch for hung resumes.
## Visualization control (#1095)
Deep Research accepts `agent_config.visualization: 'auto' | 'off'`. Exposed as `llmVndGeminiAgentViz` (label "Visualizations"). Forwarded only when explicitly `'off'` so the upstream `'auto'` default stays untouched. Useful when merging multiple reports — image fragments break Beam fusion.
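A sketch of the forwarding rule; the option name is from this doc, while the request-body assembly around it is an assumption:
```ts
// Only an explicit 'off' is forwarded; leaving the key absent preserves upstream's 'auto' default
function buildAgentConfig(llmVndGeminiAgentViz?: 'auto' | 'off') {
  const agent_config: Record<string, unknown> = { /* ...other agent settings... */ };
  if (llmVndGeminiAgentViz === 'off')
    agent_config.visualization = 'off';
  return agent_config;
}
```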
## Code map
| File | Role |
|--------------------------------------------------------------------------------------|-------------------------------------------------------|
| `aix/server/dispatch/wiretypes/gemini.interactions.wiretypes.ts` | Zod schemas (RequestBody, Interaction, StreamEvent) |
| `aix/server/dispatch/chatGenerate/adapters/gemini.interactionsCreate.ts` | POST body (input + agent_config) |
| `aix/server/dispatch/chatGenerate/parsers/gemini.interactions.parser.ts` | SSE parser + NS parser |
| `aix/server/dispatch/chatGenerate/chatGenerate.dispatch.ts` (`gemini` case) | Resume dispatch: SSE vs JSON branch |
| `apps/chat/components/message/BlockOpUpstreamResume.tsx` | Resume / Recover / Stop UI |
| `apps/chat/components/ChatMessageList.tsx` (`handleMessageUpstreamResume`) | Wires click handler to `aixReattachContent_DMessage_orThrow` |
+78
@@ -0,0 +1,78 @@
# LLM Models Catalog Pipeline (forward-looking)
Status: **proposal / partially implemented**. Companion to [LLM-editorial-pubdate.md](LLM-editorial-pubdate.md), which describes the durable reference (`pubDate` semantics, editorial-vs-dynamic matrix, propagation chain).
This document captures the forward-looking pipeline that turns Big-AGI's editorial model metadata into website value-add (plots, decision helpers, comparison tools at big-agi.com).
## Goal
Stand up a database/datastore that the website (`~/dev/website`) can query for plots, decision helpers, and comparison tools - without requiring the website to call our authenticated tRPC endpoints.
## Stages
### Stage 1: source of truth (in this repo) — DONE
Editorial files in `src/modules/llms/server/` remain the canonical source for:
- Identity: id, label, vendor
- Capabilities: `interfaces`, `parameterSpecs`, `contextWindow`, `maxCompletionTokens`
- Pricing: `chatPrice` (input / output / cache tiers)
- Benchmarks: `benchmark.cbaElo` (Chat Bot Arena ELO)
- Lifecycle: `pubDate`, `isLegacy`, `isPreview`, `hidden`, deprecation comments
Well-typed, version-controlled, reviewed - every model edit is a code change with diff history. 282 entries currently carry `pubDate` (see editorial-control matrix).
### Stage 2: extraction script — IN PROGRESS
A build-time script (e.g. `scripts/llms/export-models.ts`) that:
1. Loads every editorial vendor's model array.
2. Normalizes per-vendor shapes (array vs Record, `id` vs `idPrefix`, `KnownLink` symlinks) to a single row format.
3. Resolves symlinks (target's `pubDate` flows through).
4. Writes a single JSON snapshot: `data/models-catalog.json` (one row per model, with vendor + the editorial fields above).
Open question: do we want this committed (gives the website a stable artifact / public URL) or built on-demand in CI? **Recommend committed snapshot** under `data/` so consumers get a stable URL.
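A sketch of the Stage 2 exporter under stated assumptions: the row type, the symlink marker, and the vendor-loader shape are illustrative, not the shipped code. The versioned envelope also addresses open question 4 below.
```ts
import { writeFileSync } from 'node:fs';

// Illustrative row shape: one flattened record per model
interface CatalogRow {
  vendor: string;
  id: string;
  label: string;
  pubDate?: string;       // 'YYYYMMDD'
  contextWindow?: number;
  chatPrice?: unknown;
  benchmark?: { cbaElo?: number };
}

// Hypothetical editorial entry: either a full model or a symlink to another id
type EditorialEntry = CatalogRow | { id: string; symLink: string };

function exportCatalog(vendors: Record<string, EditorialEntry[]>) {
  const rows: CatalogRow[] = [];
  for (const [vendor, entries] of Object.entries(vendors)) {
    const byId = new Map(entries.map((e) => [e.id, e] as [string, EditorialEntry]));
    for (const entry of entries) {
      // resolve symlinks: the target's pubDate (and other fields) flow through
      const resolved = 'symLink' in entry ? byId.get(entry.symLink) : entry;
      if (!resolved || 'symLink' in resolved) continue; // skip dangling/chained links
      rows.push({ ...resolved, vendor, id: entry.id });
    }
  }
  // versioned envelope so website consumers can tolerate schema evolution
  const snapshot = { schemaVersion: 1, generatedAt: new Date().toISOString(), models: rows };
  writeFileSync('data/models-catalog.json', JSON.stringify(snapshot, null, 2));
}
```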
### Stage 3: enrichment — NOT STARTED
The exported snapshot gets enriched with data we don't currently track in editorial files:
- **Knowledge cutoff** (proposed in `llms.types.next.ts:217` but never implemented; should be added to `ModelDescription_schema` as a follow-up).
- **MMLU / HumanEval / SWE-bench / GPQA / MATH** scores (currently only `cbaElo`; richer benchmarks belong in a separate block).
- **Throughput / latency** numbers (per-vendor, possibly per-region).
- **Modalities matrix** (input image, input audio, input video, input PDF, output image, output audio).
- **Weights availability** (closed / open / restricted), license.
Sources for enrichment: HuggingFace cards, vendor docs, Artificial Analysis, LLM-Stats, official benchmarks. Some can be scraped on a cadence; others need editorial review.
### Stage 4: website consumption — NOT STARTED
The website (`~/dev/website`) consumes the snapshot to render:
- **Timeline plot**: `pubDate` (x-axis) vs `cbaElo` (y-axis), grouped by vendor - shows the frontier and rate of progress.
- **Cost-per-quality plot**: `chatPrice.output` vs `cbaElo` - "best model per dollar".
- **Decision helpers**: filter by capability (`interfaces`), context window, pricing tier, vendor.
- **Comparison cards**: side-by-side specs.
- **Lifecycle alerts**: deprecation warnings for retiring models.
## Open questions
1. **Where does enrichment data live?** A separate `data/models-enrichment.json` (joined by id at build time) keeps editorial files clean but introduces a join surface. Alternative: extend `ModelDescription_schema` with optional enrichment fields and treat editorial files as the only source. Recommend the separate file approach - editorial files stay focused on vendor-API integration; enrichment evolves on a different cadence.
2. **How fresh does the website need to be?** If daily, build the snapshot in CI on push and publish to a static URL. If real-time, consume tRPC directly - more work but fewer freshness gaps.
3. **Do we expose `pubDate` and other editorial metadata via tRPC publicly, or only via the snapshot?** The current tRPC routes require auth; the website should consume the snapshot, not live tRPC.
4. **Schema versioning** - if `ModelDescription_schema` evolves, the snapshot consumers need to be tolerant. Include a `schemaVersion` field in the snapshot envelope.
## Future extensions to `ModelDescription_schema`
Beyond `pubDate`, the natural follow-ups (in priority order):
1. **`knowledgeCutoff?: string`** (`'YYYY-MM'` or `'YYYY-MM-DD'`) - already proposed in `llms.types.next.ts`. Useful for the timeline plot and for context-aware prompts.
2. **`deprecationDate?: string`** - currently exists informally as `deprecated?: string` on `_knownGeminiModels`; should be promoted to the schema.
3. **`license?: string`** - especially important for open-weights models (apache-2.0, mit, llama-community, custom).
4. **`weights?: 'closed' | 'open' | 'restricted'`** - quick filter for "can I run this myself?".
5. **`benchmarks?: { mmlu?: number, humaneval?: number, gpqa?: number, ... }`** - richer than the current `cbaElo`-only block.
6. **`modalities?: { in: string[], out: string[] }`** - more precise than `interfaces` for input/output capability matrices.
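Taken together, the proposed additions would look roughly like this on the wire type; all names below are proposals from the list above, not implemented fields:
```ts
// Proposal sketch: optional extensions to ModelDescription_schema, per the priority list above
interface ModelDescriptionExtensions {
  pubDate?: string;                            // 'YYYYMMDD' (implemented)
  knowledgeCutoff?: string;                    // 'YYYY-MM' or 'YYYY-MM-DD'
  deprecationDate?: string;                    // promotes the informal `deprecated` field
  license?: string;                            // e.g. 'apache-2.0', 'mit', 'llama-community'
  weights?: 'closed' | 'open' | 'restricted';  // quick filter: "can I run this myself?"
  benchmarks?: { mmlu?: number; humaneval?: number; gpqa?: number };
  modalities?: { in: string[]; out: string[] };
}
```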
+6 -4
@@ -583,9 +583,11 @@ export function AppChat() {
}, []);
useGlobalShortcuts('AppChat', React.useMemo(() => [
// focused conversation
{ key: 'z', ctrl: true, shift: true, disabled: isFocusedChatEmpty, action: handleMessageRegenerateLastInFocusedPane, description: 'Retry' },
{ key: 'b', ctrl: true, shift: true, disabled: isFocusedChatEmpty, action: handleMessageBeamLastInFocusedPane, description: 'Beam Edit' },
// focused conversation (excluded when Beam is open so the keystroke passes through to the browser)
...(beamOpenStoreInFocusedPane ? [] : [
{ key: 'z', ctrl: true, shift: true, disabled: isFocusedChatEmpty, action: handleMessageRegenerateLastInFocusedPane, description: 'Retry' },
{ key: 'b', ctrl: true, shift: true, disabled: isFocusedChatEmpty, action: handleMessageBeamLastInFocusedPane, description: 'Beam Edit' },
]),
{ key: 'o', ctrl: true, action: handleConversationsImportFormFilePicker },
{ key: 's', ctrl: true, action: () => handleFileSaveConversation(focusedPaneConversationId) },
{ key: 'n', ctrl: true, shift: true, action: () => handleConversationNewInFocusedPane(false, false) },
@@ -603,7 +605,7 @@ export function AppChat() {
{ key: 'p', ctrl: true, action: () => personaDropdownRef.current?.openListbox() /*, description: 'Open Persona Dropdown'*/ },
// focused conversation llm
{ key: 'o', ctrl: true, shift: true, action: handleOpenChatLlmOptions },
], [focusedPaneConversationId, handleConversationNewInFocusedPane, handleConversationReset, handleConversationsImportFormFilePicker, handleDeleteConversations, handleFileSaveConversation, handleMessageBeamLastInFocusedPane, handleMessageRegenerateLastInFocusedPane, handleMoveFocus, handleNavigateHistoryInFocusedPane, handleOpenChatLlmOptions, isFocusedChatEmpty]));
], [beamOpenStoreInFocusedPane, focusedPaneConversationId, handleConversationNewInFocusedPane, handleConversationReset, handleConversationsImportFormFilePicker, handleDeleteConversations, handleFileSaveConversation, handleMessageBeamLastInFocusedPane, handleMessageRegenerateLastInFocusedPane, handleMoveFocus, handleNavigateHistoryInFocusedPane, handleOpenChatLlmOptions, isFocusedChatEmpty]));
return <>
+52 -15
@@ -6,6 +6,7 @@ import { Box, List } from '@mui/joy';
import type { SystemPurposeExample } from '../../../data';
import type { AixReattachMode } from '~/modules/aix/client/aix.client';
import type { DiagramConfig } from '~/modules/aifn/digrams/DiagramsModal';
import { speakText } from '~/modules/speex/speex.client';
@@ -123,7 +124,16 @@ export function ChatMessageList(props: {
}
}, [conversationHandler, conversationId, onConversationExecuteHistory]);
const handleMessageUpstreamResume = React.useCallback(async (generator: DMessageGenerator, messageId: DMessageId) => {
// Resume in-flight tracking - lives at this level (NOT inside BlockOpUpstreamResume) so it
// survives any remount of the message bubble during a long-running stream (e.g. Deep Research).
// - `resumeInFlight` (state) drives the loading/Detach UI on BlockOpUpstreamResume via props.
// - `resumeAbortersRef` (ref) holds the AbortController so Detach can abort even after a remount.
// Map keyed by messageId so multiple messages could in principle resume concurrently.
const [resumeInFlight, setResumeInFlight] = React.useState<Record<DMessageId, AixReattachMode>>({});
const resumeAbortersRef = React.useRef<Map<DMessageId, AbortController>>(new Map());
const handleMessageUpstreamResume = React.useCallback(async (generator: DMessageGenerator, messageId: DMessageId, mode: AixReattachMode) => {
if (!conversationId || !conversationHandler) return;
if (!generator.upstreamHandle) throw new Error('No upstream handle on generator');
@@ -131,20 +141,36 @@ export function ChatMessageList(props: {
const llmId = generator.mgt === 'aix' ? generator.aix.mId : undefined;
if (!llmId) throw new Error('No model id on generator');
const controller = new AbortController();
resumeAbortersRef.current.set(messageId, controller);
setResumeInFlight(prev => ({ ...prev, [messageId]: mode }));
const { aixCreateChatGenerateContext, aixReattachContent_DMessage_orThrow } = await import('~/modules/aix/client/aix.client');
const result = await aixReattachContent_DMessage_orThrow(
llmId,
generator,
aixCreateChatGenerateContext('conversation', conversationId),
{ abortSignal: 'NON_ABORTABLE', throttleParallelThreads: 0 },
async (update, isDone) => {
conversationHandler.messageEdit(messageId, {
fragments: update.fragments,
generator: update.generator,
pendingIncomplete: update.pendingIncomplete,
}, isDone, isDone); // remove the pending state and updte only when done
},
);
try {
await aixReattachContent_DMessage_orThrow(
llmId,
generator,
aixCreateChatGenerateContext('conversation', conversationId),
mode,
{ abortSignal: controller.signal, throttleParallelThreads: 0 }, // Detach: aborting kills the local fetch; upstream run keeps going.
async (update, isDone) => {
conversationHandler.messageEdit(messageId, {
fragments: update.fragments,
generator: update.generator,
pendingIncomplete: update.pendingIncomplete,
}, isDone, isDone); // remove the pending state and update only when done
},
);
} finally {
// Clear local tracking only if this attempt is still the current one (avoid races on rapid retry)
if (resumeAbortersRef.current.get(messageId) === controller)
resumeAbortersRef.current.delete(messageId);
setResumeInFlight(prev => {
if (prev[messageId] !== mode) return prev;
const { [messageId]: _, ...rest } = prev;
return rest;
});
}
// Manual reattach is one-shot: on failure (e.g. upstream 404 from expired or already-consumed handle),
// drop the upstreamHandle so the Resume button doesn't keep luring the user into the same error.
@@ -156,6 +182,11 @@ export function ChatMessageList(props: {
// }, false /* messageComplete */, true /* touch */);
}, [conversationHandler, conversationId]);
const handleMessageUpstreamDetach = React.useCallback((messageId: DMessageId) => {
resumeAbortersRef.current.get(messageId)?.abort();
}, []);
const handleMessageUpstreamDelete = React.useCallback(async (generator: DMessageGenerator, messageId: DMessageId) => {
if (!conversationId || !conversationHandler) return;
if (!generator.upstreamHandle) throw new Error('No upstream handle on generator');
@@ -395,7 +426,11 @@ export function ChatMessageList(props: {
{filteredMessages.map((message, idx) => {
// Optimization: only memo complete components, or we'd be memoizing garbage
// Optimization: only memo complete components, or we'd be memoizing garbage (fragments
// change every chunk during streaming, so the equality check would always fail).
// CAVEAT: switching between memo and non-memo at the same position causes React to
// remount the subtree (different component types). Any state that must survive that
// boundary lives on this component (e.g. resumeInFlight, resumeAbortersRef).
const ChatMessageMemoOrNot = !message.pendingIncomplete ? ChatMessageMemo : ChatMessage;
return props.isMessageSelectionMode ? (
@@ -427,7 +462,9 @@ export function ChatMessageList(props: {
onMessageBranch={handleMessageBranch}
onMessageContinue={handleMessageContinue}
onMessageUpstreamResume={handleMessageUpstreamResume}
onMessageUpstreamDetach={handleMessageUpstreamDetach}
onMessageUpstreamDelete={handleMessageUpstreamDelete}
upstreamResumeMode={resumeInFlight[message.id]}
onMessageDelete={handleMessageDelete}
onMessageFragmentAppend={handleMessageAppendFragment}
onMessageFragmentDelete={handleMessageDeleteFragment}
@@ -33,7 +33,10 @@ const _styles = {
} as const,
'& nav > ol > li:first-of-type': {
overflow: 'hidden',
maxWidth: { xs: '110px', md: '140px' },
// allow the chat title to use available space, shrinking gracefully when the bar is narrow
// NOTE: already performed by virtue of the breadcrumb having agi-ellipsize on the crumbs
// flexShrink: 1,
// minWidth: '60px',
} as const,
} as const,
@@ -15,6 +15,7 @@ import { KeyStroke } from '~/common/components/KeyStroke';
import { OptimaBarControlMethods, OptimaBarDropdownMemo, OptimaDropdownItems } from '~/common/layout/optima/bar/OptimaBarDropdown';
import { findModelsServiceOrNull } from '~/common/stores/llms/store-llms';
import { isDeepEqual } from '~/common/util/hooks/useDeep';
import { sortLLMsByServiceLabel } from '~/common/stores/llms/components/llms.dropdown.utils';
import { optimaActions, optimaOpenModels } from '~/common/layout/optima/useOptima';
import { useAllLLMs } from '~/common/stores/llms/hooks/useAllLLMs';
import { useModelDomain } from '~/common/stores/llms/hooks/useModelDomain';
@@ -72,7 +73,10 @@ function LLMDropdown(props: {
return lcFilterString ? true : isLLMVisible(llm);
});
for (const llm of filteredLLMs) {
// sort by service label so vendor groups appear alphabetically (groups remain contiguous because sort is stable on equal keys)
const sortedLLMs = sortLLMsByServiceLabel(filteredLLMs);
for (const llm of sortedLLMs) {
// add separators when changing services
if (!prevServiceId || llm.sId !== prevServiceId) {
const vendor = findModelVendor(llm.vId);
@@ -16,6 +16,7 @@ import MoreVertIcon from '@mui/icons-material/MoreVert';
import StarOutlineRoundedIcon from '@mui/icons-material/StarOutlineRounded';
import type { DConversationId } from '~/common/stores/chat/chat.conversation';
import { ChatBeamIcon } from '~/common/components/icons/ChatBeamIcon';
import { CloseablePopup } from '~/common/components/CloseablePopup';
import { DFolder, useFolderStore } from '~/common/stores/folders/store-chat-folders';
import { DebouncedInputMemo } from '~/common/components/DebouncedInput';
@@ -89,6 +90,7 @@ function ChatDrawer(props: {
// external state
const {
clearFilters,
filterHasBeamOpen, toggleFilterHasBeamOpen,
filterHasDocFragments, toggleFilterHasDocFragments,
filterHasImageAssets, toggleFilterHasImageAssets,
filterHasStars, toggleFilterHasStars,
@@ -98,7 +100,7 @@ function ChatDrawer(props: {
} = useChatDrawerFilters();
const { activeFolder, allFolders, enableFolders, toggleEnableFolders } = useFolders(props.activeFolderId);
const { filteredChatsCount, filteredChatIDs, filteredChatsAreEmpty, filteredChatsBarBasis, filteredChatsIncludeActive, renderNavItems } = useChatDrawerRenderItems(
props.activeConversationId, props.chatPanesConversationIds, debouncedSearchQuery, activeFolder, allFolders, filterHasStars, filterHasImageAssets, filterHasDocFragments, filterIsArchived, navGrouping, searchSorting, showRelativeSize, searchDepth,
props.activeConversationId, props.chatPanesConversationIds, debouncedSearchQuery, activeFolder, allFolders, filterHasBeamOpen, filterHasStars, filterHasImageAssets, filterHasDocFragments, filterIsArchived, navGrouping, searchSorting, showRelativeSize, searchDepth,
);
const [uiComplexityMode, contentScaling] = useUIPreferencesStore(useShallow((state) => [state.complexityMode, state.contentScaling]));
const zenMode = uiComplexityMode === 'minimal';
@@ -240,6 +242,10 @@ function ChatDrawer(props: {
<ListItemDecorator>{filterHasDocFragments && <CheckRoundedIcon />}</ListItemDecorator>
Has Attachments <AttachFileRoundedIcon />
</MenuItem>
<MenuItem onClick={toggleFilterHasBeamOpen}>
<ListItemDecorator>{filterHasBeamOpen && <CheckRoundedIcon />}</ListItemDecorator>
Beam Open <ChatBeamIcon />
</MenuItem>
<ListDivider />
<ListItem>
@@ -288,8 +294,8 @@ function ChatDrawer(props: {
)}
</Dropdown>
), [
filterHasDocFragments, filterHasImageAssets, filterHasStars, isSearching, navGrouping, searchSorting, searchDepth, filterIsArchived, showPersonaIcons, showRelativeSize,
toggleFilterHasDocFragments, toggleFilterHasImageAssets, toggleFilterHasStars, toggleFilterIsArchived, toggleShowPersonaIcons, toggleShowRelativeSize,
filterHasBeamOpen, filterHasDocFragments, filterHasImageAssets, filterHasStars, isSearching, navGrouping, searchSorting, searchDepth, filterIsArchived, showPersonaIcons, showRelativeSize,
toggleFilterHasBeamOpen, toggleFilterHasDocFragments, toggleFilterHasImageAssets, toggleFilterHasStars, toggleFilterIsArchived, toggleShowPersonaIcons, toggleShowRelativeSize,
]);
const displayNavItems = React.useMemo(() => {
@@ -304,6 +310,18 @@ function ChatDrawer(props: {
return activeItem ? [...sliced, activeItem] : sliced;
}, [renderNavItems, renderLimit, props.activeConversationId]);
// when filters/search transition from active to inactive, the active chat may end up
// submerged below the fold of a much longer list - scroll it back into view
const chatsListRef = React.useRef<HTMLDivElement>(null);
const isFiltering = isSearching || filterHasBeamOpen || filterHasDocFragments || filterHasImageAssets || filterHasStars || filterIsArchived;
React.useLayoutEffect(() => {
if (isFiltering) return;
const activeEl = chatsListRef.current?.querySelector('[aria-current="true"]') as HTMLElement | null;
activeEl?.scrollIntoView({ block: 'nearest' });
}, [isFiltering]);
return <>
{/* Drawer Header */}
@@ -390,7 +408,7 @@ function ChatDrawer(props: {
</Box>
{/* Chat Titles List (shrink as half the rate as the Folders List) */}
<Box sx={{ flexGrow: 1, flexShrink: 1, flexBasis: '20rem', overflowY: 'auto', ...themeScalingMap[contentScaling].chatDrawerItemSx }}>
<Box key='chatlist' ref={chatsListRef} sx={{ flexGrow: 1, flexShrink: 1, flexBasis: '20rem', overflowY: 'auto', ...themeScalingMap[contentScaling].chatDrawerItemSx }}>
{displayNavItems.map((item, idx) => item.type === 'nav-item-chat-data' ? (
<ChatDrawerItemMemo
key={'nav-chat-' + item.conversationId}
@@ -422,7 +440,7 @@ function ChatDrawer(props: {
{filterHasStars && <StarOutlineRoundedIcon sx={{ color: 'primary.softColor', fontSize: 'xl', mb: -0.5, mr: 1 }} />}
{item.message}
</Typography>
{(filterHasStars || filterHasImageAssets || filterHasDocFragments || filterIsArchived) && (
{(filterHasBeamOpen || filterHasStars || filterHasImageAssets || filterHasDocFragments || filterIsArchived) && (
<Tooltip title='Clear Filters'>
<IconButton size='sm' color='primary' onClick={clearFilters}>
<ClearIcon />
@@ -308,6 +308,7 @@ function ChatDrawerItem(props: {
// Active or Also Open
<Sheet
aria-current={isActive ? 'true' : undefined}
variant={isActive ? 'solid' : 'outlined'}
invertedColors={isActive}
onClick={!isActive ? handleConversationActivate : undefined}
@@ -86,6 +86,7 @@ export function useChatDrawerRenderItems(
filterByQuery: string,
activeFolder: DFolder | null,
allFolders: DFolder[],
filterHasBeamOpen: boolean,
filterHasStars: boolean,
filterHasImageAssets: boolean,
filterHasDocFragments: boolean,
@@ -146,7 +147,8 @@ export function useChatDrawerRenderItems(
}
// filter for required attributes
if ((filterHasStars && !hasStars) || (filterHasImageAssets && !hasImages) || (filterHasDocFragments && !hasDocs))
const hasBeamOpen = openBeamConversationIds[_c.id];
if ((filterHasBeamOpen && !hasBeamOpen) || (filterHasStars && !hasStars) || (filterHasImageAssets && !hasImages) || (filterHasDocFragments && !hasDocs))
return null;
// rich properties
@@ -186,7 +188,7 @@ export function useChatDrawerRenderItems(
? allFolders.find(folder => folder.conversationIds.includes(_c.id)) ?? null
: null,
updatedAt: _c.updated || _c.created || 0,
hasBeamOpen: !!openBeamConversationIds?.[_c.id],
hasBeamOpen,
messageCount,
beingGenerated: !!_c._abortController, // FIXME: when the AbortController is moved at the message level, derive the state in the conv
systemPurposeId: _c.systemPurposeId,
@@ -287,19 +289,21 @@ export function useChatDrawerRenderItems(
renderNavItems.push({
type: 'nav-item-info-message',
message: (filterHasStars && (filterHasImageAssets || filterHasDocFragments)) ? 'No results'
: filterHasDocFragments ? 'No attachment results'
: filterHasImageAssets ? 'No image results'
: filterHasStars ? 'No starred results'
: filterIsArchived ? 'No archived conversations'
: isSearching ? 'Text not found'
: 'No conversations in folder',
: filterHasBeamOpen ? 'No beam conversations'
: filterHasDocFragments ? 'No attachment results'
: filterHasImageAssets ? 'No image results'
: filterHasStars ? 'No starred results'
: filterIsArchived ? 'No archived conversations'
: isSearching ? 'Text not found'
: 'No conversations in folder',
});
} else {
// filtering reminder (will be rendered with a clear button too)
if (filterHasStars || filterHasImageAssets || filterHasDocFragments || filterIsArchived) {
if (filterHasBeamOpen || filterHasStars || filterHasImageAssets || filterHasDocFragments || filterIsArchived) {
renderNavItems.unshift({
type: 'nav-item-info-message',
message: `${filterIsArchived ? 'Showing' : 'Filtering by'} ${[
filterHasBeamOpen && 'beam',
filterHasStars && 'stars',
filterHasImageAssets && 'images',
filterHasDocFragments && 'attachments',
@@ -2,9 +2,13 @@ import * as React from 'react';
import TimeAgo from 'react-timeago';
import { Box, Button, ButtonGroup, Tooltip, Typography } from '@mui/joy';
import DownloadIcon from '@mui/icons-material/Download';
import LinkOffRoundedIcon from '@mui/icons-material/LinkOffRounded';
import PlayArrowRoundedIcon from '@mui/icons-material/PlayArrowRounded';
import StopRoundedIcon from '@mui/icons-material/StopRounded';
import type { AixReattachMode } from '~/modules/aix/client/aix.client';
import type { DMessageGenerator } from '~/common/stores/chat/chat.message';
@@ -12,53 +16,65 @@ const ARM_TIMEOUT_MS = 4000;
/**
* FIXME: COMPLETE THIS
* Resume controls for an upstream-stored run.
* - Resume: SSE replay (live deltas) - canonical path. Always offered when onResume exists.
* - Recover: one-shot JSON GET - shown only for vendors that benefit from it (Gemini Interactions).
* - Detach: abort the local fetch but leave the upstream run alive. Visible only when a resume
* is in-flight (`inFlightMode != null`). Resume/Recover stay available afterwards.
* - Stop: terminate the upstream run + delete the resource.
*
* IMPORTANT: in-flight state is owned by the parent (`inFlightMode` + `onDetach`) so it survives
* remounts that happen while a long-running stream is active (e.g. Deep Research).
*/
export function BlockOpUpstreamResume(props: {
upstreamHandle: Exclude<DMessageGenerator['upstreamHandle'], undefined>,
onResume?: () => void | Promise<void>;
onCancel?: () => void | Promise<void>;
pending?: boolean; // true iff a local in-flight op (initial POST or resume); drives the state machine + hides the expiry footer
inFlightMode?: AixReattachMode; // set by the parent while a resume is in flight; drives the loading/Detach UI
onResume?: (mode: AixReattachMode) => void | Promise<void>;
onDetach?: () => void;
onDelete?: () => void | Promise<void>;
}) {
// state
const [isResuming, setIsResuming] = React.useState(false);
const [isCancelling, setIsCancelling] = React.useState(false);
// local state - only for short-lived ops the parent doesn't own
const [isDeleting, setIsDeleting] = React.useState(false);
const [deleteArmed, setDeleteArmed] = React.useState(false);
const [error, setError] = React.useState<string | null>(null);
// expiration: boolean is evaluated at render (may lag briefly if nothing re-renders past expiry).
// TimeAgo handles its own tick for the label; the button's disabled state is the only consumer of this flag.
const { expiresAt, runId = '' } = props.upstreamHandle;
const isExpired = expiresAt != null && Date.now() > expiresAt;
const { expiresAt /*, runId = ''*/ } = props.upstreamHandle;
// State machine - mutually exclusive triplet (idle | initial-POST | resume | recover):
// - Idle : !pending - run not active locally (incl. post-reload, since
// chats.converters.ts clears pendingIncomplete on hydrate).
// - Initial POST : pending && !inFlightMode - first generation streaming.
// - Resume replay : pending && mode='replay' - we own this resume cycle.
// - Recover snap : pending && mode='snapshot' - we own this snapshot fetch.
//
// Visibility matrix (see BlockOpUpstreamResume props doc):
// Resume Recover Detach Cancel
// Idle ✅ ✅¹ — ✅
// Initial POST — — — ✅
// Resume in flight — — ✅ ✅
// Recover in flight — ✅² — —
// ¹ only for Gemini Interactions ² with loading spinner
const isReplaying = props.inFlightMode === 'replay';
const isSnapshotting = props.inFlightMode === 'snapshot';
const isIdle = !props.pending;
const canRecoverVendor = props.upstreamHandle.uht === 'vnd.gem.interactions';
const showResume = isIdle && !!props.onResume;
const showRecover = (isIdle || isSnapshotting) && !!props.onResume && canRecoverVendor;
const showDetach = isReplaying && !!props.onDetach;
const showCancel = !isSnapshotting && !!props.onDelete;
// handlers
const handleResume = React.useCallback(async () => {
const handleResume = React.useCallback((mode: AixReattachMode) => {
if (!props.onResume) return;
setError(null);
setIsResuming(true);
try {
await props.onResume();
} catch (err: any) {
setError(err?.message || 'Resume failed');
} finally {
setIsResuming(false);
}
}, [props]);
const handleCancel = React.useCallback(async () => {
if (!props.onCancel) return;
setError(null);
setIsCancelling(true);
try {
await props.onCancel();
} catch (err: any) {
setError(err?.message || 'Cancel failed');
} finally {
setIsCancelling(false);
}
// fire-and-forget: parent owns the promise lifecycle and the abort controller.
// If it rejects, the parent surfaces the error via its own UI; we stay silent.
Promise.resolve(props.onResume(mode)).catch(() => { /* parent handles */ });
}, [props]);
// Two-click arm: first click arms (visible red "Confirm?"), second click (within ARM_TIMEOUT_MS) executes.
@@ -87,7 +103,6 @@ export function BlockOpUpstreamResume(props: {
return () => clearTimeout(t);
}, [deleteArmed]);
return (
<Box
sx={{
@@ -99,41 +114,53 @@ export function BlockOpUpstreamResume(props: {
}}
>
<ButtonGroup>
{props.onResume && (
<Tooltip title='Resume generation from last checkpoint'>
{showResume && (
<Tooltip title='Resume by re-streaming from the upstream run'>
<Button
disabled={isResuming || isCancelling || isDeleting || isExpired}
loading={isResuming}
disabled={isDeleting}
startDecorator={<PlayArrowRoundedIcon color='success' />}
onClick={handleResume}
onClick={() => handleResume('replay')}
>
Resume
</Button>
</Tooltip>
)}
{props.onCancel && (
<Tooltip title='Cancel the response generation'>
{showRecover && (
<Tooltip title='Fetch the result without streaming - recovers stuck or hung runs'>
<Button
disabled={isResuming || isCancelling || isDeleting}
loading={isCancelling}
// startDecorator={<CancelIcon />}
onClick={handleCancel}
disabled={isDeleting}
loading={isSnapshotting}
loadingPosition='start'
startDecorator={<DownloadIcon />}
onClick={() => handleResume('snapshot')}
>
Cancel
Recover
</Button>
</Tooltip>
)}
{props.onDelete && (
<Tooltip title={deleteArmed ? 'Click again to confirm - cancels the run upstream (no resume after)' : 'Cancel the upstream run'}>
{showDetach && (
<Tooltip title='Close this connection only - the upstream run keeps going. Click Resume or Recover later to fetch results.'>
<Button
disabled={isDeleting}
startDecorator={<LinkOffRoundedIcon />}
onClick={props.onDetach}
>
Detach
</Button>
</Tooltip>
)}
{showCancel && (
<Tooltip title={deleteArmed ? 'Click again to confirm - cancels the upstream run and clears the handle' : 'Cancel the upstream run'}>
<Button
loading={isDeleting}
color={deleteArmed ? 'danger' : 'neutral'}
variant={deleteArmed ? 'solid' : 'outlined'}
startDecorator={<StopRoundedIcon />}
onClick={handleDelete}
disabled={isResuming || isCancelling || isDeleting}
disabled={isDeleting}
>
{deleteArmed ? 'Confirm?' : 'Cancel'}
</Button>
@@ -147,7 +174,7 @@ export function BlockOpUpstreamResume(props: {
</Typography>
)}
{!!expiresAt && <Typography level='body-xs' sx={{ fontSize: '0.65rem', opacity: 0.6 }}>
{!props.pending && !!expiresAt && <Typography level='body-xs' sx={{ fontSize: '0.65rem', opacity: 0.6 }}>
{/*Run ID: {runId.slice(0, 12)}...*/}
{/*{!!expiresAt && <> · Expires <TimeAgo date={expiresAt} /></>}*/}
Expires <TimeAgo date={expiresAt} />
@@ -29,6 +29,7 @@ import VerticalAlignBottomIcon from '@mui/icons-material/VerticalAlignBottom';
import VisibilityIcon from '@mui/icons-material/Visibility';
import VisibilityOffIcon from '@mui/icons-material/VisibilityOff';
import type { AixReattachMode } from '~/modules/aix/client/aix.client';
import { ModelVendorAnthropic } from '~/modules/llms/vendors/anthropic/anthropic.vendor';
import { AnthropicIcon } from '~/common/components/icons/vendors/AnthropicIcon';
@@ -161,8 +162,10 @@ export function ChatMessage(props: {
onMessageBeam?: (messageId: string) => Promise<void>,
onMessageBranch?: (messageId: string) => void,
onMessageContinue?: (messageId: string, continueText: null | string) => void,
onMessageUpstreamResume?: (generator: DMessageGenerator, messageId: string) => Promise<void>,
onMessageUpstreamResume?: (generator: DMessageGenerator, messageId: string, mode: AixReattachMode) => Promise<void>,
onMessageUpstreamDetach?: (messageId: string) => void,
onMessageUpstreamDelete?: (generator: DMessageGenerator, messageId: string) => Promise<void>,
upstreamResumeMode?: AixReattachMode, // set by parent while a resume is in flight on this message
onMessageDelete?: (messageId: string) => void,
onMessageFragmentAppend?: (messageId: DMessageId, fragment: DMessageFragment) => void,
onMessageFragmentDelete?: (messageId: DMessageId, fragmentId: DMessageFragmentId) => void,
@@ -247,7 +250,7 @@ export function ChatMessage(props: {
// const wordsDiff = useWordsDifference(textSubject, props.diffPreviousText, showDiff);
- const { onMessageAssistantFrom, onMessageDelete, onMessageFragmentAppend, onMessageFragmentDelete, onMessageFragmentReplace, onMessageContinue, onMessageUpstreamResume, onMessageUpstreamDelete } = props;
+ const { onMessageAssistantFrom, onMessageDelete, onMessageFragmentAppend, onMessageFragmentDelete, onMessageFragmentReplace, onMessageContinue, onMessageUpstreamResume, onMessageUpstreamDetach, onMessageUpstreamDelete } = props;
const handleFragmentNew = React.useCallback(() => {
onMessageFragmentAppend?.(messageId, createTextContentFragment(''));
@@ -265,11 +268,15 @@ export function ChatMessage(props: {
onMessageContinue?.(messageId, continueText);
}, [messageId, onMessageContinue]);
- const handleUpstreamResume = React.useCallback(() => {
+ const handleUpstreamResume = React.useCallback((mode: AixReattachMode) => {
    if (!messageGenerator) return;
-   return onMessageUpstreamResume?.(messageGenerator, messageId);
+   return onMessageUpstreamResume?.(messageGenerator, messageId, mode);
}, [messageGenerator, messageId, onMessageUpstreamResume]);
const handleUpstreamDetach = React.useCallback(() => {
onMessageUpstreamDetach?.(messageId);
}, [messageId, onMessageUpstreamDetach]);
const handleUpstreamDelete = React.useCallback(() => {
if (!messageGenerator) return;
return onMessageUpstreamDelete?.(messageGenerator, messageId);
@@ -898,11 +905,14 @@ export function ChatMessage(props: {
/>
)}
- {/* Upstream Resume - shows whenever there's a stored handle (incl. post-reload, where no error fragment is present) */}
- {!messagePendingIncomplete && props.isBottom && fromAssistant && messageGenerator?.upstreamHandle && (!!onMessageUpstreamResume || !!onMessageUpstreamDelete) && (
+ {/* Upstream Resume - shows whenever there's a stored handle (incl. post-reload, and while streaming so Stop can cancel the upstream run) */}
+ {props.isBottom && fromAssistant && messageGenerator?.upstreamHandle && (!!onMessageUpstreamResume || !!onMessageUpstreamDelete) && (
<BlockOpUpstreamResume
upstreamHandle={messageGenerator.upstreamHandle}
pending={messagePendingIncomplete}
inFlightMode={props.upstreamResumeMode}
onResume={onMessageUpstreamResume ? handleUpstreamResume : undefined}
onDetach={onMessageUpstreamDetach ? handleUpstreamDetach : undefined}
onDelete={onMessageUpstreamDelete ? handleUpstreamDelete : undefined}
/>
)}
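Parent-side wiring for the mode-aware resume, sketched under assumptions (resumeUpstreamRun is a made-up stand-in for the app-level resume flow):

// Hypothetical parent handler - tracks which reattach mode is in flight so the
// child can render the matching spinner via `upstreamResumeMode`.
const [resumeMode, setResumeMode] = React.useState<AixReattachMode | undefined>(undefined);
const handleMessageUpstreamResume = React.useCallback(async (generator: DMessageGenerator, messageId: string, mode: AixReattachMode) => {
  setResumeMode(mode);
  try {
    await resumeUpstreamRun(generator, messageId, mode); // assumed app-level function
  } finally {
    setResumeMode(undefined);
  }
}, []);
// ... <ChatMessage onMessageUpstreamResume={handleMessageUpstreamResume} upstreamResumeMode={resumeMode} ... />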
@@ -149,6 +149,10 @@ export function ContentFragments(props: {
// return null;
case 'ma':
// skip rendering empty reasoning fragments (created as vehicles for vendor state / reasoning continuity)
const isActivelyStreaming = isLastFragment && !!props.messagePendingIncomplete;
if (!part.aText && !part.redactedData?.length && !isActivelyStreaming)
return null;
const BlockPartModelAuxMemoOrNot = optimizeMemoBeforeLastBlock ? BlockPartModelAuxMemo : BlockPartModelAux;
return (
<BlockPartModelAuxMemoOrNot
@@ -166,9 +166,9 @@ export function AppChatSettingsAI() {
tooltip={<>
When Claude uses tools like code execution, it may produce text and image files stored in Anthropic&apos;s File API. This setting controls whether Big-AGI should automatically download and embed them in the chat.
<ul>
- <li><b>Off</b>: keep as references (default).</li>
- <li><b>Inline</b>: download and embed text/images.</li>
- <li><b>Inline + Free</b>: embed, then delete from Anthropic to free storage.</li>
+ <li><b>Show</b>: keep as references.</li>
+ <li><b>Embed</b>: download and embed text/images (default).</li>
+ <li><b>Embed + Free</b>: embed, then delete from Anthropic to free storage.</li>
</ul>
Only affects Anthropic models.
</>}
+1 -1
@@ -23,7 +23,7 @@ export const Release = {
// this is here to trigger revalidation of data, e.g. models refresh
Monotonics: {
- Aix: 67,
+ Aix: 70,
NewsVersion: 204,
},
@@ -5,6 +5,7 @@ import { bareBonesPromptMixer } from '~/modules/persona/pmix/pmix';
import { SystemPurposes } from '../../data';
import { BeamStore, createBeamVanillaStore } from '~/modules/beam/store-beam_vanilla';
import { autoConversationTitle } from '~/modules/aifn/autotitle/autoTitle';
import { useModuleBeamStore } from '~/modules/beam/store-module-beam';
import type { DConversationId } from '~/common/stores/chat/chat.conversation';
@@ -275,6 +276,10 @@ export class ConversationHandler {
// close beam
terminateKeepingSettings();
// auto-title the conversation if enabled (parity with chat-persona flow — fixes #1078)
if (getChatAutoAI().autoTitleChat)
void autoConversationTitle(this.conversationId, false);
};
beamOpen(viewHistory, getChatLLMId(), !!destReplaceMessageId, onBeamSuccess);
+6 -2
@@ -19,6 +19,7 @@ import { StarIconUnstyled, StarredNoXL2 } from '~/common/components/StarIcons';
import { TooltipOutlined } from '~/common/components/TooltipOutlined';
import { findModelsServiceOrNull, getChatLLMId, llmsStoreActions } from '~/common/stores/llms/store-llms';
import { optimaActions, optimaOpenModels } from '~/common/layout/optima/useOptima';
import { sortLLMsByServiceLabel } from '~/common/stores/llms/components/llms.dropdown.utils';
import { useToggleableStringSet } from '~/common/util/hooks/useToggleableStringSet';
import { useUIPreferencesStore } from '~/common/stores/store-ui';
import { useVisibleLLMs } from '~/common/stores/llms/llms.hooks';
@@ -202,12 +203,15 @@ export function useLLMSelect(
const optimizeToSingleVisibleId = (!controlledOpen && _filteredLLMs.length > LLM_SELECT_REDUCE_OPTIONS) ? llmId : null; // id to keep visible when optimizing
const optionsArray = React.useMemo(() => {
+ // sort LLMs alphabetically by service label so vendor groups appear in a stable order (groups remain contiguous because sort is stable on equal keys)
+ const sortedLLMs = sortLLMsByServiceLabel(_filteredLLMs);
  // check if we have multiple services (to show collapsible headers)
- const hasMultipleServices = _filteredLLMs.some((llm, i, arr) => i > 0 && llm.sId !== arr[i - 1].sId);
+ const hasMultipleServices = sortedLLMs.some((llm, i, arr) => i > 0 && llm.sId !== arr[i - 1].sId);
  // create the option items
  let prevServiceId: DModelsServiceId | null = null;
- return _filteredLLMs.reduce((acc, llm, _index) => {
+ return sortedLLMs.reduce((acc, llm, _index) => {
if (optimizeToSingleVisibleId && llm.id !== optimizeToSingleVisibleId)
return acc;
+7 -1
@@ -103,7 +103,13 @@ export type DMessageFragmentVendorState = Record<string, unknown> & {
thoughtSignature?: string; // Gemini 3+ - echoed back to maintain reasoning context
};
openai?: {
- // Responses API reasoning item continuity handle
+ // Responses API reasoning item continuity handle.
// IMPORTANT: OpenAI-private encryption + server-side item id; never round-trip to xAI.
reasoningItem?: { id?: string; encryptedContent?: string; };
};
xai?: {
// xAI Responses API reasoning item continuity handle.
// IMPORTANT: xAI-private encryption + server-side item id; never round-trip to OpenAI.
reasoningItem?: { id?: string; encryptedContent?: string; };
};
// Future: anthropic?: { ... }
@@ -42,17 +42,44 @@ export interface LLMServiceGroup {
}
/**
- * Group LLMs by service, resolving service display labels.
+ * Resolve display label for each unique service in the input.
+ * Fallback chain: service.label -> vendor.name -> service.id.
*/
function _resolveServiceLabels(llms: ReadonlyArray<DLLM>): Map<DModelsServiceId, string> {
const labelById = new Map<DModelsServiceId, string>();
for (const llm of llms) {
if (labelById.has(llm.sId)) continue;
const vendor = findModelVendor(llm.vId);
labelById.set(llm.sId, findModelsServiceOrNull(llm.sId)?.label || vendor?.name || llm.sId);
}
return labelById;
}
/**
* Stably sort LLMs by their service label (alphabetical, locale-aware).
* Preserves intra-service order (e.g. starred-first), since JS sort is stable.
*/
export function sortLLMsByServiceLabel<T extends DLLM>(llms: ReadonlyArray<T>): T[] {
if (llms.length < 2) return [...llms];
const labelById = _resolveServiceLabels(llms);
return [...llms].sort((a, b) => labelById.get(a.sId)!.localeCompare(labelById.get(b.sId)!));
}
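A quick usage sketch of the two helpers (the model entries are made up; with no registered services the labels fall back to the service ids):

// Hypothetical data - two services; the stable sort preserves intra-service order.
const llms = [
  { id: 'm1', sId: 'srv-zeta', vId: 'openai' },
  { id: 'm2', sId: 'srv-alpha', vId: 'anthropic' },
  { id: 'm3', sId: 'srv-zeta', vId: 'openai' },
] as unknown as DLLM[];
const sorted = sortLLMsByServiceLabel(llms);  // -> m2 (alpha), then m1, m3 (zeta); m1 stays ahead of m3
const groups = groupLLMsByService(llms);      // -> [{ serviceId: 'srv-alpha', models: [m2] }, { serviceId: 'srv-zeta', models: [m1, m3] }]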
/**
* Group LLMs by service, alphabetically sorted by service label.
* Preserves intra-service order.
*/
export function groupLLMsByService(llms: ReadonlyArray<DLLM>): LLMServiceGroup[] {
+ const labelById = _resolveServiceLabels(llms);
+ if (llms.length >= 2)
+   llms = [...llms].sort((a, b) => labelById.get(a.sId)!.localeCompare(labelById.get(b.sId)!));
  const groups: LLMServiceGroup[] = [];
  let currentGroup: LLMServiceGroup | null = null;
  for (const llm of llms) {
    if (!currentGroup || currentGroup.serviceId !== llm.sId) {
-     const vendor = findModelVendor(llm.vId);
-     const serviceLabel = findModelsServiceOrNull(llm.sId)?.label || vendor?.name || llm.sId;
-     currentGroup = { serviceId: llm.sId, serviceLabel, models: [] };
+     currentGroup = { serviceId: llm.sId, serviceLabel: labelById.get(llm.sId)!, models: [] };
groups.push(currentGroup);
}
currentGroup.models.push(llm);
+11 -1
@@ -175,7 +175,8 @@ export const DModelParameterRegistry = {
label: 'Thinking',
type: 'enum',
description: 'Enable or disable extended thinking mode.',
- values: ['none', 'high'],
+ values: ['none', 'high', 'max'],
// 'max' is for now DeepSeek V4-specific (reasoning_effort=max); other vendors restrict via enumValues
// undefined means vendor default (usually 'high', i.e. thinking enabled)
}),
@@ -348,6 +349,15 @@ export const DModelParameterRegistry = {
// when undefined, the model chooses automatically
},
// Gemini Interactions API agent_config - per-agent knobs (Deep Research only today)
llmVndGeminiAgentViz: _enumDef({
label: 'Visualizations',
type: 'enum',
description: 'Charts and images in Deep Research reports. Disable for text-only output (helpful when merging multiple reports).',
values: ['auto', 'off'],
// undefined means upstream default ('auto'); we only forward when explicitly 'off'
}),
// NOTE: we don't have this as a parameter, as for now we use it in tandem with llmVndGeminiGoogleSearch
// llmVndGeminiUrlContext: {
// label: 'URL Context',
+15
@@ -25,6 +25,7 @@ export interface DLLM {
label: string;
created: number | 0;
updated?: number | 0;
pubDate?: string; // official release date in 'YYYYMMDD'
description: string;
hidden: boolean;
@@ -137,6 +138,20 @@ export function getLLMMaxOutputTokens(llm: DLLM | null): DLLMMaxOutputTokens | u
return llm.userMaxOutputTokens ?? llm.maxOutputTokens;
}
/**
* Parse the model's editorial `pubDate` ('YYYYMMDD') into a Date, or null if missing/malformed.
* Date is constructed at local midnight - pubDate is day-precision, no time component.
*/
export function getLLMPubDate(llm: DLLM | null | undefined): Date | null {
const p = llm?.pubDate;
if (!p || !/^\d{8}$/.test(p)) return null;
const y = parseInt(p.slice(0, 4), 10);
const m = parseInt(p.slice(4, 6), 10) - 1; // JS Date months are 0-indexed
const d = parseInt(p.slice(6, 8), 10);
const date = new Date(y, m, d);
return Number.isFinite(date.getTime()) ? date : null;
}
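A usage sketch - the day precision lends itself to simple age checks, e.g. flagging recently published models (the 30-day threshold is illustrative):

// Hypothetical badge logic: mark models published within the last 30 days.
const pub = getLLMPubDate(llm);
const isNewModel = !!pub && (Date.now() - pub.getTime()) < 30 * 24 * 60 * 60 * 1000;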
/// Interfaces ///
// do not change anything below! those will be persisted in data
+1 -1
@@ -49,7 +49,7 @@ export async function autoConversationTitle(conversationId: string, forceReplace
autoTitleLlmId,
'You are an AI conversation titles assistant who specializes in creating expressive yet few-words chat titles.',
`Analyze the given short conversation (every line is truncated) and extract a concise chat title that summarizes the conversation in as little as a couple of words.
- Only respond with the lowercase short title and nothing else.
+ Only respond with the short title and nothing else.
\`\`\`
${historyLines.join('\n')}
+23 -8
@@ -834,11 +834,11 @@ export class ContentReassembler {
}
- private onSetVendorState(vs: Extract<AixWire_Particles.PartParticleOp, { p: 'svs' }>): void {
+ private onSetVendorState({ state, vendor }: Extract<AixWire_Particles.PartParticleOp, { p: 'svs' }>): void {
    // Promote Anthropic container state -> Generator (message-scoped, for cross-turn reuse)
-   if (vs.vendor === 'anthropic' && 'container' in vs.state) {
-     const { id, expiresAt } = vs.state.container;
+   if (vendor === 'anthropic' && 'container' in state) {
+     const { id, expiresAt } = state.container;
if (id && expiresAt)
this.S.generator = {
...this.S.generator,
@@ -855,11 +855,12 @@ export class ContentReassembler {
return;
}
- // Guard: OpenAI reasoningItem state must land on the ma (reasoning) fragment that produced it.
+ // Guard: reasoningItem state must land on the ma (reasoning) fragment that produced it.
  // If no summary was appended during the reasoning item (summary disabled / skipped), the last
  // fragment will belong to an unrelated preceding item - dropping the handle is safer than contaminating.
- if (vs.vendor === 'openai' && 'reasoningItem' in vs.state && lastFragment.part.pt !== 'ma') {
-   console.warn('[ContentReassembler] OpenAI reasoningItem state without preceding ma fragment - dropping continuity handle', { lastFragmentPt: lastFragment.part.pt });
+ // Applies to both OpenAI and xAI namespaces; each is opaque/private to its producing vendor.
+ if ((vendor === 'openai' || vendor === 'xai') && 'reasoningItem' in state && lastFragment.part.pt !== 'ma') {
+   console.warn(`[ContentReassembler] ${vendor} reasoningItem state without preceding ma fragment - dropping continuity handle`, { lastFragmentPt: lastFragment.part.pt });
return;
}
@@ -868,7 +869,7 @@ export class ContentReassembler {
...lastFragment,
vendorState: {
...lastFragment.vendorState,
- [vs.vendor]: vs.state,
+ [vendor]: state,
},
});
}
@@ -905,9 +906,18 @@ export class ContentReassembler {
/**
* Stores raw termination data from the wire - classification deferred to finalizeReassembly()
*/
- private onCGEnd({ terminationReason, tokenStopReason }: Extract<AixWire_Particles.ChatGenerateOp, { cg: 'end' }>): void {
+ private onCGEnd({ terminationReason, tokenStopReason, tokenStopError }: Extract<AixWire_Particles.ChatGenerateOp, { cg: 'end' }>): void {
// Diagnostic: detect late 'end' particles overriding a prior termination (parser bug, replayed wire, or upstream advisory after a clean end).
// Behavior unchanged - we still apply the override - but the warning makes the override visible client-side, mirroring the server-side
// 'setDialectEnded ... (overriding)' warning in ChatGenerateTransmitter and the existing setClientAborted/setClientExcepted warnings here.
if (this.S.terminationReason)
console.warn(`[DEV] [ContentReassembler] onCGEnd: overriding prior termination '${this.S.terminationReason}' with '${terminationReason}' (wire stop: ${this.S.dialectStopReason ?? 'none'} -> ${tokenStopReason ?? 'none'})`);
this.S.terminationReason = terminationReason;
this.S.dialectStopReason = tokenStopReason;
// Vendor-composed stop error, surfaced as a complementary error fragment alongside the generic classification message
if (tokenStopError)
this._appendErrorFragment(tokenStopError);
}
/**
@@ -989,6 +999,11 @@ export class ContentReassembler {
}
private onCGIssue({ issueId: _issueId /* Redundant as we add an Error Fragment already */, issueText, issueHint }: Extract<AixWire_Particles.ChatGenerateOp, { cg: 'issue' }> & { issueHint?: DMessageErrorPart['hint'] }): void {
// Diagnostic: detect issue particles arriving after a clean termination (e.g. OpenAI rate-limit advisory after response.completed).
// Behavior unchanged - the issue is still appended - but the warning surfaces that we are mutating a finished message.
if (this.S.terminationReason && this.S.terminationReason === 'done-dialect')
console.warn(`[DEV] [ContentReassembler] onCGIssue: appending issue after clean '${this.S.terminationReason}' (wire stop: ${this.S.dialectStopReason ?? 'none'}): ${issueText}`);
// NOTE: not sure I like the flow at all here
// there seem to be some bad conditions when issues are raised while the active part is not text
if (MERGE_ISSUES_INTO_TEXT_PART_IF_OPEN) {
@@ -409,11 +409,15 @@ export async function aixCGR_ChatSequence_FromDMessagesOrThrow(
break;
case 'ma':
- // Preserve reasoning continuity across turns. Two channels, any one is sufficient:
+ // Preserve reasoning continuity across turns. Three channels, any one is sufficient:
  // - Anthropic: part.textSignature / part.redactedData (bespoke fields, see Anthropic extended thinking docs)
- // - OpenAI/Gemini: _vnd sidecar (reasoningItem.* / thoughtSignature, generic vendor-state mechanism)
+ // - OpenAI Responses / Gemini: _vnd sidecar (reasoningItem.* / thoughtSignature, opaque continuity handle)
+ // - DeepSeek V4 (OpenAI chat-completions): plain reasoning text in aText is the payload itself
  const oaiReasoning = _vnd?.openai?.reasoningItem;
- const hasReasoningHandle = aPart.textSignature || aPart.redactedData?.length || oaiReasoning?.encryptedContent || oaiReasoning?.id;
+ const hasReasoningHandle =
+   (aPart.textSignature || aPart.redactedData?.length)
+   || (oaiReasoning?.encryptedContent || oaiReasoning?.id)
+   || (aPart.aText && aPart.aType === 'reasoning'); // DeepSeek V4 reasoning in plain text - NOTE: will send LOTS of 'ma' parts (e.g. to Gemini, which doesn't even need them)
if (hasReasoningHandle) {
const aModelAuxPart = aPart as AixParts_ModelAuxPart; // NOTE: this is a forced cast from readonly string[] to string[], but not a big deal here
modelMessage.parts.push(_vnd ? { ...aModelAuxPart, _vnd } : aModelAuxPart);
@@ -653,7 +657,7 @@ function _clientCreateAixMetaInReferenceToPart(items: DMetaReferenceItem[]): Aix
export async function clientHotFixGenerateRequest_ApplyAll(llmInterfaces: DLLM['interfaces'], aixChatGenerate: AixAPIChatGenerate_Request, modelName: string): Promise<{
- shallDisableStreaming: boolean;
+ hotfixNoStream: boolean;
workaroundsCount: number;
}> {
@@ -676,12 +680,12 @@ export async function clientHotFixGenerateRequest_ApplyAll(llmInterfaces: DLLM['
workaroundsCount += await clientHotFixGenerateRequest_ConvertWebP(aixChatGenerate, 'image/jpeg');
// Disable streaming for select chat models that don't support it (e.g. o1-preview (old) and o1-2024-12-17)
- const shallDisableStreaming = llmInterfaces.includes(LLM_IF_HOTFIX_NoStream);
+ const hotfixNoStream = llmInterfaces.includes(LLM_IF_HOTFIX_NoStream);
  if (workaroundsCount > 0)
    console.warn(`[DEV] Working around '${modelName}' model limitations: client-side applied ${workaroundsCount} workarounds`);
- return { shallDisableStreaming, workaroundsCount };
+ return { hotfixNoStream, workaroundsCount };
}
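The call sites further down collapse the caller's streaming preference against these constraints; as a pure function the decision reads (illustrative helper, not in the diff):

// Illustrative: the effective wire-streaming decision used at both generate call sites.
function resolveWireStreaming(callerStreaming: boolean, hotfixNoStream: boolean, forceNoStream: boolean | undefined): boolean {
  return !hotfixNoStream && !forceNoStream ? callerStreaming : false;
}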
@@ -37,7 +37,7 @@ export async function* clientSideChatGenerate(
return dispatch;
});
- yield* executeChatGenerateWithContinuation(dispatchCreator, streaming, abortSignal, _d);
+ yield* executeChatGenerateWithContinuation(dispatchCreator, abortSignal, _d);
}
/**
@@ -48,7 +48,7 @@ export async function* clientSideReattachUpstream(
access: AixAPI_Access,
resumeHandle: AixAPI_ResumeHandle,
context: AixAPI_Context_ChatGenerate,
- streaming: true,
+ streaming: boolean,
connectionOptions: AixAPI_ConnectionOptions_ChatGenerate,
abortSignal: AbortSignal,
): AsyncGenerator<AixWire_Particles.ChatGenerateOp, void> {
@@ -56,7 +56,7 @@ export async function* clientSideReattachUpstream(
const _d: AixDebugObject = _createClientDebugConfig(access, connectionOptions, context.name);
const dispatchCreator = () => createChatGenerateResumeDispatch(access, resumeHandle, streaming);
- yield * executeChatGenerateWithContinuation(dispatchCreator, streaming, abortSignal, _d);
+ yield * executeChatGenerateWithContinuation(dispatchCreator, abortSignal, _d);
}
/**
+47 -34
@@ -70,7 +70,7 @@ export function aixCreateModelFromLLMOptions(
llmVndAntEffort, llmVndGemEffort, llmVndOaiEffort, llmVndMiscEffort,
llmVndAnt1MContext, llmVndAntInfSpeed, llmVndAntSkills, llmVndAntThinkingBudget, llmVndAntWebDynamic, llmVndAntWebFetch, llmVndAntWebFetchMaxUses, llmVndAntWebSearch, llmVndAntWebSearchMaxUses,
llmVndBedrockAPI,
- llmVndGeminiAspectRatio, llmVndGeminiImageSize, llmVndGeminiCodeExecution, llmVndGeminiComputerUse, llmVndGeminiGoogleSearch, llmVndGeminiMediaResolution, llmVndGeminiThinkingBudget,
+ llmVndGeminiAgentViz, llmVndGeminiAspectRatio, llmVndGeminiImageSize, llmVndGeminiCodeExecution, llmVndGeminiComputerUse, llmVndGeminiGoogleSearch, llmVndGeminiMediaResolution, llmVndGeminiThinkingBudget,
// llmVndMoonshotWebSearch,
llmVndOaiRestoreMarkdown, llmVndOaiVerbosity, llmVndOaiWebSearchContext, llmVndOaiWebSearchGeolocation, llmVndOaiImageGeneration, llmVndOaiCodeInterpreter,
llmVndOrtWebSearch,
@@ -143,6 +143,7 @@ export function aixCreateModelFromLLMOptions(
// Gemini
...(llmVndGeminiInteractions ? { vndGeminiAPI: 'interactions-agent' } : {}),
...(llmVndGeminiAgentViz === 'off' ? { vndGeminiAgentViz: 'off' } : {}), // Deep Research agent_config.visualization - only forward when explicitly disabled
...(llmVndGeminiAspectRatio ? { vndGeminiAspectRatio: llmVndGeminiAspectRatio } : {}),
...(llmVndGeminiCodeExecution === 'auto' ? { vndGeminiCodeExecution: llmVndGeminiCodeExecution } : {}),
...(llmVndGeminiComputerUse ? { vndGeminiComputerUse: llmVndGeminiComputerUse } : {}),
@@ -342,7 +343,7 @@ export async function aixChatGenerateText_Simple(
aixContextRef: AixAPI_Context_ChatGenerate['ref'],
// optional options
clientOptions?: Partial<AixClientOptions>, // this makes the abortController optional
- // optional callback for streaming
+ // optional callback - if provided, streaming is activated
onTextStreamUpdate?: (text: string, isDone: boolean, generator: DMessageGenerator) => MaybePromise<void>,
): Promise<string> {
@@ -363,14 +364,13 @@ export async function aixChatGenerateText_Simple(
// Aix Context
const aixContext = aixCreateChatGenerateContext(aixContextName, aixContextRef);
- // Aix Streaming - implicit if the callback is provided
- let aixStreaming = !!onTextStreamUpdate;
+ // Caller streaming preference - implicit: stream if a callback is provided
+ const callerStreaming = !!onTextStreamUpdate;
  // Client-side late stage model HotFixes
- const { shallDisableStreaming } = await clientHotFixGenerateRequest_ApplyAll(llm.interfaces, aixChatGenerate, llmParameters.llmRef || llm.id);
- if (shallDisableStreaming || aixModel.forceNoStream)
-   aixStreaming = false;
+ const { hotfixNoStream } = await clientHotFixGenerateRequest_ApplyAll(llm.interfaces, aixChatGenerate, llmParameters.llmRef || llm.id);
+ const wireStreaming = !hotfixNoStream && !aixModel.forceNoStream ? callerStreaming : false;
// Variable to store the final text
@@ -398,11 +398,11 @@ export async function aixChatGenerateText_Simple(
aixModel,
aixChatGenerate,
aixContext,
- aixStreaming,
+ wireStreaming,
state.generator,
abortSignal,
clientOptions?.throttleParallelThreads ?? 0,
- !aixStreaming ? undefined : async (ll: AixChatGenerateContent_LL, _isDone: boolean /* we want to issue this, in case the next action is an exception */) => {
+ !onTextStreamUpdate ? undefined : async (ll: AixChatGenerateContent_LL, _isDone: boolean /* we want to issue this, in case the next action is an exception */) => {
_llToL2Simple(ll, state);
if (onTextStreamUpdate && state.text !== null)
await onTextStreamUpdate(state.text, false, state.generator);
@@ -521,7 +521,7 @@ type _AixChatGenerateContent_DMessageGuts_WithOutcome = AixChatGenerateContent_D
* @param llmId - ID of the Language Model to use
* @param aixChatGenerate - Multi-modal chat generation request specifics, including Tools and high-level metadata
* @param aixContext - Information about how this chat generation is being used
- * @param aixStreaming - Whether to use streaming for generation
+ * @param aixStreaming - Caller's wire-streaming preference. Subject to override by model/hotfix constraints, or dispatch constraints
* @param clientOptions - Client options for the operation
* @param onStreamingUpdate - Optional callback for streaming updates
*
@@ -551,10 +551,9 @@ export async function aixChatGenerateContent_DMessage_orThrow<TServiceSettings e
vndAntTransformInlineFiles: aixAccess.dialect === 'anthropic' ? getVndAntInlineFiles() : undefined,
});
- // Client-side late stage model HotFixes
- const { shallDisableStreaming } = await clientHotFixGenerateRequest_ApplyAll(llm.interfaces, aixChatGenerate, llmParameters.llmRef || llm.id);
- if (shallDisableStreaming || aixModel.forceNoStream)
-   aixStreaming = false;
+ // Client-side late stage model HotFixes - collapse the caller's requested streaming preference into the effective wire-streaming decision after constraints (hotfix gate, model.forceNoStream)
+ const { hotfixNoStream } = await clientHotFixGenerateRequest_ApplyAll(llm.interfaces, aixChatGenerate, llmParameters.llmRef || llm.id);
+ const wireStreaming = !hotfixNoStream && !aixModel.forceNoStream ? aixStreaming : false;
// Legacy Note: awaited OpenAI moderation check was removed (was only on this codepath)
@@ -584,7 +583,7 @@ export async function aixChatGenerateContent_DMessage_orThrow<TServiceSettings e
aixModel,
aixChatGenerate,
aixContext,
- aixStreaming,
+ wireStreaming,
dMessage.generator,
clientOptions.abortSignal,
clientOptions.throttleParallelThreads ?? 0,
@@ -646,22 +645,30 @@ function _finalizeLlmMetricsWithCosts(cgMetricsLg: undefined | DMetricsChatGener
// --- L2 - Content Generation reattachment as DMessage ---
/**
* Reattach mode selects how to reconstruct an in-progress upstream run:
* - 'replay' - canonical: SSE replays the event sequence from the start. Live deltas reach
* the UI as the run progresses (or as past content is replayed).
* - 'snapshot' - one-shot JSON GET returns the resource as-is right now. Used to recover when
* the SSE endpoint is broken upstream but the resource itself is still readable.
*
* Names describe what you get, not how. See `kb/modules/LLM-gemini-interactions.md` for failure modes.
*/
export type AixReattachMode = 'replay' | 'snapshot';
/**
* Reattach facade: wraps `aixChatGenerateContent_DMessage_orThrow` for the reattach-to-upstream flow.
- * - Validates the generator carries an `upstreamHandle`
- * - Stubs the unused chat-generate request, and
- * - Seeds the base function so the LL's reattach branch fires.
  *
+ * On an in-progress upstream run (Gemini Deep Research today, extensible to OAI Responses), the server
+ * just needs the handle to GET-poll; no chat-generate body is needed. This facade:
+ * - validates the generator carries an `upstreamHandle`,
+ * - stubs the chat-generate request (unused on the reattach path - the server uses the handle),
+ * - seeds the base function via `clientOptions.reattachGenerator` so the LL's reattach branch fires.
  *
- * The reassembler starts with empty fragments; since Gemini Interactions snapshots are cumulative,
- * the stream will rebuild the complete content from scratch. Any partial content from the original run is replaced.
+ * The reassembler replaces content on reattach (Gemini Interactions snapshots are cumulative, so this rebuilds from scratch).
*/
export async function aixReattachContent_DMessage_orThrow(
llmId: DLLMId,
reattachGenerator: Readonly<DMessageGenerator>,
aixContext: AixAPI_Context_ChatGenerate,
mode: AixReattachMode,
clientOptions: Pick<AixClientOptions, 'abortSignal' | 'throttleParallelThreads'>,
onStreamingUpdate?: (update: AixChatGenerateContent_DMessageGuts, isDone: boolean) => MaybePromise<void>,
): Promise<_AixChatGenerateContent_DMessageGuts_WithOutcome> {
@@ -676,7 +683,7 @@ export async function aixReattachContent_DMessage_orThrow(
llmId,
stubChatGenerate,
aixContext,
- true, // streaming
+ mode === 'replay', // wire-level: SSE demuxer (replay) vs one-shot JSON body (snapshot)
{ ...clientOptions, reattachGenerator: reattachGenerator as any /* guaranteed by the check */ },
onStreamingUpdate,
);
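A hedged usage sketch of the facade for the snapshot/'Recover' path (the context name and surrounding plumbing are assumptions):

// Illustrative call: recover a stuck run with a one-shot snapshot instead of SSE replay.
const guts = await aixReattachContent_DMessage_orThrow(
  llmId,
  message.generator!, // must carry an upstreamHandle, or the facade throws
  aixCreateChatGenerateContext('chat-reattach', conversationId), // assumed name/ref values
  'snapshot',
  { abortSignal: controller.signal },
);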
@@ -753,7 +760,7 @@ export type AixChatGenerateTerminal_LL = 'completed' | 'aborted' | 'failed';
*
* Contract:
* - empty fragments means no content yet, and no error
- * - aixStreaming hints the source, but can be respected or not
+ * - wireStreaming hints the wire transport (SSE vs single response), but can be respected or not by the dispatch (e.g. SSE-only APIs ignore a `false` value)
* - onReassemblyUpdate is optional, you can ignore the updates and await the final result
* - errors become Error fragments, and they can be dialect-sent, dispatch-excepts, client-read issues or even user aborts
* - DOES NOT THROW, but the final accumulator may contain error fragments
@@ -772,7 +779,7 @@ export type AixChatGenerateTerminal_LL = 'completed' | 'aborted' | 'failed';
* - special parts include 'In Reference To' (a decorator of messages)
* - other special parts include the Anthropic Caching hints, on select message
* @param aixContext specifies the scope of the caller, such as what's the high level objective of this call
- * @param aixStreaming requests the source to provide incremental updates
+ * @param wireStreaming the effective wire-level streaming decision (already collapsed from caller preference + model/hotfix constraints); drives tRPC `streaming` field and downstream dispatch body shape
* @param initialGenerator generator initial value, which will be updated for every new piece of information received
* @param abortSignal allows the caller to stop the operation
* @param throttleParallelThreads allows the caller to limit the number of parallel threads
@@ -790,7 +797,7 @@ async function _aixChatGenerateContent_LL(
aixModel: AixAPI_Model,
aixChatGenerate: AixAPIChatGenerate_Request,
aixContext: AixAPI_Context_ChatGenerate,
- aixStreaming: boolean,
+ wireStreaming: boolean,
// others
initialGenerator: DMessageGenerator,
abortSignal: AbortSignal,
@@ -804,10 +811,13 @@ async function _aixChatGenerateContent_LL(
const inspectorTransport = !inspectorEnabled ? undefined : aixAccess.clientSideFetch ? 'csf' : 'trpc';
const inspectorContext = !inspectorEnabled ? undefined : { contextName: aixContext.name, contextRef: aixContext.ref };
- // [DEV] Inspector - request body override
+ // Inspector - override request body
const requestBodyOverrideJson = inspectorEnabled && aixClientDebuggerGetRBO();
const debugRequestBodyOverride = !requestBodyOverrideJson ? false : JSON.parse(requestBodyOverrideJson);
// Inspector - force disable streaming (note: dispatches may still override this)
if (getAixDebuggerNoStreaming()) wireStreaming = false;
/**
* FIXME: implement client selection of resumability - aixAccess option?
* NOTE: for Gemini Deep Research, it's on by default, so both auto-reattach on network breaks (currently disabled)
@@ -827,8 +837,11 @@ async function _aixChatGenerateContent_LL(
// [CSF] Pre-load client-side executor if needed - type inference works here, no need to type
let clientSideChatGenerate;
let clientSideReattachUpstream;
- if (aixAccess.clientSideFetch)
-   ({ clientSideChatGenerate, clientSideReattachUpstream } = await _loadCsfModuleOrThrow());
+ if (aixAccess.clientSideFetch) {
+   const csf = await _loadCsfModuleOrThrow();
+   clientSideChatGenerate = csf.clientSideChatGenerate;
+   clientSideReattachUpstream = csf.clientSideReattachUpstream;
+ }
// Client-side particle transforms:
@@ -891,7 +904,7 @@ async function _aixChatGenerateContent_LL(
aixModel,
aixChatGenerate,
aixContext,
- getAixDebuggerNoStreaming() ? false : aixStreaming,
+ wireStreaming,
aixConnectionOptions,
abortSignal,
) :
@@ -901,7 +914,7 @@ async function _aixChatGenerateContent_LL(
model: aixModel,
chatGenerate: aixChatGenerate,
context: aixContext,
- streaming: getAixDebuggerNoStreaming() ? false : aixStreaming, // [DEV] disable streaming if set in the UX (testing)
+ streaming: wireStreaming,
connectionOptions: aixConnectionOptions,
}, { signal: abortSignal })
@@ -912,7 +925,7 @@ async function _aixChatGenerateContent_LL(
aixAccess,
accumulator_LL.generator.upstreamHandle,
aixContext,
- true, // streaming - reattach is only validated for streaming for now
+ wireStreaming,
aixConnectionOptions,
abortSignal,
) :
@@ -921,7 +934,7 @@ async function _aixChatGenerateContent_LL(
access: aixAccess,
upstreamHandle: accumulator_LL.generator.upstreamHandle,
context: aixContext,
- streaming: true,
+ streaming: wireStreaming,
connectionOptions: aixConnectionOptions,
}, { signal: abortSignal })
@@ -7,6 +7,7 @@ import { Box, Card, Chip, Divider, Sheet, Typography } from '@mui/joy';
import { RenderCodeMemo } from '~/modules/blocks/code/RenderCode';
import { ExpanderControlledBox } from '~/common/components/ExpanderControlledBox';
import { objectDeepCloneWithStringLimit } from '~/common/util/objectUtils';
import TimelapseIcon from '@mui/icons-material/Timelapse';
import type { AixClientDebugger } from './memstore-aix-client-debugger';
@@ -184,12 +185,10 @@ export function AixDebuggerFrame(props: {
{/* List of particles */}
{frame.particles.map((particle, idx) => {
- // truncated preview of particle content
+ // preview of particle content: preserve structure, trim long string fields
  let jsonPreview = '';
  try {
-   const content = particle.content;
-   jsonPreview = JSON.stringify(content).substring(0, 1024);
-   if (jsonPreview.length >= 1024) jsonPreview += '...';
+   jsonPreview = JSON.stringify(objectDeepCloneWithStringLimit(particle.content, 'aix-debugger-particle', 64));
} catch (e) {
jsonPreview = 'Error parsing content';
}
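For intuition, a sketch of what a string-limiting deep clone can look like - this is an assumption about objectDeepCloneWithStringLimit's behavior, not its actual implementation (the real call also takes a debug-context label):

// Illustrative only: recursively clone, truncating any string longer than `limit`.
function deepCloneWithStringLimit(value: unknown, limit: number): unknown {
  if (typeof value === 'string')
    return value.length > limit ? value.slice(0, limit) + '…' : value;
  if (Array.isArray(value))
    return value.map(v => deepCloneWithStringLimit(v, limit));
  if (value && typeof value === 'object')
    return Object.fromEntries(Object.entries(value as Record<string, unknown>).map(([k, v]) => [k, deepCloneWithStringLimit(v, limit)]));
  return value; // numbers, booleans, null, undefined pass through
}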
+3 -3
@@ -30,7 +30,7 @@ export const aixRouter = createTRPCRouter({
const _d = _createDebugConfig(input.access, input.connectionOptions, input.context.name);
const dispatchCreator = () => createChatGenerateDispatch(input.access, input.model, input.chatGenerate, input.streaming, !!input.connectionOptions?.enableResumability);
- yield* executeChatGenerateWithContinuation(dispatchCreator, input.streaming, ctx.reqSignal, _d);
+ yield* executeChatGenerateWithContinuation(dispatchCreator, ctx.reqSignal, _d);
}),
/**
@@ -42,14 +42,14 @@ export const aixRouter = createTRPCRouter({
access: AixWire_API.Access_schema,
upstreamHandle: AixWire_API.UpstreamHandle_schema, // reattach uses a handle instead of 'model + chatGenerate'
context: AixWire_API.ContextChatGenerate_schema,
- streaming: z.literal(true), // reattach is always streaming
+ streaming: z.boolean(),
connectionOptions: AixWire_API.ConnectionOptionsChatGenerate_schema.pick({ debugDispatchRequest: true }).optional(), // debugDispatchRequest
}))
.mutation(async function* ({ input, ctx }) {
const _d = _createDebugConfig(input.access, input.connectionOptions, input.context.name);
const dispatchCreator = () => createChatGenerateResumeDispatch(input.access, input.upstreamHandle, input.streaming);
- yield* executeChatGenerateWithContinuation(dispatchCreator, input.streaming, ctx.reqSignal, _d);
+ yield* executeChatGenerateWithContinuation(dispatchCreator, ctx.reqSignal, _d);
}),
/**
+12 -1
@@ -104,11 +104,20 @@ export namespace AixWire_Parts {
openai: z.object({
// Responses API reasoning item continuity handle. Sub-object mirrors the shape of the source output item
// and parallels _vnd Anthropic's { container: { id, expiresAt } } pattern.
// IMPORTANT: this blob is OpenAI-server-encrypted; do NOT round-trip to xAI (different keys + private item ids).
reasoningItem: z.object({
id: z.string().optional(), // rs_... - item id
encryptedContent: z.string().optional(), // blob returned when include:['reasoning.encrypted_content']
}).optional(),
}).optional(),
xai: z.object({
// xAI Responses API reasoning item continuity handle. Same WIRE shape as OpenAI's, but the encrypted_content
// is encrypted with xAI's keys and the item id references xAI server state - NOT cross-portable to OpenAI.
reasoningItem: z.object({
id: z.string().optional(),
encryptedContent: z.string().optional(),
}).optional(),
}).optional(),
// NOTE: we do NOT use this mechanism for per-vendor customization/ALT for parts
// anthropic: z.object({
// containerUpload: z.object({
@@ -507,6 +516,7 @@ export namespace AixWire_API {
// Gemini
vndGeminiAPI: z.enum(['interactions-agent']).optional(), // opt-in per-model API dialect; unset = generateContent
vndGeminiAgentViz: z.enum(['auto', 'off']).optional(), // agent_config.visualization; default 'auto' upstream
vndGeminiAspectRatio: z.enum(['1:1', '2:3', '3:2', '3:4', '4:3', '9:16', '16:9', '21:9']).optional(),
vndGeminiCodeExecution: z.enum(['auto']).optional(),
vndGeminiComputerUse: z.enum(['browser']).optional(),
@@ -689,7 +699,7 @@ export namespace AixWire_Particles {
export type ChatControlOp =
// | { cg: 'start' } // not really used for now
- | { cg: 'end', terminationReason: CGEndReason /* we know why we're sending 'end' */, tokenStopReason?: GCTokenStopReason /* we may or may not have gotten a logical token stop reason from the dispatch */ }
+ | { cg: 'end', terminationReason: CGEndReason /* we know why we're sending 'end' */, tokenStopReason?: GCTokenStopReason /* we may or may not have gotten a logical token stop reason from the dispatch */, tokenStopError?: string /* optional vendor-composed human-readable detail paired with tokenStopReason */ }
| { cg: 'issue', issueId: CGIssueId, issueText: string }
| { cg: 'aix-info', ait: 'flow-cont' /* important: establishes a checkpoint */, text: string }
| { cg: 'aix-retry-reset', rScope: 'srv-dispatch' | 'srv-op' | 'cli-ll', rClearStrategy: 'none' | 'since-checkpoint' | 'all', reason: string, attempt: number, maxAttempts: number, delayMs: number, causeHttp?: number, causeConn?: string }
@@ -786,6 +796,7 @@ export namespace AixWire_Particles {
| { vendor: 'anthropic', state: { container: { id: string; expiresAt: string } } } // message-level
| { vendor: 'gemini', state: { thoughtSignature: string } } // fragment-level
| { vendor: 'openai', state: { reasoningItem: { id?: string, encryptedContent?: string } } } // fragment-level (attach to ma reasoning fragment)
| { vendor: 'xai', state: { reasoningItem: { id?: string, encryptedContent?: string } } } // fragment-level - DISTINCT from openai (different encryption keys, different server-side ids)
// | { vendor: string, state: Record<string, unknown> } // disable catch-all because it forces casts in type discriminations
)
;
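The union narrows cleanly on `vendor`; a small consumption sketch (handler bodies elided, the function itself is illustrative):

// Illustrative narrowing - TypeScript infers the exact `state` shape per vendor.
function applyVendorState(op: Extract<AixWire_Particles.PartParticleOp, { p: 'svs' }>) {
  switch (op.vendor) {
    case 'anthropic': return void op.state.container.id;       // { container: { id, expiresAt } }
    case 'gemini':    return void op.state.thoughtSignature;    // { thoughtSignature }
    case 'openai':
    case 'xai':       return void op.state.reasoningItem?.id;   // namespaces stay separate downstream
  }
}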
@@ -56,6 +56,7 @@ export class ChatGenerateTransmitter implements IParticleTransmitter {
// Token stop reason
private tokenStopReason: AixWire_Particles.GCTokenStopReason | undefined = undefined;
private tokenStopError: string | undefined = undefined;
// Metrics
private accMetrics: AixWire_Particles.CGSelectMetrics | undefined = undefined;
@@ -105,6 +106,7 @@ export class ChatGenerateTransmitter implements IParticleTransmitter {
cg: 'end',
terminationReason: this.terminationReason,
tokenStopReason: this.tokenStopReason, // See NOTE above - || (dispatchOrDialectIssue ? 'cg-issue' : 'ok'),
...(this.tokenStopError && { tokenStopError: this.tokenStopError }),
});
// Keep this in a terminated state, so that every subsequent call will yield errors (not implemented)
// this.terminationReason = null;
@@ -201,12 +203,13 @@ export class ChatGenerateTransmitter implements IParticleTransmitter {
this.setDialectEnded('issue-dialect');
}
- setTokenStopReason(reason: AixWire_Particles.GCTokenStopReason) {
+ setTokenStopReason(reason: AixWire_Particles.GCTokenStopReason, errorText?: string) {
    if (SERVER_DEBUG_WIRE)
-     console.log('|token-stop|', reason);
+     console.log('|token-stop|', reason, errorText ?? '');
    if (this.tokenStopReason && this.tokenStopReason !== reason)
      console.warn(`[Aix.${this.prettyDialect}] setTokenStopReason('${reason}'): already has token stop reason '${this.tokenStopReason}' (overriding)`);
    this.tokenStopReason = reason;
+   if (errorText) this.tokenStopError = errorText;
}
@@ -35,7 +35,7 @@ export function aixAnthropicHostedFeatures(model: AixAPI_Model, chatGenerate: Ai
const _hasAixToolRestrictivePolicy = chatGenerate.toolsPolicy?.type === 'any' || chatGenerate.toolsPolicy?.type === 'function_call';
// Dynamic web tools (20260209) require code execution for programmatic tool calling
- const hasDynamicWebTools = model.vndAntWebDynamic === true && (model.vndAntWebSearch === 'auto' || model.vndAntWebFetch === 'auto');
+ // const hasDynamicWebTools = model.vndAntWebDynamic === true && (model.vndAntWebSearch === 'auto' || model.vndAntWebFetch === 'auto');
// Programmatic Tool Calling - tools with allowed_callers or input_examples
const programmaticToolCalling = chatGenerate.tools?.some(tool =>
@@ -45,10 +45,17 @@ export function aixAnthropicHostedFeatures(model: AixAPI_Model, chatGenerate: Ai
),
) ?? false;
// [Anthropic, issue #1087] Dynamic web tools (20260209) have INTERNAL code execution. We do not
// explicitly add the code_execution tool nor the beta header for them: Anthropic enables what is
// needed implicitly behind the scenes.
return {
disableAllHostedTools: !!(_hasAixCustomTools && _hasAixToolRestrictivePolicy),
enable1MContext: model.vndAnt1MContext === true,
- enableCodeExecution: !!model.vndAntSkills || !!model.vndAntContainerId || hasDynamicWebTools || programmaticToolCalling,
+ enableCodeExecution:
+   !!model.vndAntSkills ||
+   // || hasDynamicWebTools // https://platform.claude.com/docs/en/agents-and-tools/tool-use/server-tools#dynamic-filtering-with-code-execution
+   // || !!model.vndAntContainerId // do not re-enable code execution just for continuity - would have parasitic effects: https://github.com/enricoros/big-AGI/issues/1087#issuecomment-4340352958
+   programmaticToolCalling,
enableFastMode: model.vndAntInfSpeed === 'fast',
enableSkills: !!model.vndAntSkills,
enableStrictOutputs: !!model.strictJsonOutput || !!model.strictToolInvocations,
@@ -284,7 +291,9 @@ export function aixToAnthropicMessageCreate(model: AixAPI_Model, _chatGenerate:
name: 'tool_search_tool_bm25',
});
- // Code Execution tool - required for dynamic filtering, Skills, etc.
+ // Code Execution tool - for Skills, container reuse, and Programmatic Tool Calling.
+ // Note: NOT added for dynamic web tools (_20260209) - they execute code internally and adding
+ // a standalone environment confuses the model (issue #1087).
if (enableCodeExecution)
hostedTools.push({ type: 'code_execution_20260120', name: 'code_execution' });
@@ -415,8 +424,10 @@ function* _generateAnthropicMessagesContentBlocks({ parts, role }: AixMessages_C
break;
case 'ma':
- if (!part.aText && !part.textSignature && !part.redactedData)
-   throw new Error('Extended Thinking data is missing');
+ if (!part.aText && !part.textSignature && !part.redactedData) {
+   console.warn('Anthropic: broken empty thinking block', { part });
+   break;
+ }
if (part.aText && part.textSignature)
yield { role: 'assistant', content: AnthropicWire_Blocks.ThinkingBlock(part.aText, part.textSignature) };
for (const redactedData of part.redactedData || [])
@@ -86,7 +86,8 @@ export function aixToGeminiInteractionsCreate(model: AixAPI_Model, chatGenerateR
agent_config: {
type: 'deep-research',
thinking_summaries: 'auto', // Enable thought_summary blocks - without this the API would not emit summaries during streaming
- // visualization defaults to 'auto' upstream; leave unset to keep the default (agent may generate charts/images).
+ // visualization: forwarded only when the client explicitly opts out; 'auto' (default) is left unset so the agent may generate charts/images.
...(model.vndGeminiAgentViz === 'off' && { visualization: 'off' }),
},
}),
// non-DR agents: use native system_instruction field (matches gemini.generateContent.ts convention)
@@ -37,6 +37,7 @@ export function aixToOpenAIChatCompletions(openAIDialect: OpenAIDialects, model:
const chatGenerate = aixSpillSystemToUser(_chatGenerate);
// Dialect incompatibilities -> Hotfixes
// [DeepSeek, 2026-04-24] V4 doesn't require strict alternation but we keep coalescing for cleanliness; the reducer only merges assistant/user, tool messages stay separate (parallel tool_calls).
const hotFixAlternateUserAssistantRoles = openAIDialect === 'deepseek' || openAIDialect === 'perplexity';
const hotFixRemoveEmptyMessages = openAIDialect === 'moonshot' || openAIDialect === 'perplexity'; // [Moonshot, 2026-02-10] consecutive assistant messages (empty + content) break Moonshot - coalesce to fix
const hotFixRemoveStreamOptions = openAIDialect === 'azure' || openAIDialect === 'mistral';
@@ -59,7 +60,7 @@ export function aixToOpenAIChatCompletions(openAIDialect: OpenAIDialects, model:
throw new Error('This service does not support function calls');
// Convert the chat messages to the OpenAI 4-Messages format
- let chatMessages = _toOpenAIMessages(chatGenerate.systemMessage, chatGenerate.chatSequence, hotFixOpenAIOFamily);
+ let chatMessages = _toOpenAIMessages(openAIDialect, chatGenerate.systemMessage, chatGenerate.chatSequence, hotFixOpenAIOFamily);
// Apply hotfixes
@@ -69,6 +70,13 @@ export function aixToOpenAIChatCompletions(openAIDialect: OpenAIDialects, model:
if (hotFixAlternateUserAssistantRoles)
chatMessages = _fixAlternateUserAssistantRoles(chatMessages);
// [DeepSeek, 2026-04-24] When tools are present and thinking isn't disabled, V4 demands reasoning_content on EVERY assistant message in history
// Inject '' placeholder where missing; real reasoning is attached by _toOpenAIMessages
if (openAIDialect === 'deepseek' && chatGenerate.tools?.length)
for (const m of chatMessages)
if (m.role === 'assistant' && m.reasoning_content === undefined)
m.reasoning_content = '';
// constrained output modes - both JSON and tool invocations
// const strictJsonOutput = !!model.strictJsonOutput;
@@ -145,18 +153,23 @@ export function aixToOpenAIChatCompletions(openAIDialect: OpenAIDialects, model:
&& openAIDialect !== 'deepseek' && openAIDialect !== 'moonshot' && openAIDialect !== 'zai' // MoonShot maps to none->disabled / high->enabled
&& openAIDialect !== 'perplexity' // Perplexity has its own block below with stricter validation
) {
if (reasoningEffort === 'max') // domain validation
throw new Error(`OpenAI ChatCompletions API does not support '${reasoningEffort}' reasoning effort`);
// for: 'alibaba' | 'azure' | 'groq' | 'lmstudio' | 'localai' | 'mistral' | 'openai' | 'openpipe' | 'togetherai' | 'xai'
payload.reasoning_effort = reasoningEffort;
}
// [Moonshot] Kimi K2.5 reasoning effort -> thinking mode (only 'none' and 'high' supported for now)
// [Z.ai] GLM thinking mode: binary enabled/disabled (supports GLM-4.5 series and higher) - https://docs.z.ai/guides/capabilities/thinking-mode
// [DeepSeek, 2026-04-23] V4 thinking control https://api-docs.deepseek.com/guides/thinking_mode
if (reasoningEffort && (openAIDialect === 'deepseek' || openAIDialect === 'moonshot' || openAIDialect === 'zai')) {
- if (reasoningEffort !== 'none' && reasoningEffort !== 'high') // domain validation
-   throw new Error(`${openAIDialect} only supports reasoning effort 'none' or 'high', got '${reasoningEffort}'`);
+ const allowedEffort = openAIDialect === 'deepseek' ? ['none', 'high', 'max'] : ['none', 'high'];
+ if (!allowedEffort.includes(reasoningEffort)) // domain validation
+   throw new Error(`${openAIDialect} only supports reasoning effort ${allowedEffort.join(', ')}, got '${reasoningEffort}'`);
- payload.thinking = { type: reasoningEffort === 'none' ? 'disabled' : 'enabled' };
+ payload.thinking = { type: reasoningEffort !== 'none' ? 'enabled' : 'disabled' };
// [DeepSeek, 2026-04-23] DeepSeek also supports effort control for reasoning-enabled requests - set it here as it was carved from the reasoningEffort setter before
if (openAIDialect === 'deepseek' && reasoningEffort !== 'none')
payload.reasoning_effort = reasoningEffort;
}
@@ -348,19 +361,23 @@ function _fixAlternateUserAssistantRoles(chatMessages: TRequestMessages): TReque
};
}
- // if the current item has the same role as the last item, concatenate their content
+ // If current item has the same role as the last, coalesce ONLY assistant/user.
+ // Tool/system/developer must stay separate - tool messages each pair with a tool_call_id; merging corrupts the protocol.
  if (acc.length > 0) {
    const lastItem = acc[acc.length - 1];
    if (lastItem.role === historyItem.role) {
      if (lastItem.role === 'assistant') {
        lastItem.content += hotFixSquashTextSeparator + historyItem.content;
-     } else if (lastItem.role === 'user') {
+       return acc;
+     }
+     if (lastItem.role === 'user') {
        lastItem.content = [
          ...(Array.isArray(lastItem.content) ? lastItem.content : [OpenAIWire_ContentParts.TextContentPart(lastItem.content)]),
          ...(Array.isArray(historyItem.content) ? historyItem.content : historyItem.content ? [OpenAIWire_ContentParts.TextContentPart(historyItem.content)] : []),
        ];
+       return acc;
      }
-     return acc;
+     // fall through to push for tool/system/developer - each stays its own message
    }
}
}
@@ -442,7 +459,10 @@ function _fixVndOaiRestoreMarkdown_Inline(payload: TRequest) {
}*/
- function _toOpenAIMessages(systemMessage: AixMessages_SystemMessage | null, chatSequence: AixMessages_ChatMessage[], hotFixOpenAIo1Family: boolean): TRequestMessages {
+ function _toOpenAIMessages(openAIDialect: OpenAIDialects, systemMessage: AixMessages_SystemMessage | null, chatSequence: AixMessages_ChatMessage[], hotFixOpenAIo1Family: boolean): TRequestMessages {
// [DeepSeek, 2026-04-24] V4 thinking-by-default - reasoning_content must round-trip on tool-call turns; payload is the 'ma' part's aText (unlike Gemini/OpenAI-Responses which carry opaque handles).
const echoDeepseekReasoning = openAIDialect === 'deepseek';
// Transform the chat messages into OpenAI's format (an array of 'system', 'user', 'assistant', and 'tool' messages)
const chatMessages: TRequestMessages = [];
@@ -555,6 +575,8 @@ function _toOpenAIMessages(systemMessage: AixMessages_SystemMessage | null, chat
break;
case 'model':
// Accumulate 'ma' reasoning text across this turn; echoed below onto the assistant message if it carries tool_calls (DeepSeek only).
let pendingReasoningText = '';
for (const part of parts) {
const currentMessage = chatMessages[chatMessages.length - 1];
switch (part.pt) {
@@ -630,7 +652,9 @@ function _toOpenAIMessages(systemMessage: AixMessages_SystemMessage | null, chat
break;
case 'ma':
- // ignore this thinking block - Anthropic only
+ // [DeepSeek only] accumulate reasoning text for the echo-back below. Other dialects ignore 'ma' (reasoning continuity flows via _vnd opaque handles, not via this adapter).
+ if (echoDeepseekReasoning && part.aType === 'reasoning' && part.aText)
+   pendingReasoningText += part.aText;
break;
case 'tool_response':
@@ -651,6 +675,18 @@ function _toOpenAIMessages(systemMessage: AixMessages_SystemMessage | null, chat
}
}
// [DeepSeek] attach accumulated reasoning to this turn's assistant message only if it carries tool_calls; plain-text turns don't need the echo per docs.
if (echoDeepseekReasoning && pendingReasoningText) {
for (let i = chatMessages.length - 1; i >= 0; i--) {
const m = chatMessages[i];
if (m.role !== 'assistant') continue;
if (m.tool_calls?.length)
m.reasoning_content = pendingReasoningText;
break; // stop at the most recent assistant message from this turn
}
}
break;
}
}
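For concreteness, a sketch of the assistant turn the echo above produces on a tool-call round-trip (values are made up):

// Illustrative request fragment - reasoning_content rides along with the tool_calls message.
const assistantTurn = {
  role: 'assistant' as const,
  content: '',
  reasoning_content: 'User wants the weather; call the tool first.', // echoed 'ma' aText
  tool_calls: [{ id: 'call_1', type: 'function' as const, function: { name: 'get_weather', arguments: '{"city":"Turin"}' } }],
};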
@@ -495,12 +495,15 @@ function _toOpenAIResponsesRequestInput(systemMessage: AixMessages_SystemMessage
case 'ma':
// Preserve reasoning continuity across turns via _vnd.openai.reasoningItem (set by openai.responses.parser).
// Stateless (store=false, our default): encryptedContent is the protocol-critical blob for the provider to reconstruct internal reasoning state.
// Round-trip ONLY when both encrypted_content AND id are present (canonical, complete handle).
// - bare id without EC -> 404 "Item with id rs_... not found" in stateless mode
// - bare EC without id -> torn handle, undefined behavior across providers/versions
// Defense-in-depth: matches the parser's capture gate; rejects torn handles even if any sneak through.
// ma fragments without an openai handle are common (e.g., DeepSeek reasoning_content emits ma fragments
// with no continuity blob) - skip without warning to avoid log noise on cross-vendor history.
const oaiReasoning = modelPart._vnd?.openai?.reasoningItem;
- if (oaiReasoning?.encryptedContent || oaiReasoning?.id)
+ if (oaiReasoning?.encryptedContent && oaiReasoning?.id)
newReasoningMessage(oaiReasoning.id, oaiReasoning.encryptedContent);
else
console.warn('[DEV] OpenAI Responses: skipping reasoning item due to missing encrypted content and id', { modelPart });
break;
case 'tool_response':
@@ -8,7 +8,7 @@ import { aixSpillShallFlush, aixSpillSystemToUser, approxDocPart_To_String } fro
// configuration
- const AIX_XAI_ADD_ENCRYPTED_REASONING = false;
+ const AIX_XAI_ADD_ENCRYPTED_REASONING = true;
// const AIX_XAI_ADD_INLINE_CITATIONS = true; // yes but we don't know how yet
@@ -99,13 +99,13 @@ export function aixToXAIResponses(
if (reasoningEffort === 'none' || reasoningEffort === 'minimal' || reasoningEffort === 'xhigh' || reasoningEffort === 'max') // domain validation
throw new Error(`XAI Responses API does not support reasoning effort '${reasoningEffort}'`);
- if (reasoningEffort) {
-   payload.reasoning = {
-     effort: reasoningEffort,
-     // generate_summary: unsupported
-     // summary: unsupported, defaults to 'detailed'
-   };
- }
+ // Always request detailed reasoning summaries - grok-4.3 and others have always-on reasoning
+ // but only return summary text when explicitly requested. Also set effort when configured
+ // (only grok-4.20-multi-agent supports effort).
+ payload.reasoning = {
+   ...(reasoningEffort ? { effort: reasoningEffort } : {}),
+   summary: 'detailed',
+ };
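Sketch of the resulting `reasoning` request section (values illustrative):

// With effort configured (e.g. grok-4.20-multi-agent):
const exampleReasoning = { effort: 'low', summary: 'detailed' as const };
// Without effort (always-on reasoning models like grok-4.3):
const exampleReasoningDefault = { summary: 'detailed' as const };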
// Add include options for reasoning and specialized for tool sources
if (AIX_XAI_ADD_ENCRYPTED_REASONING)
@@ -329,12 +329,15 @@ function _toXAIResponsesInput(
break;
case 'ma':
- // xAI reuses the OpenAI Responses continuity namespace (_vnd.openai.reasoningItem).
- // Only active when AIX_XAI_ADD_ENCRYPTED_REASONING is enabled and encrypted_content is captured;
- // otherwise the handle is absent and we skip to avoid "Item with id rs_... not found" style errors.
- const oaiReasoning = part._vnd?.openai?.reasoningItem;
- if (oaiReasoning?.encryptedContent || oaiReasoning?.id)
-   newReasoningItem(oaiReasoning.id, oaiReasoning.encryptedContent);
+ // xAI uses its OWN _vnd namespace - the wire schema mirrors OpenAI's, but encrypted_content is
+ // encrypted with xAI-private keys and the rs_... id references xAI-private server state. Crossing
+ // these (e.g., replaying an OpenAI handle to xAI or vice versa) yields "Item with id rs_... not
+ // found" or silent reasoning corruption.
+ // Round-trip ONLY when both encrypted_content AND id are present (canonical, complete handle).
+ // Defense-in-depth: matches the parser's capture gate; rejects torn handles even if any sneak through.
+ const xaiReasoning = part._vnd?.xai?.reasoningItem;
+ if (xaiReasoning?.encryptedContent && xaiReasoning?.id)
+   newReasoningItem(xaiReasoning.id, xaiReasoning.encryptedContent);
break;
case 'tool_response':
@@ -55,7 +55,6 @@ export class DispatchContinuationSignal extends Error {
*/
export async function* executeChatGenerateWithContinuation(
dispatchCreatorFn: () => Promise<ChatGenerateDispatch>,
- streaming: boolean,
abortSignal: AbortSignal,
_d: AixDebugObject,
): AsyncGenerator<AixWire_Particles.ChatGenerateOp, void> {
@@ -65,7 +64,7 @@ export async function* executeChatGenerateWithContinuation(
for (let turn = 0; turn <= MAX_CONTINUATION_TURNS; turn++) {
try {
- yield* executeChatGenerateWithOperationRetry(currentCreator, streaming, abortSignal, _d);
+ yield* executeChatGenerateWithOperationRetry(currentCreator, abortSignal, _d);
return; // normal completion
} catch (error) {
@@ -25,7 +25,7 @@ import { createAnthropicFileInlineTransform } from './parsers/anthropic.transfor
import { createAnthropicMessageParser, createAnthropicMessageParserNS } from './parsers/anthropic.parser';
import { createBedrockConverseParserNS, createBedrockConverseStreamParser } from './parsers/bedrock-converse.parser';
import { createGeminiGenerateContentResponseParser } from './parsers/gemini.parser';
- import { createGeminiInteractionsParser } from './parsers/gemini.interactions.parser';
+ import { createGeminiInteractionsParserNS, createGeminiInteractionsParserSSE } from './parsers/gemini.interactions.parser';
import { createOpenAIChatCompletionsChunkParser, createOpenAIChatCompletionsParserNS } from './parsers/openai.parser';
import { createOpenAIResponseParserNS, createOpenAIResponsesEventParser } from './parsers/openai.responses.parser';
@@ -37,7 +37,8 @@ export type ChatGenerateDispatch = {
/** Used by dialects that need multi-step I/O. The returned response is consumed normally via demuxerFormat/chatGenerateParse */
customConnect?: (signal: AbortSignal) => Promise<Response>;
bodyTransform?: AixDemuxers.StreamBodyTransform;
- demuxerFormat: AixDemuxers.StreamDemuxerFormat;
+ /** Source of truth for the consumer mode: null = NS */
+ demuxerFormat: null | AixDemuxers.StreamDemuxerFormat;
chatGenerateParse: ChatGenerateParseFunction;
particleTransform?: ChatGenerateParticleTransformFunction;
};
@@ -173,6 +174,7 @@ export async function createChatGenerateDispatch(access: AixAPI_Access, model: A
// [Gemini Interactions API - ALPHA TEST] SSE-native: POST with stream=true, upstream returns event-stream we pipe through the fast-sse demuxer.
if (model.vndGeminiAPI === 'interactions-agent') {
if (!streaming) console.warn(`[DEV] Gemini Interactions API - only supported in SSE mode, ignoring streaming=false for model ${model.id}`);
const request: ChatGenerateDispatchRequest = {
...geminiAccess(access, null, GeminiInteractionsWire_API_Interactions.postPath, false),
method: 'POST',
@@ -186,8 +188,9 @@ export async function createChatGenerateDispatch(access: AixAPI_Access, model: A
if (signal.aborted) throw error; // preserve abort identity for the executor's abort classifier
throw new Error(`Gemini Interactions POST: ${error?.message || 'upstream error'}`); // rewrapping TRPCFetcherError as plain Error makes the retrier treat it as non-retryable
}),
/** Upstream hardcodes stream=true + background=true (required by deep-research agents) and has no non-streaming alternative. */
demuxerFormat: 'fast-sse',
chatGenerateParse: createGeminiInteractionsParser(requestedModelName),
chatGenerateParse: createGeminiInteractionsParserSSE(requestedModelName),
};
}
@@ -244,9 +247,9 @@ export async function createChatGenerateDispatch(access: AixAPI_Access, model: A
case 'zai':
// newer: OpenAI Responses API, for models that support it and all XAI models
const isResponsesAPI = !!model.vndOaiResponsesAPI;
const isXAIModel = dialect === 'xai'; // All XAI models are accessed via Responses now
if (isResponsesAPI || isXAIModel) {
const isResponsesAPI = !!model.vndOaiResponsesAPI || isXAIModel;
if (isResponsesAPI) {
return {
request: {
...openAIAccess(access, model.id, OPENAI_API_PATHS.responses),
@@ -261,11 +264,17 @@ export async function createChatGenerateDispatch(access: AixAPI_Access, model: A
*
* Note: Response format is compatible with OpenAI parser.
*/
body: isXAIModel ? aixToXAIResponses(model, chatGenerate, streaming, enableResumability)
body: isXAIModel
? aixToXAIResponses(model, chatGenerate, streaming, enableResumability)
: aixToOpenAIResponses(dialect, model, chatGenerate, streaming, enableResumability),
},
demuxerFormat: streaming ? 'fast-sse' : null,
chatGenerateParse: streaming ? createOpenAIResponsesEventParser() : createOpenAIResponseParserNS(),
// IMPORTANT: tag the parser with the actual vendor so reasoning continuity blobs
// (encrypted_content + rs_... id) land in the matching _vnd namespace and never leak
// across providers (different keys + different server-side state).
chatGenerateParse: streaming
? createOpenAIResponsesEventParser(isXAIModel ? 'xai' : 'openai')
: createOpenAIResponseParserNS(isXAIModel ? 'xai' : 'openai'),
};
}
@@ -316,18 +325,20 @@ export async function createChatGenerateResumeDispatch(access: AixAPI_Access, re
return {
request: { url: `${url}?${queryParams.toString()}`, method: 'GET', headers },
demuxerFormat: streaming ? 'fast-sse' : null,
chatGenerateParse: streaming ? createOpenAIResponsesEventParser() : createOpenAIResponseParserNS(),
chatGenerateParse: streaming ? createOpenAIResponsesEventParser('openai') : createOpenAIResponseParserNS('openai'),
};
case 'gemini': {
// [Gemini Interactions] Reattach via SSE stream - GET /interactions/{id}?stream=true replays all events from the start (intentional - client's ContentReassembler replaces message content on reattach; partial resume via last_event_id is deliberately NOT used).
// [Gemini Interactions] Reattach: SSE replay (?stream=true) or JSON snapshot (no query). See kb/modules/LLM-gemini-interactions.md.
if (resumeHandle.uht !== 'vnd.gem.interactions')
throw new Error(`Resume handle mismatch for gemini: expected 'vnd.gem.interactions', got '${resumeHandle.uht}'`);
const { url: _baseUrl, headers: _headers } = geminiAccess(access, null, GeminiInteractionsWire_API_Interactions.getPath(resumeHandle.runId /* Gemini interaction.id */), false);
return {
request: { url: `${_baseUrl}${_baseUrl.includes('?') ? '&' : '?'}stream=true`, method: 'GET', headers: _headers },
demuxerFormat: 'fast-sse',
chatGenerateParse: createGeminiInteractionsParser(null /* model name unknown at resume time - caller's DMessage already has it */),
request: { url: streaming ? `${_baseUrl}${_baseUrl.includes('?') ? '&' : '?'}stream=true` : _baseUrl, method: 'GET', headers: _headers },
demuxerFormat: streaming ? 'fast-sse' : null,
chatGenerateParse: streaming
? createGeminiInteractionsParserSSE(null /* model name unknown at resume time - caller's DMessage already has it */)
: createGeminiInteractionsParserNS(null),
};
}
@@ -393,6 +404,21 @@ export async function executeChatGenerateDelete(access: AixAPI_Access, handle: A
case 'gemini':
if (handle.uht !== 'vnd.gem.interactions')
throw new Error(`Delete handle mismatch for gemini: expected 'vnd.gem.interactions', got '${handle.uht}'`);
// Gemini: cancel the background run first (stops token generation), then DELETE the stored record.
// The DELETE endpoint only removes the resource; it does NOT cancel an in-flight run.
// Cancel may 404 "Method not found" on the Developer API (API-key mode, googleapis/python-genai#1971) -
// we log the outcome and proceed to DELETE so local cleanup still happens.
const { url: cancelUrl, headers: cancelHeaders } = geminiAccess(access, null, GeminiInteractionsWire_API_Interactions.cancelPath(handle.runId), false);
try {
const cancelResp = await fetchResponseOrTRPCThrow({ url: cancelUrl, method: 'POST', body: {}, headers: cancelHeaders, signal: abortSignal, name: 'Aix.Gemini.Interactions.cancel', throwWithoutName: true });
console.log(`[AIX] Gemini.Interactions.cancel: ok=${cancelResp.ok} status=${cancelResp.status}`);
} catch (error: any) {
if (abortSignal.aborted) throw error;
const status = error instanceof TRPCFetcherError ? error.httpStatus : undefined;
console.log(`[AIX] Gemini.Interactions.cancel: failed status=${status ?? '?'} msg=${error?.message ?? 'unknown'}`);
}
({ url, headers } = geminiAccess(access, null, GeminiInteractionsWire_API_Interactions.deletePath(handle.runId), false));
name = 'Aix.Gemini.Interactions.delete';
break;
@@ -26,7 +26,6 @@ import { heartbeatsWhileAwaiting } from '../heartbeatsWhileAwaiting';
*/
export async function* executeChatGenerateDispatch(
dispatchCreatorFn: () => Promise<ChatGenerateDispatch>,
streaming: boolean,
intakeAbortSignal: AbortSignal,
_d: AixDebugObject,
parseContext?: { retriesAvailable: boolean },
@@ -59,7 +58,7 @@ export async function* executeChatGenerateDispatch(
const innerStream = (async function* () {
// Consume dispatch response
if (!streaming)
if (dispatch.demuxerFormat === null /* NS */)
yield* _consumeDispatchUnified(dispatchResponse, dispatch.chatGenerateParse, chatGenerateTx, _d, parseContext);
else
yield* _consumeDispatchStream(dispatchResponse, dispatch.bodyTransform ?? null, dispatch.demuxerFormat, dispatch.chatGenerateParse, chatGenerateTx, _d, parseContext);
@@ -44,7 +44,6 @@ export class OperationRetrySignal extends Error {
*/
export async function* executeChatGenerateWithOperationRetry(
dispatchCreatorFn: () => Promise<ChatGenerateDispatch>,
streaming: boolean,
abortSignal: AbortSignal,
_d: AixDebugObject,
): AsyncGenerator<AixWire_Particles.ChatGenerateOp, void> {
@@ -55,7 +54,7 @@ export async function* executeChatGenerateWithOperationRetry(
while (true) {
try {
yield* executeChatGenerateDispatch(dispatchCreatorFn, streaming, abortSignal, _d, {
yield* executeChatGenerateDispatch(dispatchCreatorFn, abortSignal, _d, {
retriesAvailable: attemptNumber < maxAttempts,
});
@@ -15,8 +15,8 @@ export interface IParticleTransmitter {
/** End the current part and flush it, which also calls `setDialectEnded('issue-dialect')` */
setDialectTerminatingIssue(dialectText: string, symbol: string | null, serverLog: ParticleServerLogLevel): void;
/** Communicates the finish reason to the client - Data only, this does not do Control, like the above */
setTokenStopReason(reason: AixWire_Particles.GCTokenStopReason): void;
/** Communicates the finish reason to the client - Data only. Optional `errorText` is a vendor-composed string rendered as a complementary error fragment alongside the generic classification message. */
setTokenStopReason(reason: AixWire_Particles.GCTokenStopReason, errorText?: string): void;
// Parts data //
@@ -404,7 +404,7 @@ export function createAnthropicMessageParser(): ChatGenerateParseFunction {
// -> Token Stop Reason
const tokenStopReason = _fromAnthropicStopReason(delta.stop_reason, 'message_delta');
if (tokenStopReason !== null)
pt.setTokenStopReason(tokenStopReason);
pt.setTokenStopReason(tokenStopReason, _formatAnthropicStopError(delta.stop_details));
// NOTE: we have more fields we're not parsing yet - https://platform.claude.com/docs/en/api/typescript/messages#message_delta_usage
if (usage?.output_tokens && messageStartTime) {
@@ -511,6 +511,7 @@ export function createAnthropicMessageParserNS(): ChatGenerateParseFunction {
content,
container,
stop_reason,
stop_details,
usage,
} = AnthropicWire_API_Message_Create.Response_schema.parse(JSON.parse(fullData));
@@ -653,7 +654,7 @@ export function createAnthropicMessageParserNS(): ChatGenerateParseFunction {
// -> Token Stop Reason (pause_turn already thrown above)
const tokenStopReason = _fromAnthropicStopReason(stop_reason, 'parser_NS');
if (tokenStopReason !== null)
pt.setTokenStopReason(tokenStopReason);
pt.setTokenStopReason(tokenStopReason, _formatAnthropicStopError(stop_details));
};
}
@@ -681,6 +682,19 @@ function _emitContainerState(pt: IParticleTransmitter, container: { id: string;
});
}
/** Compose a human-readable error string from Anthropic's stop_details. Returns undefined when nothing useful to surface. */
function _formatAnthropicStopError(stopDetails: { type: string; category?: string | null; explanation?: string | null } | null | undefined): string | undefined {
if (!stopDetails) return undefined;
if (stopDetails.type !== 'refusal') {
aixResilientUnknownValue('Anthropic', 'stopDetailsType', stopDetails.type);
return undefined;
}
const parts: string[] = [];
if (stopDetails.category) parts.push(`[${stopDetails.category}]`);
if (stopDetails.explanation) parts.push(stopDetails.explanation);
return parts.length ? `Refusal: ${parts.join(' ')}` : undefined;
}
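// Illustrative sketch (hypothetical inputs, based on the refusal shape above):
//   _formatAnthropicStopError(null)                                                   -> undefined
//   _formatAnthropicStopError({ type: 'some_future_type' })                           -> undefined (+ aixResilientUnknownValue report)
//   _formatAnthropicStopError({ type: 'refusal', category: 'cyber', explanation: 'Exploit code request.' })
//                                                                                     -> 'Refusal: [cyber] Exploit code request.'
//   _formatAnthropicStopError({ type: 'refusal', category: null, explanation: null })  -> undefined (nothing useful to surface)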
// --- Shared server tool result handlers (used by both S and NS parsers) ---
@@ -5,6 +5,7 @@ import type { ChatGenerateParseFunction } from '../chatGenerate.dispatch';
import type { IParticleTransmitter } from './IParticleTransmitter';
import { GeminiInteractionsWire_API_Interactions } from '../../wiretypes/gemini.interactions.wiretypes';
import { IssueSymbols } from '../ChatGenerateTransmitter';
import { geminiConvertPCM2WAV } from './gemini.audioutils';
@@ -44,7 +45,7 @@ type BlockState = {
* the cursor (or from start if omitted). Our parser is position-idempotent within a single run
* because the transmitter's state carries across events.
*/
export function createGeminiInteractionsParser(requestedModelName: string | null): ChatGenerateParseFunction {
export function createGeminiInteractionsParserSSE(requestedModelName: string | null): ChatGenerateParseFunction {
const parserCreationTimestamp = Date.now();
let timeToFirstContent: number | undefined;
@@ -150,6 +151,9 @@ export function createGeminiInteractionsParser(requestedModelName: string | null
if (!deltaParse.success) {
// Empty deltas ({}) appear alongside placeholder blocks (e.g. internal tool slots) - silent skip
if (event.delta && Object.keys(event.delta).length === 0) break;
// Known-but-not-surfaced delta types (mirrors NS parser's INTERNAL_OUTPUT_TYPES policy + spec's document/video variants we don't model) - silent skip
const deltaType = (event.delta as { type?: string })?.type;
if (deltaType && (GeminiInteractionsWire_API_Interactions.INTERNAL_OUTPUT_TYPES.has(deltaType) || deltaType === 'document' || deltaType === 'video')) break;
console.warn('[GeminiInteractions] unknown content.delta shape at index', event.index, event.delta);
break;
}
@@ -218,11 +222,16 @@ export function createGeminiInteractionsParser(requestedModelName: string | null
}
case 'error':
// Observed mid-stream with an empty payload between content blocks - non-fatal, the stream
// continues with further events and eventually an interaction.complete. Silent-skip empty
// payloads (Beta noise); warn only when actual error info is present.
if (event.error?.message || event.error?.code)
console.warn('[GeminiInteractions] SSE error event:', event.error);
// Two observed shapes:
// 1) Empty payload mid-stream (Beta noise): the stream continues with further events and
// eventually an interaction.complete - silent-skip.
// 2) Populated payload with message/code: terminal upstream error (also how Gemini reports
// cancelled interactions: HTTP 500 to the cancel call + an error SSE on the stream).
// Surface as a dialect-terminating issue so the UI renders it and the stream ends cleanly.
if (event.error?.message || event.error?.code) {
const errorText = `${event.error.code ? `${event.error.code}: ` : ''}${event.error.message || 'Upstream error.'}`;
pt.setDialectTerminatingIssue(errorText, IssueSymbols.Generic, 'srv-warn');
}
break;
default: {
@@ -235,6 +244,192 @@ export function createGeminiInteractionsParser(requestedModelName: string | null
}
/**
* Non-streaming parser: reads the GET /v1beta/interactions/{id} JSON body once and emits the same
* particles the SSE parser would, in a single batch.
*
* Used by the "Recover" path when SSE delivery is broken upstream (10-min cuts; see KB doc) but the
* resource is still fetchable. We always re-emit the upstream handle so failed/in_progress runs
* remain retryable; only `status: completed` clears it (via the reassembler's outcome=='completed' policy).
*
* See `kb/modules/LLM-gemini-interactions.md` for failure modes and recovery model.
*/
export function createGeminiInteractionsParserNS(requestedModelName: string | null): ChatGenerateParseFunction {
const parserCreationTimestamp = Date.now();
return function parse(pt: IParticleTransmitter, rawEventData: string, _eventName?: string): void {
// model name (preserved from caller's DMessage on resume; first-call only on fresh fetches)
if (requestedModelName != null)
pt.setModelName(requestedModelName);
// parse + validate against the Interaction resource schema (looseObject - tolerant to upstream additions)
let rawJson: unknown;
try {
rawJson = JSON.parse(rawEventData);
} catch (e: any) {
throw new Error(`malformed Interaction JSON: ${e?.message || String(e)}`);
}
const parsed = GeminiInteractionsWire_API_Interactions.Interaction_schema.safeParse(rawJson);
if (!parsed.success) {
console.warn('[GeminiInteractions-NS] unexpected Interaction shape:', rawJson);
throw new Error('Gemini Interactions: unexpected resource shape (no `id`/`status` fields)');
}
const interaction = parsed.data;
// upstream handle - preserve so user can retry / delete
pt.setUpstreamHandle(interaction.id, 'vnd.gem.interactions');
// Walk outputs in order. Each output is loose; we safeParse against KnownOutput_schema and
// silently skip INTERNAL_OUTPUT_TYPES (tool calls/results). Order matters - thoughts and
// text interleave in the report and the user reads them top-to-bottom.
const outputs = interaction.outputs ?? [];
let lastEmittedKind: 'thought' | 'text' | 'image' | 'audio' | null = null;
for (const rawOut of outputs) {
const outType = (rawOut as { type?: string })?.type;
// silent-skip internal tool-call outputs (matches SSE parser policy for INTERNAL_OUTPUT_TYPES)
if (outType && GeminiInteractionsWire_API_Interactions.INTERNAL_OUTPUT_TYPES.has(outType))
continue;
const knownOut = GeminiInteractionsWire_API_Interactions.KnownOutput_schema.safeParse(rawOut);
if (!knownOut.success) {
if (outType) console.warn('[GeminiInteractions-NS] unknown output type, skipping:', outType);
continue;
}
// emit a part boundary when switching kinds, mirrors SSE behavior on content.start across indices
if (lastEmittedKind !== null && lastEmittedKind !== knownOut.data.type)
pt.endMessagePart();
switch (knownOut.data.type) {
case 'thought': {
const summary = knownOut.data.summary;
if (typeof summary === 'string') {
if (summary) pt.appendReasoningText(summary);
} else if (Array.isArray(summary)) {
for (const item of summary)
if (item.text) pt.appendReasoningText(item.text);
}
if (knownOut.data.signature)
pt.setReasoningSignature(knownOut.data.signature);
lastEmittedKind = 'thought';
break;
}
case 'text': {
if (knownOut.data.text)
pt.appendText(knownOut.data.text);
// Citations: matches SSE policy - the DISABLE_CITATIONS kill-switch drops them for Deep Research
if (!DISABLE_CITATIONS && knownOut.data.annotations) {
for (const annRaw of knownOut.data.annotations) {
const ann = GeminiInteractionsWire_API_Interactions.UrlCitationAnnotation_schema.safeParse(annRaw);
if (!ann.success) continue;
const a = ann.data;
pt.appendUrlCitation(a.title || a.url, a.url, undefined, a.start_index, a.end_index, undefined, undefined);
}
}
lastEmittedKind = 'text';
break;
}
case 'image': {
if (knownOut.data.data && knownOut.data.mime_type)
pt.appendImageInline(knownOut.data.mime_type, knownOut.data.data, 'Gemini Generated Image', 'Gemini', '', true);
else if (knownOut.data.uri)
pt.appendText(`\n[Image: ${knownOut.data.uri}]\n`);
lastEmittedKind = 'image';
break;
}
case 'audio': {
if (knownOut.data.data && knownOut.data.mime_type) {
const mime = knownOut.data.mime_type.toLowerCase();
const isPCM = mime.startsWith('audio/l16') || mime.includes('codec=pcm');
if (isPCM) {
try {
const wav = geminiConvertPCM2WAV(knownOut.data.mime_type, knownOut.data.data);
pt.appendAudioInline(wav.mimeType, wav.base64Data, 'Gemini Generated Audio', 'Gemini', wav.durationMs);
} catch (error) {
console.warn('[GeminiInteractions-NS] audio PCM convert failed:', error);
}
} else {
pt.appendAudioInline(knownOut.data.mime_type, knownOut.data.data, 'Gemini Generated Audio', 'Gemini', 0);
}
}
lastEmittedKind = 'audio';
break;
}
default: {
const _exhaustive: never = knownOut.data;
break;
}
}
}
// close out any open part before the terminal status emission
if (lastEmittedKind !== null) pt.endMessagePart();
// Terminal status -> stop reason + dialect end (mirrors _handleInteractionComplete)
switch (interaction.status) {
case 'completed':
_emitUsageMetrics(pt, interaction.usage, parserCreationTimestamp, undefined);
pt.setTokenStopReason('ok');
pt.setDialectEnded('done-dialect');
break;
case 'failed':
_emitUsageMetrics(pt, interaction.usage, parserCreationTimestamp, undefined);
pt.setDialectTerminatingIssue('Deep Research interaction failed', null, 'srv-warn');
break;
case 'cancelled':
_emitUsageMetrics(pt, interaction.usage, parserCreationTimestamp, undefined);
pt.setTokenStopReason('cg-issue');
pt.setDialectEnded('done-dialect');
break;
case 'incomplete':
pt.appendText('\n_Response incomplete (run stopped early)._\n');
_emitUsageMetrics(pt, interaction.usage, parserCreationTimestamp, undefined);
pt.setTokenStopReason('out-of-tokens');
pt.setDialectEnded('done-dialect');
break;
case 'requires_action':
pt.setDialectTerminatingIssue('Deep Research returned requires_action (not supported in this client)', null, 'srv-warn');
break;
case 'in_progress': {
// Two scenarios both surface as `in_progress`:
// 1) Run is genuinely live server-side (just slow) - polling later will yield content.
// 2) "Zombie": the generator crashed but the status never transitioned. Stays `in_progress`
// for days with no outputs. Not recoverable - the only remedy is delete + retry.
// We can't disambiguate from one frame, so we surface {created, updated, outputs.length}
// and let the user decide. `tokenStopReason='cg-issue'` keeps the upstream handle alive
// (vs 'ok' which would clear it via the reassembler's clean-completion policy).
// see kb/modules/LLM-gemini-interactions.md#failure-modes (C)
const elapsedMin = _minutesSince(interaction.created);
const updatedMin = _minutesSince(interaction.updated);
const outCount = (interaction.outputs ?? []).length;
const lines: string[] = ['\n_Deep Research run is **`in_progress`** server-side._\n'];
if (elapsedMin != null) lines.push(`- Started: **${_humanDuration(elapsedMin)} ago**`);
if (updatedMin != null && updatedMin !== elapsedMin) lines.push(`- Last server update: **${_humanDuration(updatedMin)} ago**`);
lines.push(`- Outputs so far: **${outCount === 0 ? 'none' : outCount}**`);
// Heuristic threshold: stale-and-empty for >60 min is almost certainly a zombie.
const looksStuck = outCount === 0 && elapsedMin != null && elapsedMin > 60;
if (looksStuck)
lines.push('\nThis run looks **stuck** (no content for over an hour). Click **Cancel** to delete it and try again.');
else
lines.push('\nTry **Recover** again in a few minutes; if it stays empty, click **Cancel** to delete and retry.');
pt.appendText(lines.join('\n') + '\n');
pt.setTokenStopReason('cg-issue');
pt.setDialectEnded('done-dialect');
break;
}
default: {
const _exhaustiveCheck: never = interaction.status;
console.warn('[GeminiInteractions-NS] unreachable status', interaction.status);
break;
}
}
};
}
// --- helpers ---
function _classifyContentKind(rawType: unknown): BlockState['kind'] {
@@ -364,3 +559,22 @@ function _emitUsageMetrics(
pt.updateMetrics(m);
}
/** Minutes elapsed between an upstream ISO 8601 timestamp and now. Returns null on parse failure. */
function _minutesSince(iso: string | undefined | null): number | null {
if (!iso) return null;
const ms = Date.parse(iso);
if (!Number.isFinite(ms)) return null;
return Math.max(0, (Date.now() - ms) / 60_000);
}
/** Human-readable elapsed-time string for in_progress diagnostic messages. */
function _humanDuration(minutes: number): string {
if (minutes < 1) return 'less than a minute';
if (minutes < 60) return `${Math.round(minutes)} min`;
const hours = minutes / 60;
if (hours < 24) return `${Math.round(hours * 10) / 10} hours`;
const days = hours / 24;
return `${Math.round(days * 10) / 10} days`;
}
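// Illustrative sketch (hypothetical values) of the helpers' rounding behavior:
//   _minutesSince(undefined) -> null; _minutesSince('not-a-date') -> null
//   _humanDuration(0.4)  -> 'less than a minute'
//   _humanDuration(42.6) -> '43 min'
//   _humanDuration(90)   -> '1.5 hours'
//   _humanDuration(3600) -> '2.5 days'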
@@ -494,6 +494,10 @@ export function createOpenAIChatCompletionsParserNS(): ChatGenerateParseFunction
} else if (message.content !== undefined && message.content !== null)
throw new Error(`unexpected message content type: ${typeof message.content}`);
// [DeepSeek, 2026-04-24] Non-streaming reasoning_content -> 'ma' reasoning part (mirror of streaming path above)
if (typeof message.reasoning_content === 'string' && message.reasoning_content)
pt.appendReasoningText(message.reasoning_content);
// [OpenRouter, 2025-01-20] Handle structured reasoning_details
if (Array.isArray(message.reasoning_details)) {
for (const reasoningDetail of message.reasoning_details) {
@@ -18,6 +18,21 @@ const OPENAI_RESPONSES_SAME_PART_SPACER = '\n\n';
const INLINE_IMAGE_SKIP_RESIZE_MAX_B64_BYTES = 250_000; // skip resize for small images (e.g. code interpreter charts)
/**
* Wishlist marker: hosted tool calls (web_search_call, image_generation_call, code_interpreter_call, ...)
* are rendered via ephemeral OperationState/inline-asset paths and are NOT round-tripped as structured
* fragments. This breaks stateless multi-turn with reasoning models. See PRD.FUTURE-atol.md "Wishlist:
* Hosted tool invocations as first-class fragments".
*/
// const _hostedToolWishlistSeen = new Set<string>();
function _hostedToolWishlistHint(family: 'web_search' | 'image_generation' | 'code_interpreter' | 'custom_tool'): void {
// if (_hostedToolWishlistSeen.has(family)) return;
// _hostedToolWishlistSeen.add(family);
// NOTE: log disabled because it was firing constantly everywhere; just implement this
// console.log(`[DEV] AIX: ATOL wishlist - hosted '${family}' call observed; not round-tripped as a structured fragment yet (see kb/product/PRD.FUTURE-atol.md)`);
}
/**
* Safely sanitizes a URL for display in placeholders by removing query parameters and paths
* to prevent leaking sensitive information while keeping the domain recognizable.
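// A minimal sketch of what such a sanitizer can look like (the actual implementation and its exact
// name live outside this hunk; the fallback string here is an assumption):
// function _sanitizeUrlForDisplay(rawUrl: string): string {
//   try {
//     return new URL(rawUrl).hostname; // keep only the domain - drops credentials, path, query, hash
//   } catch {
//     return 'invalid URL'; // never echo unparsable input back to the UI
//   }
// }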
@@ -46,6 +61,11 @@ type TEventType = OpenAIWire_API_Responses.StreamingEvent['type'];
// cached config for the image_generation hosted tool, captured at response.created
type TImageGenToolCfg = Extract<OpenAIWire_Responses_Tools.Tool, { type: 'image_generation' }>;
/** Extract the image_generation tool config from the echoed tools array (API does not echo `model` per-item). Shared by streaming and non-streaming paths. */
function _findImageGenToolCfg(tools: TResponse['tools']): TImageGenToolCfg | undefined {
return tools?.find((t): t is TImageGenToolCfg => t.type === 'image_generation');
}
/**
 * We need this just to ensure events are not out of order, as our streaming is progressive
@@ -79,6 +99,7 @@ class ResponseParserStateMachine {
// streaming state tracking
#hasFunctionCalls: boolean = false; // tracks if we've seen function_call output items
#responseSealed: boolean = false; // true once response.completed/failed/incomplete has been processed - trailing 'error' events are advisory only
// hosted tool configuration echo (captured at response.created)
#imageGenToolCfg: TImageGenToolCfg | undefined;
@@ -244,12 +265,19 @@ class ResponseParserStateMachine {
return this.#hasFunctionCalls;
}
markResponseSealed() {
this.#responseSealed = true;
}
get responseSealed() {
return this.#responseSealed;
}
// Hosted tool config capture
captureHostedToolConfigs(tools: TResponse['tools']) {
if (!tools?.length) return;
this.#imageGenToolCfg = tools.find((t): t is TImageGenToolCfg => t.type === 'image_generation');
this.#imageGenToolCfg = _findImageGenToolCfg(tools);
}
get imageGenToolCfg() {
@@ -261,8 +289,13 @@ class ResponseParserStateMachine {
/**
* OpenAI Responses API Streaming Parser
*
 * @param vendor 'openai' or 'xai' - tags the reasoning continuity handle so it round-trips back
* to the SAME provider. The OpenAI Responses wire format is shared with xAI, but the encrypted_content blob
* and the rs_... id are vendor-server-private (different keys, different state). Mixing them produces
* "Item with id rs_... not found" or worse silent corruption.
*/
export function createOpenAIResponsesEventParser(): ChatGenerateParseFunction {
export function createOpenAIResponsesEventParser(vendor: 'openai' | 'xai'): ChatGenerateParseFunction {
const R = new ResponseParserStateMachine();
@@ -353,11 +386,13 @@ export function createOpenAIResponsesEventParser(): ChatGenerateParseFunction {
}
// -> End of the response
R.markResponseSealed();
pt.setDialectEnded('done-dialect'); // OpenAI Responses: 'response.completed'
break;
case 'response.failed':
R.setResponse(eventType, event.response);
R.markResponseSealed();
pt.setTokenStopReason('cg-issue'); // generic issue?
console.warn(`[DEV] AIX: FIXME: OpenAI-Response failed ${eventType}:`, event.response);
// TODO: extract and forward error details
@@ -366,6 +401,7 @@ export function createOpenAIResponsesEventParser(): ChatGenerateParseFunction {
case 'response.incomplete':
// TODO: We haven't seen one of those events yet; we need to see what happens and parse it!
R.setResponse(eventType, event.response);
R.markResponseSealed();
// -> Status: handle incomplete response
if (event.response.incomplete_details?.reason === 'max_output_tokens')
@@ -406,22 +442,28 @@ export function createOpenAIResponsesEventParser(): ChatGenerateParseFunction {
// NOTE: the authoritative encrypted_content arrives on .done (differs from the earlier .added event).
const { id: reasoningId, encrypted_content: reasoningEC } = doneItem;
// [DEV] surface cases that diverge from our continuity round-trip expectations
// Capture ONLY when BOTH encrypted_content AND id are present (the canonical reasoning item shape).
// - id-only: refers to server state we don't keep in stateless mode (store: false, our default) -> 404 next turn
// - EC-only: a "torn" handle that breaks future stateful flows and possible id<->EC integrity checks
// - neither: nothing to round-trip
// [DEV] surface divergences from this contract
if (!reasoningId && !reasoningEC)
console.warn('[DEV] AIX: OpenAI Responses: reasoning item done with neither id nor encrypted_content - no continuity handle captured for this turn', { doneItem });
console.warn(`[DEV] AIX: ${vendor} Responses: reasoning item done with neither id nor encrypted_content - no continuity handle captured for this turn`, { doneItem });
else if (!reasoningEC)
console.log('[DEV] AIX: OpenAI Responses: reasoning item done has id but no encrypted_content - stateless round-trip requires include:[\'reasoning.encrypted_content\'] on the request');
console.log(`[DEV] AIX: ${vendor} Responses: reasoning item done has id but no encrypted_content - dropping handle (stateless round-trip requires include:['reasoning.encrypted_content'] on the request)`);
else if (!reasoningId)
console.log(`[DEV] AIX: ${vendor} Responses: reasoning item done has encrypted_content but no id - dropping handle (incomplete reasoning item from upstream)`);
if (reasoningEC || reasoningId) {
if (reasoningEC && reasoningId) {
// Defensive: ensure an ma fragment exists as the attach target for the svs particle below.
pt.appendReasoningText('');
pt.sendSetVendorState({
p: 'svs',
vendor: 'openai',
vendor: vendor,
state: {
reasoningItem: {
...(reasoningId ? { id: reasoningId } : {}),
...(reasoningEC ? { encryptedContent: reasoningEC } : {}),
id: reasoningId,
encryptedContent: reasoningEC,
},
},
});
@@ -448,6 +490,7 @@ export function createOpenAIResponsesEventParser(): ChatGenerateParseFunction {
break;
case 'image_generation_call':
_hostedToolWishlistHint('image_generation');
// -> IGC: process completed image generation using 'ii' particle for inline images
const { id: igId, result: igResult, revised_prompt: igRevisedPrompt } = doneItem;
const igDoneText = !igRevisedPrompt?.length ? 'Image generated'
@@ -698,6 +741,14 @@ export function createOpenAIResponsesEventParser(): ChatGenerateParseFunction {
const errorMessage = safeErrorString(event.error?.message || event?.message) ?? undefined;
const errorParam = safeErrorString(event.error?.param || event?.param) ?? undefined;
// Trailing-error guard: if the response already reached a terminal state (completed/failed/incomplete),
// an 'error' event arriving after is an upstream advisory (e.g. rate-limit headroom) and must NOT
// override the prior termination - otherwise it flips the message to red and the Beam ray to 'error'.
if (R.responseSealed) {
console.warn(`[DEV] AIX: OpenAI Responses: trailing 'error' after sealed response - ignored: ${errorCode || 'Error'}: ${errorMessage || 'unknown.'}${errorParam ? ` (param: ${errorParam})` : ''}`);
break;
}
// Transmit the error as text - note: throw if you want to transmit as 'error'
// FIXME: potential point for throwing OperationRetrySignal (using 'srv-warn' for now)
pt.setDialectTerminatingIssue(`${errorCode || 'Error'}: ${errorMessage || 'unknown.'}${errorParam ? ` (param: ${errorParam})` : ''}`, IssueSymbols.Generic, 'srv-warn');
@@ -740,8 +791,11 @@ export function createOpenAIResponsesEventParser(): ChatGenerateParseFunction {
/**
* OpenAI Responses API Non-Streaming Parser
*
 * @param vendor 'openai' or 'xai' - see createOpenAIResponsesEventParser for the rationale on
* why xAI gets its own _vnd namespace (different encryption keys + private item ids).
*/
export function createOpenAIResponseParserNS(): ChatGenerateParseFunction {
export function createOpenAIResponseParserNS(vendor: 'openai' | 'xai'): ChatGenerateParseFunction {
const parserCreationTimestamp = Date.now();
@@ -765,6 +819,9 @@ export function createOpenAIResponseParserNS(): ChatGenerateParseFunction {
if (response.model)
pt.setModelName(response.model);
// -> Hosted tool config capture (needed for enriching done-item particles with tool params the API does not echo per-item, e.g. image_generation.model)
const imageGenToolCfg = _findImageGenToolCfg(response.tools);
// -> Upstream Handle (for remote control: resume, cancel, delete)
// NOTE: we don't do it for full responses, because they're supposed to be 'complete' - i.e. no 'background' execution
@@ -875,25 +932,29 @@ export function createOpenAIResponseParserNS(): ChatGenerateParseFunction {
pt.appendReasoningText(item.text);
}
// Capture the continuity handle (encrypted_content + id) for stateless multi-turn round-tripping.
// Attached to the ma fragment produced by the summary above; if no summary was emitted, this may
// attach to an unrelated preceding fragment - tolerable as the worst case is a misfiled blob.
// FIXME: make sure we are attaching to an 'ma' (i.e. reasoning text or something was emitted)
if (reasoningEC || reasoningId)
// [DEV] surface cases that diverge from our continuity round-trip expectations (see streaming path for rationale)
if (!reasoningId && !reasoningEC)
console.warn(`[DEV] AIX: ${vendor}-Response-NS: reasoning item has neither id nor encrypted_content - no continuity handle captured for this turn`, { oItem });
else if (!reasoningEC)
console.log(`[DEV] AIX: ${vendor}-Response-NS: reasoning item has id but no encrypted_content - dropping handle (stateless round-trip requires include:['reasoning.encrypted_content'] on the request)`);
else if (!reasoningId)
console.log(`[DEV] AIX: ${vendor}-Response-NS: reasoning item has encrypted_content but no id - dropping handle (incomplete reasoning item from upstream)`);
// Capture ONLY when both id and encryptedContent are present (canonical, complete handle).
if (reasoningEC && reasoningId) {
// Defensive: ensure an ma fragment exists as the attach target for the svs particle below (parity with the streaming path).
pt.appendReasoningText('');
pt.sendSetVendorState({
p: 'svs',
vendor: 'openai',
vendor: vendor,
state: {
reasoningItem: {
...(reasoningId ? { id: reasoningId } : {}),
...(reasoningEC ? { encryptedContent: reasoningEC } : {}),
id: reasoningId,
encryptedContent: reasoningEC,
},
},
});
else if (!reasoningId && !reasoningEC)
console.warn('[DEV] AIX: OpenAI-Response-NS: reasoning item has neither id nor encrypted_content - no continuity handle captured for this turn', { oItem });
else if (!reasoningEC)
console.log('[DEV] AIX: OpenAI-Response-NS: reasoning item has id but no encrypted_content - stateless round-trip requires include:[\'reasoning.encrypted_content\'] on the request');
}
break;
// Message contains the main 'assistant' response
@@ -957,6 +1018,7 @@ export function createOpenAIResponseParserNS(): ChatGenerateParseFunction {
break;
case 'image_generation_call':
_hostedToolWishlistHint('image_generation');
// -> IGC: process completed image generation using 'ii' particle for inline images
const { result: igResult, revised_prompt: igRevisedPrompt } = oItem;
// Create inline image with base64 data
@@ -965,7 +1027,7 @@ export function createOpenAIResponseParserNS(): ChatGenerateParseFunction {
_imageGenerationMimeType(oItem), // infer from output_format echoed in the item
igResult,
igRevisedPrompt || 'Generated image',
AIX_OAI_DEFAULT_IMAGE_GEN_MODEL, // generator: non-streaming path has no captured tool config, use current default
imageGenToolCfg?.model || AIX_OAI_DEFAULT_IMAGE_GEN_MODEL, // generator: read from echoed tools (API does not echo model per-item), fallback to current default
igRevisedPrompt || '', // prompt used
);
else
@@ -1150,6 +1212,7 @@ function _imageGenerationMimeType(item: { output_format?: string }): string {
* - citations: High-quality links (2-3) via annotations in message content
*/
function _forwardDoneWebSearchCallItem(pt: IParticleTransmitter, webSearchCall: Extract<OpenAIWire_API_Responses.Response['output'][number], { type: 'web_search_call' }>, opId: string): void {
_hostedToolWishlistHint('web_search');
const { action, status } = webSearchCall;
const doneOpts = { opId, state: 'done' } as const;
@@ -1203,6 +1266,7 @@ function _forwardDoneWebSearchCallItem(pt: IParticleTransmitter, webSearchCall:
* - addCodeExecutionResponse for each output result
*/
function _forwardDoneCodeInterpreterCallItem(pt: IParticleTransmitter, codeInterpreterCall: Extract<OpenAIWire_API_Responses.Response['output'][number], { type: 'code_interpreter_call' }>): void {
_hostedToolWishlistHint('code_interpreter');
const { id, code, outputs, status /*,container_id*/ } = codeInterpreterCall;
// <- Emit code (like Gemini's executableCode)
@@ -21,7 +21,7 @@ export namespace AixDemuxers {
 * - 'fast-sse' is our own parser, optimized for performance; prefer it over 'sse' when possible (check for full compatibility with the upstream)
* - 'json-nl' is used by Ollama
*/
export type StreamDemuxerFormat = 'fast-sse' | 'json-nl' | null;
export type StreamDemuxerFormat = 'fast-sse' | 'json-nl';
/**
@@ -34,8 +34,8 @@ export namespace AixDemuxers {
return createFastEventSourceDemuxer();
case 'json-nl':
return _createJsonNlDemuxer();
case null:
return _nullStreamDemuxerWarn;
default:
throw new Error(`Unsupported stream demuxer format: ${format}`);
}
}
@@ -115,12 +115,3 @@ function _createJsonNlDemuxer(): AixDemuxers.StreamDemuxer {
},
};
}
const _nullStreamDemuxerWarn: AixDemuxers.StreamDemuxer = {
demux: () => {
console.warn('Null demuxer called - shall not happen, as it is only created in non-streaming');
return [];
},
flushRemaining: () => [],
};
@@ -1,7 +1,7 @@
<!--
Upstream snapshot - DO NOT EDIT - run _upstream/sync.sh to refresh
Source: https://platform.claude.com/docs/en/api/messages/create.md
Synced: 2026-04-23
Synced: 2026-04-24
Consumed by: anthropic.wiretypes.ts, anthropic.parser.ts, anthropic.messageCreate.ts, anthropic.transform-fileInline.ts
-->
@@ -2429,7 +2429,7 @@ Learn more about the Messages API in our [user guide](https://docs.claude.com/en
Configuration options for the model's output, such as the output format.
- `effort: optional "low" or "medium" or "high" or 2 more`
- `effort: optional "low" or "medium" or "high" or "max"`
All possible effort levels.
@@ -2439,8 +2439,6 @@ Learn more about the Messages API in our [user guide](https://docs.claude.com/en
- `"high"`
- `"xhigh"`
- `"max"`
- `format: optional JSONOutputFormat`
@@ -3822,15 +3820,15 @@ Learn more about the Messages API in our [user guide](https://docs.claude.com/en
Used to remove "long tail" low probability responses. [Learn more technical details here](https://towardsdatascience.com/how-to-sample-from-language-models-682bceb97277).
Recommended for advanced use cases only. You usually only need to use `temperature`.
Recommended for advanced use cases only.
- `top_p: optional number`
Use nucleus sampling.
In nucleus sampling, we compute the cumulative distribution over all the options for each subsequent token in decreasing probability order and cut it off once it reaches a particular probability specified by `top_p`. You should either alter `temperature` or `top_p`, but not both.
In nucleus sampling, we compute the cumulative distribution over all the options for each subsequent token in decreasing probability order and cut it off once it reaches a particular probability specified by `top_p`.
Recommended for advanced use cases only. You usually only need to use `temperature`.
Recommended for advanced use cases only.
### Returns
@@ -1,7 +1,7 @@
<!--
Upstream snapshot - DO NOT EDIT - run _upstream/sync.sh to refresh
Source: https://ai.google.dev/gemini-api/docs/deep-research.md.txt
Synced: 2026-04-23
Synced: 2026-04-24
Consumed by: gemini.interactions.wiretypes.ts, gemini.interactions.parser.ts, gemini.interactionsCreate.ts, gemini.interactionsPoller.ts
Companion: ./gemini.interactions.guide.md (the Interactions API guide)
-->
@@ -1,7 +1,7 @@
<!--
Upstream snapshot - DO NOT EDIT - run _upstream/sync.sh to refresh
Source: https://ai.google.dev/gemini-api/docs/interactions.md.txt
Synced: 2026-04-23
Synced: 2026-04-24
Consumed by: gemini.interactions.wiretypes.ts, gemini.interactions.parser.ts, gemini.interactionsCreate.ts, gemini.interactionsPoller.ts
Companion: ./gemini.interactions.spec.md (the Interactions API reference spec), ./gemini.deep-research.guide.md (the Deep Research agent guide)
-->
@@ -1,7 +1,7 @@
<!--
Upstream snapshot - DO NOT EDIT - run _upstream/sync.sh to refresh
Source: https://ai.google.dev/api/interactions-api.md.txt
Synced: 2026-04-23
Synced: 2026-04-24
Consumed by: gemini.interactions.wiretypes.ts, gemini.interactions.parser.ts, gemini.interactionsCreate.ts, gemini.interactionsPoller.ts
Companion: ./gemini.interactions.guide.md (the Interactions API guide)
-->
@@ -1,7 +1,7 @@
<!--
Upstream snapshot - DO NOT EDIT - run _upstream/sync.sh to refresh
Source: https://developers.openai.com/api/reference/resources/responses/methods/create/index.md
Synced: 2026-04-23
Synced: 2026-04-24
Consumed by: openai.wiretypes.ts, openai.responses.parser.ts, openai.responsesCreate.ts
-->
@@ -13,6 +13,10 @@ const hotFixAntShipNoEmptyTextBlocks = true; // Replace empty text blocks with a
*
* ## Updates
*
* ### 2026-04-24 - API Sync: stop_details for structured refusals
* - Response: added `stop_details` ({ type: 'refusal', category: 'cyber'|'bio'|null, explanation: string|null })
* - event_MessageDelta.delta: added `stop_details` (arrives alongside stop_reason in streaming)
*
* ### 2026-03-21 - API Sync: GA tool versions, thinking display, caller updates, cache_control
* - Tools: Added web_search_20260209 (GA), web_fetch_20260209/20260309 (GA), code_execution_20260120 (GA REPL)
* - Request: Added top-level `cache_control` for automatic caching (Feb 2026)
@@ -825,6 +829,16 @@ export namespace AnthropicWire_API_Message_Create {
'model_context_window_exceeded',
]);
/**
* Structured stop details, paired with stop_reason. Currently only populated when stop_reason === 'refusal'.
* Both `type` and `category` are loosely typed for forward-compat - parser warns on unknown `type`.
*/
const StopDetails_schema = z.object({
type: z.enum(['refusal']).or(z.string()),
category: z.enum(['cyber', 'bio']).or(z.string()).nullish(),
explanation: z.string().nullish(),
});
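// Illustrative sketch - how the pair appears on the wire for a refusal (values hypothetical):
//   { "stop_reason": "refusal", "stop_details": { "type": "refusal", "category": "bio", "explanation": "..." } }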
/// Request
export type Request = z.infer<typeof Request_schema>;
@@ -1030,6 +1044,12 @@ export namespace AnthropicWire_API_Message_Create {
// Which custom stop sequence was generated, if any.
stop_sequence: z.string().nullable(),
/**
* Structured stop details. Present when stop_reason === 'refusal' (carries category + explanation).
* In streaming, stop_details is null at message_start and appears on message_delta alongside stop_reason.
*/
stop_details: StopDetails_schema.nullish(),
/**
* Billing and rate-limit usage.
* Token counts represent the underlying cost to Anthropic's systems.
@@ -1088,6 +1108,10 @@ export namespace AnthropicWire_API_Message_Create {
delta: z.object({
stop_reason: StopReason_schema.nullable(),
stop_sequence: z.string().nullable(),
/**
* Structured stop details - present alongside stop_reason === 'refusal' (category + explanation).
*/
stop_details: StopDetails_schema.nullish(),
/**
* Container state updates - present when Skills/code_execution tools are used.
* Provides container id/expiry that may differ from message_start if the container was created mid-stream.
@@ -23,8 +23,12 @@ export namespace GeminiInteractionsWire_API_Interactions {
export const getPath = (id: string) => `/v1beta/interactions/${encodeURIComponent(id)}`;
// DELETE. Removes the stored record. Orthogonal to cancel; when removed, the original connection may still be running and streaming
export const deletePath = (id: string) => `/v1beta/interactions/${encodeURIComponent(id)}`;
// POST. Only cancels background interactions that are still running
export const cancelPath = (id: string) => `/v1beta/interactions/${encodeURIComponent(id)}/cancel`;
// -- Request Body (POST /v1beta/interactions) --
@@ -163,7 +167,7 @@ export namespace GeminiInteractionsWire_API_Interactions {
// the parser prefers inline and falls back to a URI note when only `uri` is present.
data: z.string().optional(), // base64-encoded bytes
uri: z.string().optional(),
mime_type: z.string(),
mime_type: z.string().optional(), // spec: optional - parser still requires it before emitting inline
resolution: z.string().optional(), // 'low' | 'medium' | 'high' | 'ultra_high'
});
@@ -172,7 +176,7 @@ export namespace GeminiInteractionsWire_API_Interactions {
// Per docs: data or uri, mime_type covers both PCM (audio/l16) and packaged formats (audio/wav, audio/mp3, ...).
data: z.string().optional(),
uri: z.string().optional(),
mime_type: z.string(),
mime_type: z.string().optional(), // spec: optional - parser still requires it before emitting inline
rate: z.number().optional(), // sample rate, when known
channels: z.number().optional(),
});
@@ -189,6 +189,13 @@ export namespace OpenAIWire_Messages {
/** [OpenRouter, 2025-01-20] Reasoning traces with multiple blocks (summary, text, encrypted). */
reasoning_details: z.array(OpenAIWire_ContentParts.OpenRouter_ReasoningDetail_schema).optional(),
/**
* [DeepSeek, 2026-04-24] Chain-of-thought reasoning text.
* - Response: emitted by V4 thinking-by-default; parsed into a 'ma' reasoning part.
* - (this) Request: MUST be echoed back on assistant turns that carry tool_calls (otherwise HTTP 400: "The reasoning_content in the thinking mode must be passed back to the API.").
*/
reasoning_content: z.string().nullable().optional(),
// function_call: // ignored, as it's deprecated
// name: _optionalParticipantName, // omitted by choice: generally unsupported
});
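// Illustrative sketch - the echo-back requirement above in practice: an assistant turn that carried
// tool_calls must resend its reasoning_content on the next request (all values hypothetical):
// {
//   role: 'assistant',
//   content: null,
//   reasoning_content: 'User asked for the weather; calling get_weather first.',
//   tool_calls: [{ id: 'call_1', type: 'function', function: { name: 'get_weather', arguments: '{"city":"Turin"}' } }],
// }
// Omitting reasoning_content on such a turn is what triggers the HTTP 400 quoted above.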
@@ -331,7 +338,7 @@ export namespace OpenAIWire_API_Chat_Completions {
stream_options: z.object({
include_usage: z.boolean().optional(), // If set, an additional chunk will be streamed with a 'usage' field on the entire request.
}).optional(),
reasoning_effort: z.enum(['none', 'minimal', 'low', 'medium', 'high', 'xhigh']).optional(), // [OpenAI, 2024-12-17] [Perplexity, 2025-06-23] reasoning effort
reasoning_effort: z.enum(['none', 'minimal', 'low', 'medium', 'high', 'xhigh', 'max']).optional(), // [OpenAI, 2024-12-17] [Perplexity, 2025-06-23] reasoning effort; [DeepSeek, 2026-04-23] 'max' added for V4
// OpenAI and [OpenRouter, 2025-01-20] Verbosity parameter - maps to output_config.effort for Anthropic models
// https://openrouter.ai/docs/api/reference/parameters#verbosity
verbosity: z.enum([
@@ -342,7 +349,7 @@ export namespace OpenAIWire_API_Chat_Completions {
// [OpenRouter, 2025-11-11] Unified reasoning parameter for all models
reasoning: z.object({
max_tokens: z.int().optional(), // Token-based control (Anthropic, Gemini): 1024-32000
effort: z.enum(['none', 'minimal', 'low', 'medium', 'high', 'xhigh']).optional(), // Effort-based control (OpenAI o1/o3/GPT-5, xAI, DeepSeek): allocates % of max_tokens
effort: z.enum(['none', 'minimal', 'low', 'medium', 'high', 'xhigh', 'max']).optional(), // Effort-based control (OpenAI o1/o3/GPT-5, xAI, DeepSeek): allocates % of max_tokens
enabled: z.boolean().optional(), // Simple enable with medium effort defaults
exclude: z.boolean().optional(), // Use reasoning internally without returning it in response
}).optional(),
@@ -447,6 +454,8 @@ export namespace OpenAIWire_API_Chat_Completions {
search_after_date_filter: z.string().optional(), // Date filter in MM/DD/YYYY format
// [Moonshot, 2026-01-26] Kimi K2.5 thinking mode control
// [Z.ai, 2025-xx] GLM thinking mode: type 'enabled' | 'disabled'
// [DeepSeek, 2026-04-23] V4 thinking mode: same binary shape; depth is controlled via top-level `reasoning_effort`
thinking: z.object({
type: z.enum(['enabled', 'disabled']),
}).optional(),
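// Illustrative sketch - DeepSeek V4 thinking at maximum depth combines both knobs (request fragment,
// model id hypothetical):
//   { "model": "deepseek-v4", "thinking": { "type": "enabled" }, "reasoning_effort": "max" }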
@@ -1174,9 +1183,11 @@ export namespace OpenAIWire_Responses_Items {
// [OpenAI 2026-03-xx] DEPRECATED: query might not always be present in done event
query: z.string().optional(),
// the output websites, if any [{"type":"url","url":"https://www.enricoros.com/"}, {"type":"url","url": "https://linkedin.com/in/enricoros/"}, ...]
// [OpenAI 2026-04-23, GPT-5.5] new source types: { type: 'api', name: 'oai-calculator' } for hosted-tool invocations (no url)
sources: z.array(z.object({
type: z.literal('url').optional(), // source type
url: z.string(),
type: z.enum(['url', 'api']).or(z.string()).optional(), // 'url' (default) | 'api' (GPT-5.5 hosted tools) | future types
url: z.string().nullish(), // optional: 'api' sources have no url, only name
name: z.string().nullish(), // for 'api' sources (e.g., 'oai-calculator')
// [OpenAI 2026-03-xx] not present anymore
// title: z.string().optional(),
// snippet: z.string().optional(),
@@ -1437,6 +1448,7 @@ export namespace OpenAIWire_Responses_Tools {
const WebSearchTool_schema = z.object({
type: z.enum(['web_search', 'web_search_preview', 'web_search_preview_2025_03_11']),
search_context_size: z.enum(['low', 'medium', 'high']).optional(),
// [OpenAI 2026-04-23, GPT-5.5] API echoes user_location as `null` (not undefined) when unset - so .nullish()
user_location: z.object({
type: z.literal('approximate'),
// API echoes these as `null` when unset, not omitted - so .nullish()
@@ -1444,7 +1456,7 @@ export namespace OpenAIWire_Responses_Tools {
country: z.string().nullish(),
region: z.string().nullish(),
timezone: z.string().nullish(),
}).optional(),
}).nullish(),
external_web_access: z.boolean().optional(),
});
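// Illustrative sketch - why `.nullish()` on user_location: a hypothetical echoed config when unset:
//   { "type": "web_search", "search_context_size": "medium", "user_location": null }
// `.optional()` only tolerates a missing key (undefined); the explicit `null` above would fail parsing.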
@@ -1641,7 +1653,7 @@ export namespace OpenAIWire_API_Responses {
// NOTE: .catch() gracefully degrades to undefined since this is a non-critical enrichment path
tools: z.array(OpenAIWire_Responses_Tools.Tool_schema).optional().catch((ctx) => {
console.warn('[DEV] AIX: OpenAI Responses: unable to parse echoed tools, ignoring:', { tools: ctx.value });
return;
return undefined;
}),
output: z.array(OpenAIWire_Responses_Items.OutputItem_schema),
@@ -118,9 +118,9 @@ export namespace XAIWire_API_Responses {
// configure reasoning
// [2026-01-22] OBSOLETE - only grok-3-mini (!)
reasoning: z.object({
effort: z.enum([/*'none', 'minimal',*/ 'low', 'medium', 'high' /*, 'xhigh'*/]).nullish(), // XAI: 3 levels only
effort: z.enum([/*'none', 'minimal',*/ 'low', 'medium', 'high' /*, 'xhigh'*/]).nullish(), // only grok-4.20-multi-agent; grok-4.3 and grok-4-1-fast error if set
summary: z.enum(['auto', 'concise', 'detailed']).nullish(), // request reasoning summaries
// [XAI-UNSUPPORTED] // generate_summary: z.string().nullish(),
// [XAI-UNSUPPORTED] // summary: z.enum(['auto', 'concise', 'detailed']).nullish(), // XAI: The model shall always return 'detailed'
}).nullish(),
// configure search
@@ -1,7 +1,9 @@
import * as React from 'react';
import { useShallow } from 'zustand/react/shallow';
import { Alert, Box, CircularProgress } from '@mui/joy';
import { Alert, Box, Button, CircularProgress } from '@mui/joy';
import ContentCopyIcon from '@mui/icons-material/ContentCopy';
import TelegramIcon from '@mui/icons-material/Telegram';
import { ConfirmationModal } from '~/common/components/modals/ConfirmationModal';
import { ShortcutKey, useGlobalShortcuts } from '~/common/components/shortcuts/useGlobalShortcuts';
@@ -204,13 +206,30 @@ export function BeamView(props: {
isMobile={props.isMobile}
rayIds={rayIds}
showRayAdd={cardAdd}
showRaysOps={(isScattering || raysReady < 2) ? undefined : raysReady}
hadImportedRays={hadImportedRays}
onIncreaseRayCount={handleRayIncreaseCount}
onRaysOperation={handleRaysOperation}
// linkedLlmId={currentGatherLlmId}
/>
{/* Rays Action Bar (2+ ready beams) - sibling of the grid (NOT a grid child); an in-grid spanning element with gridColumn:'1/-1' pins all auto-fit tracks open and leaves dead whitespace when raysCount < tracksCount. Fixes #1073. */}
{(!isScattering && raysReady >= 2) && (
<Box sx={{ display: 'flex', justifyContent: 'center', gap: 2, mx: 'var(--Pad)' }}>
<Button size='sm' variant='outlined' color='neutral' onClick={() => handleRaysOperation('copy')} endDecorator={<ContentCopyIcon sx={{ fontSize: 'md' }} />} sx={{
backgroundColor: 'background.surface',
'&:hover': { backgroundColor: 'background.popup' },
}}>
Copy {raysReady}
</Button>
<Button size='sm' variant='outlined' color='success' onClick={() => handleRaysOperation('use')} endDecorator={<TelegramIcon sx={{ fontSize: 'xl' }} />} sx={{
justifyContent: 'space-between',
backgroundColor: 'background.surface',
'&:hover': { backgroundColor: 'background.popup' },
}}>
Use {raysReady === 2 ? 'both' : 'all ' + raysReady} messages
</Button>
</Box>
)}
{/* Gapper between Rays and Merge, without compromising the auto margin of the Ray Grid */}
<Box />
@@ -246,9 +265,9 @@ export function BeamView(props: {
onPositive={handleStartMergeConfirmation}
// lowStakes
noTitleBar
confirmationText='Some responses are still being generated. Do you want to stop and proceed with merging the available responses now?'
positiveActionText='Proceed with Merge'
negativeActionText='Wait for All Responses'
confirmationText={'Some replies are still generating. Merge what\'s ready?'}
positiveActionText='Merge now'
negativeActionText='Wait for all'
negativeActionStartDecorator={
<CircularProgress color='neutral' sx={{ '--CircularProgress-size': '24px', '--CircularProgress-trackThickness': '1px' }} />
}
@@ -149,7 +149,8 @@ export function BeamFusionGrid(props: {
</Box> : (
<Typography level='body-sm' sx={{ opacity: 0.8 }}>
{/*You need two or more replies for a {currentFactory?.shortLabel?.toLocaleLowerCase() ?? ''} merge.*/}
Waiting for multiple responses.
{/*Waiting for multiple responses.*/}
Merge needs 2+ replies. Beam some first.
</Typography>
)}
</BeamCard>
@@ -49,7 +49,7 @@ export async function executeGatherInstruction(_i: GatherInstruction, inputs: Ex
if (!inputs.chatMessages.length)
throw new Error('No conversation history available');
if (!inputs.rayMessages.length)
throw new Error('No responses available');
throw new Error('Needs at least two Beams');
for (let rayMessage of inputs.rayMessages)
if (rayMessage.role !== 'assistant')
throw new Error('Invalid response role');
@@ -58,7 +58,7 @@ export function gatherStartFusion(
if (chatMessages.length < 1)
return onError('No conversation history available');
if (rayMessages.length <= 1)
return onError('No responses available');
return onError('Needs at least two Beams');
if (!initialFusion.llmId)
return onError('No Merge model selected');
@@ -122,7 +122,7 @@ The final output should reflect a deep understanding of the user's preferences a
addLabel: 'Add Breakdown',
cardTitle: 'Evaluation Table',
Icon: TableViewRoundedIcon as typeof SvgIcon,
description: 'Analyzes and compares AI responses, offering a structured framework to support your response choice.',
description: 'Analyzes and compares replies, with a structured framework to support your choice.',
createInstructions: () => [
{
type: 'gather',
@@ -3,8 +3,6 @@ import * as React from 'react';
import type { SxProps } from '@mui/joy/styles/types';
import { Box, Button } from '@mui/joy';
import AddCircleOutlineRoundedIcon from '@mui/icons-material/AddCircleOutlineRounded';
import ContentCopyIcon from '@mui/icons-material/ContentCopy';
import TelegramIcon from '@mui/icons-material/Telegram';
import type { BeamStoreApi } from '../store-beam.hooks';
import { BeamCard } from '../BeamCard';
@@ -32,10 +30,8 @@ export function BeamRayGrid(props: {
hadImportedRays: boolean,
isMobile: boolean,
onIncreaseRayCount: () => void,
onRaysOperation: (operation: 'copy' | 'use') => void,
rayIds: string[],
showRayAdd: boolean,
showRaysOps: undefined | number,
}) {
const raysCount = props.rayIds.length;
@@ -71,25 +67,6 @@ export function BeamRayGrid(props: {
</BeamCard>
)}
{/* Multi-Use and Copy Buttons */}
{!!props.showRaysOps && (
<Box sx={{ gridColumn: '1 / -1', display: 'flex', justifyContent: 'center', gap: 2, mt: 2 }}>
<Button size='sm' variant='outlined' color='neutral' onClick={() => props.onRaysOperation('copy')} endDecorator={<ContentCopyIcon sx={{ fontSize: 'md' }} />} sx={{
backgroundColor: 'background.surface',
'&:hover': { backgroundColor: 'background.popup' },
}}>
Copy {props.showRaysOps}
</Button>
<Button size='sm' variant='outlined' color='success' onClick={() => props.onRaysOperation('use')} endDecorator={<TelegramIcon sx={{ fontSize: 'xl' }} />} sx={{
justifyContent: 'space-between',
backgroundColor: 'background.surface',
'&:hover': { backgroundColor: 'background.popup' },
}}>
Use {props.showRaysOps == 2 ? 'both' : 'all ' + props.showRaysOps} messages
</Button>
</Box>
)}
{/*/!* Takes a full row *!/*/}
{/*<Divider sx={{*/}
{/* gridColumn: '1 / -1',*/}
@@ -76,6 +76,12 @@ const createRootSlice: StateCreator<BeamStore, [], [], RootStoreSlice> = (_set,
open: (chatHistory: Readonly<DMessage[]>, initialChatLlmId: DLLMId | null, isEditMode: boolean, callback: BeamSuccessCallback) => {
const { isOpen: wasAlreadyOpen, terminateKeepingSettings, loadBeamConfig, hadImportedRays, setRayLlmIds, setCurrentGatherLlmId } = _get();
// if already open, preserve the live state (rays, fusions, callback) - re-invocation must never wipe an ongoing beam
if (wasAlreadyOpen) {
console.warn('[DEV] Beam is already open');
return;
}
// reset pending operations
terminateKeepingSettings();
@@ -107,6 +107,7 @@ function _createDLLMFromModelDescription(d: ModelDescriptionSchema, service: DMo
label: d.label,
created: d.created || 0,
updated: d.updated || 0,
...(d.pubDate && { pubDate: d.pubDate }),
description: d.description,
hidden: !!d.hidden,
@@ -15,7 +15,7 @@ import WarningRoundedIcon from '@mui/icons-material/WarningRounded';
import { type DPricingChatGenerate, isLLMChatFree_cached, llmChatPricing_adjusted } from '~/common/stores/llms/llms.pricing';
import type { ModelOptionsContext } from '~/common/layout/optima/store-layout-optima';
import { DLLMId, DModelInterfaceV1, getLLMContextTokens, getLLMLabel, getLLMMaxOutputTokens, isLLMVisible, LLM_IF_HOTFIX_NoStream, LLM_IF_HOTFIX_NoTemperature, LLM_IF_OAI_Reasoning } from '~/common/stores/llms/llms.types';
import { DLLMId, DModelInterfaceV1, getLLMContextTokens, getLLMLabel, getLLMMaxOutputTokens, getLLMPubDate, isLLMVisible, LLM_IF_HOTFIX_NoStream, LLM_IF_HOTFIX_NoTemperature, LLM_IF_OAI_Reasoning } from '~/common/stores/llms/llms.types';
import { FormLabelStart } from '~/common/components/forms/FormLabelStart';
import { GoodModal } from '~/common/components/modals/GoodModal';
import { LLMImplicitParametersRuntimeFallback } from '~/common/stores/llms/llms.parameters';
@@ -280,6 +280,7 @@ export function LLMOptionsModal(props: { id: DLLMId, context?: ModelOptionsConte
// cache
const adjChatPricing = llmChatPricing_adjusted(llm);
const pubDate = getLLMPubDate(llm);
return (
@@ -502,7 +503,8 @@ export function LLMOptionsModal(props: { id: DLLMId, context?: ModelOptionsConte
id: {llm.id}<br />
context: <b>{getLLMContextTokens(llm)?.toLocaleString() ?? 'not provided'}</b> tokens{` · `}
max output: <b>{getLLMMaxOutputTokens(llm)?.toLocaleString() ?? 'not provided'}</b><br />
{!!llm.created && <>created: <TimeAgo date={new Date(llm.created * 1000)} /><br /></>}
{!!pubDate && <>published: <b>{pubDate.toLocaleDateString(undefined, { year: 'numeric', month: 'short', day: 'numeric' })}</b> · <TimeAgo date={pubDate} /><br /></>}
{!!llm.created && <>indexed: <TimeAgo date={new Date(llm.created * 1000)} /><br /></>}
{/*· tags: {llm.tags.join(', ')}*/}
{!!adjChatPricing && prettyPricingComponent(adjChatPricing)}
{/*{!!llm.benchmark && <>benchmark: <b>{llm.benchmark.cbaElo?.toLocaleString() || '(unk) '}</b> CBA Elo<br /></>}*/}
@@ -51,6 +51,7 @@ const _oaiEffortOptions = [
] as const;
const _miscEffortOptions = [
{ value: 'max', label: 'Max', description: 'Hardest thinking' } as const,
{ value: 'high', label: 'On', description: 'Multi-step reasoning' } as const,
{ value: 'none', label: 'Off', description: 'Disable thinking mode' } as const,
{ value: _UNSPECIFIED, label: 'Default', description: 'Model Default' } as const,
@@ -122,6 +123,11 @@ const _geminiGoogleSearchOptions = [
{ value: _UNSPECIFIED, label: 'Off', description: 'Default (disabled)' },
] as const;
const _geminiAgentVizOptions = [
{ value: _UNSPECIFIED, label: 'Auto', description: 'Default - agent may include charts/images' },
{ value: 'off', label: 'Off', description: 'Text only (better when merging multiple reports)' },
] as const;
const _geminiMediaResolutionOptions = [
{ value: 'mr_high', label: 'High', description: 'Best quality' },
{ value: 'mr_medium', label: 'Medium', description: 'Balanced' },
@@ -244,6 +250,7 @@ export function LLMParametersEditor(props: {
llmVndAntWebSearch,
llmVndAntWebSearchMaxUses,
llmVndGemEffort,
llmVndGeminiAgentViz,
llmVndGeminiAspectRatio,
llmVndGeminiCodeExecution,
llmVndGeminiGoogleSearch,
@@ -686,6 +693,19 @@ export function LLMParametersEditor(props: {
/>
)}
{showParam('llmVndGeminiAgentViz') && (
<FormSelectControl
title='Visualizations'
tooltip='Charts and images in Deep Research reports. Disable for text-only output (helpful when merging multiple reports).'
value={llmVndGeminiAgentViz ?? _UNSPECIFIED}
onChange={(value) => {
if (value === _UNSPECIFIED || !value) onRemoveParameter('llmVndGeminiAgentViz');
else onChangeParameter({ llmVndGeminiAgentViz: value });
}}
options={_geminiAgentVizOptions}
/>
)}
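{/* editorial note: selecting 'Auto' removes the stored override via onRemoveParameter, so the model default applies instead of persisting a value */}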
{/*{showParam('llmVndMoonshotWebSearch') && (*/}
{/* <FormSelectControl*/}
@@ -9,11 +9,12 @@ import VisibilityOutlinedIcon from '@mui/icons-material/VisibilityOutlined';
import type { DModelsServiceId } from '~/common/stores/llms/llms.service.types';
import { isLLMChatFree_cached } from '~/common/stores/llms/llms.pricing';
import { DLLM, DLLMId, getLLMContextTokens, getLLMLabel, getLLMMaxOutputTokens, isLLMCustomUserParameters, isLLMHidden, LLM_IF_ANT_PromptCaching, LLM_IF_GEM_CodeExecution, LLM_IF_OAI_Fn, LLM_IF_OAI_Json, LLM_IF_OAI_PromptCaching, LLM_IF_OAI_Reasoning, LLM_IF_OAI_Vision, LLM_IF_Outputs_Audio, LLM_IF_Outputs_Image, LLM_IF_Tools_WebSearch } from '~/common/stores/llms/llms.types';
import { DLLM, DLLMId, getLLMContextTokens, getLLMLabel, getLLMMaxOutputTokens, getLLMPubDate, isLLMCustomUserParameters, isLLMHidden, LLM_IF_ANT_PromptCaching, LLM_IF_GEM_CodeExecution, LLM_IF_OAI_Fn, LLM_IF_OAI_Json, LLM_IF_OAI_PromptCaching, LLM_IF_OAI_Reasoning, LLM_IF_OAI_Vision, LLM_IF_Outputs_Audio, LLM_IF_Outputs_Image, LLM_IF_Tools_WebSearch } from '~/common/stores/llms/llms.types';
import { GoodTooltip } from '~/common/components/GoodTooltip';
import { PhGearSixIcon } from '~/common/components/icons/phosphor/PhGearSixIcon';
import { STAR_EMOJI, StarredToggle, starredToggleStyle } from '~/common/components/StarIcons';
import { findModelsServiceOrNull, llmsStoreActions } from '~/common/stores/llms/store-llms';
import { sortLLMsByServiceLabel } from '~/common/stores/llms/components/llms.dropdown.utils';
import { useLLMsByService } from '~/common/stores/llms/llms.hooks';
import { useIsMobile } from '~/common/components/useMatchMedia';
import { useModelDomains } from '~/common/stores/llms/hooks/useModelDomains';
@@ -98,6 +99,10 @@ export const ModelItem = React.memo(function ModelItem(props: {
const isNotSymlink = !llm.label.startsWith('🔗'); // getLLMLabel exception: need access to the base
const llmLabel = getLLMLabel(llm);
// "new" badge: shown only when pubDate is set AND within the last 30 days
const pubDate = getLLMPubDate(llm);
const isRecentlyPublished = pubDate ? (Date.now() - pubDate.getTime()) < 30 * 24 * 60 * 60 * 1000 : false;
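// e.g. (editorial note) a model with pubDate 2026-04-10 still shows "new" on 2026-05-05 (25 days old) and loses the badge from ~2026-05-10 (30 days, time-of-day aside)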
const handleLLMConfigure = React.useCallback((event: React.MouseEvent) => {
event.stopPropagation();
@@ -226,6 +231,7 @@ export const ModelItem = React.memo(function ModelItem(props: {
</>}
{/* Features Chips - sync with `useLLMSelect.tsx` */}
{isRecentlyPublished && isNotSymlink && pubDate && <GoodTooltip title={`Released ${pubDate.toLocaleDateString(undefined, { year: 'numeric', month: 'short', day: 'numeric' })}`}><Chip size='sm' variant='solid' sx={isHidden ? styles.chipDisabled : { bgcolor: '#d4ff3a', color: 'black', fontWeight: 'lg' }}>new</Chip></GoodTooltip>}
{featuresChipMemo}
{seemsFree && isNotSymlink && <Chip size='sm' color='success' variant='plain' sx={isHidden ? styles.chipDisabled : styles.chipFree}>free</Chip>}
@@ -283,7 +289,9 @@ export function ModelsList(props: {
// are we showing multiple services
const showAllServices = !props.filterServiceId;
const hasManyServices = llms.length >= 2 && llms.some(llm => llm.sId !== llms[0].sId);
// sort by service label so vendor groups appear alphabetically when showing all services (single-service view keeps existing order)
const orderedLLMs = showAllServices ? sortLLMsByServiceLabel(llms) : llms;
const hasManyServices = orderedLLMs.length >= 2 && orderedLLMs.some(llm => llm.sId !== orderedLLMs[0].sId);
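// e.g. (editorial note) services labeled 'Anthropic', 'Groq', 'OpenAI' render as three groups in that alphabetical order, regardless of the order they were configured in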
let lastGroupLabel = '';
// derived
@@ -293,7 +301,7 @@ export function ModelsList(props: {
// generate the list items, prepending headers when necessary
const items: React.JSX.Element[] = [];
for (const llm of llms) {
for (const llm of orderedLLMs) {
// skip hidden models if requested
if (!props.showHiddenModels && isLLMHidden(llm))
@@ -177,7 +177,8 @@ export function anthropicBetaFeatures(options?: AnthropicHostedFeatures): string
if (options?.enable1MContext)
bf.add('context-1m-2025-08-07');
// Code execution (for dynamic web tools PFC, or Skills) + files API for container downloads
// Code execution (for Skills, container reuse, Programmatic Tool Calling) + files API for container downloads.
// NOT enabled for dynamic web tools (_20260209): those have internal code execution managed by Anthropic.
// Note: SDK defines code-execution-2025-05-22; we use 2025-08-25 (newer iteration, not yet in SDK types).
// Code execution may be GA now (most SDK examples skip the beta namespace), but keeping for safety.
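// e.g. (editorial note) with enableCodeExecution set, bf gains 'code-execution-2025-08-25' (string per the note above), mirroring the context-1m flag added earlier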
if (options?.enableCodeExecution) {
@@ -6,7 +6,7 @@ import { Release } from '~/common/app.release';
import type { ModelDescriptionSchema, OrtVendorLookupResult } from '../llm.server.types';
import { createVariantInjector, ModelVariantMap } from '../llm.server.variants';
import { llmDevCheckModels_DEV } from '../models.mappings';
import { formatPubDate, llmDevCheckModels_DEV } from '../models.mappings';
// Note: these model definitions are shared across Anthropic API, OpenRouter, and AWS Bedrock.
@@ -214,12 +214,13 @@ export function llmsAntInjectVariants(acc: ModelDescriptionSchema[], model: Mode
}
export const hardcodedAnthropicModels: (ModelDescriptionSchema & { isLegacy?: boolean })[] = [
export const hardcodedAnthropicModels: (ModelDescriptionSchema & { isLegacy?: boolean, pubDate: string /* make it required for the defs */ })[] = [
// Claude 4.7 models
{
id: 'claude-opus-4-7', // Active - 2026-04-16
label: 'Claude Opus 4.7',
pubDate: '20260416',
description: 'Most capable generally available model for complex reasoning and agentic coding',
contextWindow: 1_000_000, // 1M GA at standard pricing (no opt-in required)
maxCompletionTokens: 128000,
@@ -239,6 +240,7 @@ export const hardcodedAnthropicModels: (ModelDescriptionSchema & { isLegacy?: bo
{
id: 'claude-opus-4-6', // Active
label: 'Claude Opus 4.6',
pubDate: '20260205',
description: 'Previous most intelligent model for complex agents and coding, with adaptive thinking',
contextWindow: 1_000_000, // 1M GA at standard pricing since 2026-03-13 (no opt-in required)
maxCompletionTokens: 128000,
@@ -255,6 +257,7 @@ export const hardcodedAnthropicModels: (ModelDescriptionSchema & { isLegacy?: bo
{
id: 'claude-sonnet-4-6', // Active
label: 'Claude Sonnet 4.6',
pubDate: '20260217',
description: 'Best combination of speed and intelligence for everyday tasks',
contextWindow: 1_000_000, // 1M GA at standard pricing since 2026-03-13 (no opt-in required)
maxCompletionTokens: 128000, // docs say 64000, API reports 128000
@@ -272,6 +275,7 @@ export const hardcodedAnthropicModels: (ModelDescriptionSchema & { isLegacy?: bo
{
id: 'claude-opus-4-5-20251101', // Active
label: 'Claude Opus 4.5',
pubDate: '20251124',
description: 'Previous most intelligent model with advanced reasoning for complex agentic workflows',
contextWindow: 200000,
maxCompletionTokens: 64000,
@@ -286,6 +290,7 @@ export const hardcodedAnthropicModels: (ModelDescriptionSchema & { isLegacy?: bo
{
id: 'claude-sonnet-4-5-20250929', // Active
label: 'Claude Sonnet 4.5',
pubDate: '20250929',
description: 'Previous best combination of speed and intelligence for complex agents and coding',
contextWindow: 200000,
maxCompletionTokens: 64000,
@@ -311,6 +316,7 @@ export const hardcodedAnthropicModels: (ModelDescriptionSchema & { isLegacy?: bo
{
id: 'claude-haiku-4-5-20251001', // Active
label: 'Claude Haiku 4.5',
pubDate: '20251015',
description: 'Fastest model with exceptional speed and performance',
contextWindow: 200000,
maxCompletionTokens: 64000,
@@ -324,6 +330,7 @@ export const hardcodedAnthropicModels: (ModelDescriptionSchema & { isLegacy?: bo
{
id: 'claude-opus-4-1-20250805', // Active
label: 'Claude Opus 4.1',
pubDate: '20250805',
description: 'Exceptional model for specialized complex tasks requiring advanced reasoning',
contextWindow: 200000,
maxCompletionTokens: 32000,
@@ -338,6 +345,7 @@ export const hardcodedAnthropicModels: (ModelDescriptionSchema & { isLegacy?: bo
hidden: true, // Deprecated: April 14, 2026 | Retiring: June 15, 2026 | Replacement: claude-opus-4-7
id: 'claude-opus-4-20250514', // Deprecated
label: 'Claude Opus 4 [Deprecated]',
pubDate: '20250522',
description: 'Previous flagship model. Deprecated April 14, 2026, retiring June 15, 2026.',
contextWindow: 200000,
maxCompletionTokens: 32000,
@@ -351,6 +359,7 @@ export const hardcodedAnthropicModels: (ModelDescriptionSchema & { isLegacy?: bo
hidden: true, // Deprecated: April 14, 2026 | Retiring: June 15, 2026 | Replacement: claude-sonnet-4-6
id: 'claude-sonnet-4-20250514', // Deprecated
label: 'Claude Sonnet 4 [Deprecated]',
pubDate: '20250522',
description: 'High-performance model. Deprecated April 14, 2026, retiring June 15, 2026.',
contextWindow: 200000,
maxCompletionTokens: 64000,
@@ -379,6 +388,7 @@ export const hardcodedAnthropicModels: (ModelDescriptionSchema & { isLegacy?: bo
{
id: 'claude-3-7-sonnet-20250219', // Retired | Deprecated: October 28, 2025 | Retired: February 19, 2026 | Replacement: claude-opus-4-6
label: 'Claude Sonnet 3.7 [Retired]',
pubDate: '20250224',
description: 'High-performance model with early extended thinking. Retired February 19, 2026.',
contextWindow: 200000,
maxCompletionTokens: 64000,
@@ -396,6 +406,7 @@ export const hardcodedAnthropicModels: (ModelDescriptionSchema & { isLegacy?: bo
{
id: 'claude-3-5-haiku-20241022', // Retired | Deprecated: December 19, 2025 | Retired: February 19, 2026
label: 'Claude Haiku 3.5 [Retired]',
pubDate: '20241104',
description: 'Intelligence at blazing speeds. Retired February 19, 2026.',
contextWindow: 200000,
maxCompletionTokens: 8192,
@@ -413,6 +424,7 @@ export const hardcodedAnthropicModels: (ModelDescriptionSchema & { isLegacy?: bo
hidden: true, // deprecated
id: 'claude-3-haiku-20240307', // Deprecated | Deprecated: February 19, 2026 | Retiring: April 20, 2026 | Replacement: claude-haiku-4-5-20251001
label: 'Claude Haiku 3 [Deprecated]',
pubDate: '20240313',
description: 'Fast and compact model for near-instant responsiveness. Deprecated February 19, 2026, retiring April 20, 2026.',
contextWindow: 200000,
maxCompletionTokens: 4096,
@@ -595,11 +607,13 @@ export function llmsAntCreatePlaceholderModel(model: AnthropicWire_API_Models_Li
parameterSpecs.push(...ANT_TOOLS);
const maxInputTokens = model.max_input_tokens;
const createdAt = model.created_at ? new Date(model.created_at) : undefined;
return {
id: model.id,
idVariant: '::placeholder',
label: model.display_name,
created: Math.round(new Date(model.created_at).getTime() / 1000),
created: createdAt ? Math.round(createdAt.getTime() / 1000) : undefined,
pubDate: formatPubDate(createdAt), // 0-day: use Anthropic API's created_at, or today if unset
description: 'Newest model, description not available yet.',
contextWindow: maxInputTokens ?? 200_000, // report API value as-is (no cap for unknown models)
maxCompletionTokens: model.max_tokens || 32768,
@@ -755,5 +769,5 @@ export function llmOrtAntLookup_ThinkingVariants(orModelName: string): OrtVendor
.map((spec) => ({ ...spec }));
// initialTemperature: not set - Anthropic models use the global fallback (0.5)
return { interfaces, parameterSpecs };
return { pubDate: model.pubDate, interfaces, parameterSpecs };
}
@@ -6,7 +6,7 @@ import { Release } from '~/common/app.release';
import type { ModelDescriptionSchema, OrtVendorLookupResult } from '../llm.server.types';
import { createVariantInjector, ModelVariantMap } from '../llm.server.variants';
import { llmDevCheckModels_DEV } from '../models.mappings';
import { formatPubDate, llmDevCheckModels_DEV } from '../models.mappings';
// dev options
@@ -72,7 +72,7 @@ const geminiExpFree: ModelDescriptionSchema['chatPrice'] = {
};
// Pricing based on https://ai.google.dev/pricing (Apr 22, 2026)
// Pricing based on https://ai.google.dev/pricing (Apr 24, 2026)
const gemini31FlashLitePricing: ModelDescriptionSchema['chatPrice'] = {
input: 0.25, // text/image/video; audio is $0.50 but we don't differentiate yet
@@ -186,7 +186,7 @@ const _knownGeminiModels: ({
symLink?: string,
deprecated?: string, // Gemini may provide deprecation dates
// _delete removed - models are now physically removed from the list instead of marked for deletion
} & Pick<ModelDescriptionSchema, 'interfaces' | 'parameterSpecs' | 'chatPrice' | 'hidden' | 'benchmark'>)[] = [
} & Pick<ModelDescriptionSchema, 'pubDate' | 'interfaces' | 'parameterSpecs' | 'chatPrice' | 'hidden' | 'benchmark'> & { pubDate: string /* make it required */})[] = [
/// Generation 3.1
@@ -195,6 +195,7 @@ const _knownGeminiModels: ({
{
id: 'models/gemini-3.1-pro-preview',
labelOverride: 'Gemini 3.1 Pro Preview',
pubDate: '20260219',
isPreview: true,
chatPrice: gemini30ProPricing, // same pricing as 3 Pro
interfaces: IF_30,
@@ -213,6 +214,7 @@ const _knownGeminiModels: ({
hidden: true, // specialized variant for custom tool prioritization
id: 'models/gemini-3.1-pro-preview-customtools',
labelOverride: 'Gemini 3.1 Pro Preview (Custom Tools)',
pubDate: '20260219',
isPreview: true,
chatPrice: gemini30ProPricing,
interfaces: IF_30,
@@ -230,6 +232,7 @@ const _knownGeminiModels: ({
{
id: 'models/gemini-3.1-flash-image-preview',
labelOverride: 'Nano Banana 2',
pubDate: '20260226',
isPreview: true,
chatPrice: gemini31FlashImagePricing,
interfaces: IF_30,
@@ -247,6 +250,7 @@ const _knownGeminiModels: ({
{
id: 'models/gemini-3.1-flash-lite-preview',
labelOverride: 'Gemini 3.1 Flash-Lite Preview',
pubDate: '20260303',
isPreview: true,
chatPrice: gemini31FlashLitePricing,
interfaces: IF_30,
@@ -262,10 +266,13 @@ const _knownGeminiModels: ({
/// Generation 3.0
// 3.0 Pro (Preview) - Released November 18, 2025; DEPRECATED: shutdown March 9, 2026 (still served by API as of Apr 17, 2026)
// 3.0 Pro (Preview) - Released November 18, 2025; SHUT DOWN March 9, 2026 - now silently routed to gemini-3.1-pro-preview
// Kept hidden (still returned by API) to avoid confusing users with a silently-redirected model.
{
hidden: true, // March 9, 2026: API silently routes 'gemini-3-pro-preview' to 'gemini-3.1-pro-preview' - hide to prevent user confusion
id: 'models/gemini-3-pro-preview',
labelOverride: 'Gemini 3 Pro Preview',
pubDate: '20251118',
isPreview: true,
deprecated: '2026-03-09',
chatPrice: gemini30ProPricing,
@@ -284,6 +291,7 @@ const _knownGeminiModels: ({
{
id: 'models/gemini-3-pro-image-preview',
labelOverride: 'Nano Banana Pro', // Marketing name for the technical model ID
pubDate: '20251120',
isPreview: true,
chatPrice: gemini30ProImagePricing,
interfaces: IF_30,
@@ -299,6 +307,7 @@ const _knownGeminiModels: ({
{
id: 'models/nano-banana-pro-preview',
labelOverride: 'Nano Banana Pro',
pubDate: '20251120',
symLink: 'models/gemini-3-pro-image-preview',
// copied from symlink
isPreview: true,
@@ -318,6 +327,7 @@ const _knownGeminiModels: ({
{
id: 'models/gemini-3-flash-preview',
labelOverride: 'Gemini 3 Flash Preview',
pubDate: '20251217',
isPreview: true,
chatPrice: gemini30FlashPricing,
interfaces: IF_30,
@@ -335,8 +345,10 @@ const _knownGeminiModels: ({
// 2.5 Pro (Stable) - Released June 17, 2025; DEPRECATED: shutdown June 17, 2026
{
hidden: true, // outperformed by 3.1 Pro (1493) and even 3 Flash (1474) - deprecated in 2 months
id: 'models/gemini-2.5-pro',
labelOverride: 'Gemini 2.5 Pro',
pubDate: '20250617',
deprecated: '2026-06-17',
chatPrice: gemini25ProPricing,
interfaces: IF_25,
@@ -359,6 +371,7 @@ const _knownGeminiModels: ({
{
hidden: true, // single-turn-only model - unhide and just send a message to make use of this
id: 'models/gemini-2.5-pro-preview-tts',
pubDate: '20250520',
isPreview: true,
chatPrice: gemini25ProPreviewTTSPricing,
interfaces: [
@@ -376,10 +389,11 @@ const _knownGeminiModels: ({
{
id: 'models/deep-research-preview-04-2026',
labelOverride: 'Deep Research Preview (2026-04)',
pubDate: '20260421',
isPreview: true,
chatPrice: gemini25ProPricing, // pricing not explicitly listed; using 2.5 Pro as baseline
interfaces: [LLM_IF_OAI_Chat, LLM_IF_OAI_Reasoning, LLM_IF_GEM_Interactions],
parameterSpecs: [],
interfaces: [LLM_IF_OAI_Chat, LLM_IF_OAI_Vision, LLM_IF_OAI_Reasoning, LLM_IF_GEM_Interactions],
parameterSpecs: [{ paramId: 'llmVndGeminiAgentViz' }],
benchmark: undefined, // Deep research model, not benchmarkable on standard tests
// 128K input, 64K output
},
@@ -388,22 +402,24 @@ const _knownGeminiModels: ({
{
id: 'models/deep-research-max-preview-04-2026',
labelOverride: 'Deep Research Max Preview (2026-04)',
pubDate: '20260421',
isPreview: true,
chatPrice: gemini25ProPricing, // baseline estimate (see note above)
interfaces: [LLM_IF_OAI_Chat, LLM_IF_OAI_Reasoning, LLM_IF_GEM_Interactions],
parameterSpecs: [],
interfaces: [LLM_IF_OAI_Chat, LLM_IF_OAI_Vision, LLM_IF_OAI_Reasoning, LLM_IF_GEM_Interactions],
parameterSpecs: [{ paramId: 'llmVndGeminiAgentViz' }],
benchmark: undefined, // Deep research model, not benchmarkable on standard tests
},
// Deep Research Pro Preview - Released December 12, 2025
// Deep Research Pro Preview - Released December 11, 2025
{
hidden: true, // yield to newer 2026-04 models
id: 'models/deep-research-pro-preview-12-2025',
labelOverride: 'Deep Research Pro Preview',
pubDate: '20251211',
isPreview: true,
chatPrice: gemini25ProPricing,
interfaces: [LLM_IF_OAI_Chat, LLM_IF_OAI_Reasoning, LLM_IF_GEM_Interactions],
parameterSpecs: [{ paramId: 'llmVndGeminiThinkingBudget' }],
interfaces: [LLM_IF_OAI_Chat, LLM_IF_OAI_Vision, LLM_IF_OAI_Reasoning, LLM_IF_GEM_Interactions],
parameterSpecs: [{ paramId: 'llmVndGeminiAgentViz' }, { paramId: 'llmVndGeminiThinkingBudget' }],
benchmark: undefined, // Deep research model, not benchmarkable on standard tests
// Note: 128K input context, 64K output context
},
@@ -412,8 +428,10 @@ const _knownGeminiModels: ({
// 2.5 Flash
{
hidden: true, // outperformed by 3 Flash Preview (1474 vs 1411) - deprecated in 2 months
id: 'models/gemini-2.5-flash',
labelOverride: 'Gemini 2.5 Flash',
pubDate: '20250617',
deprecated: '2026-06-17',
chatPrice: gemini25FlashPricing,
interfaces: IF_25,
@@ -441,6 +459,7 @@ const _knownGeminiModels: ({
{
id: 'models/gemini-2.5-computer-use-preview-10-2025',
labelOverride: 'Gemini 2.5 Computer Use Preview 10-2025',
pubDate: '20251007',
isPreview: true,
chatPrice: gemini25ProPricing, // Uses same pricing as 2.5 Pro (pricing page doesn't list separately)
// NOTE: sweep shows fn=['auto'] only (no 'roundtrip') - partial Fn capability, do not advertise LLM_IF_OAI_Fn
@@ -458,6 +477,7 @@ const _knownGeminiModels: ({
{
id: 'models/gemini-robotics-er-1.6-preview',
labelOverride: 'Gemini Robotics-ER 1.6 Preview',
pubDate: '20260414',
isPreview: true,
chatPrice: geminiRoboticsER16Pricing,
interfaces: [LLM_IF_OAI_Chat, LLM_IF_OAI_Vision, LLM_IF_OAI_Fn, LLM_IF_OAI_Reasoning],
@@ -467,8 +487,10 @@ const _knownGeminiModels: ({
// 2.5 Flash-Based: Gemini Robotics-ER 1.5 Preview - Released September 25, 2025; DEPRECATED: shutdown April 30, 2026
{
hidden: true, // superseded by Robotics-ER 1.6 - shutdown April 30, 2026
id: 'models/gemini-robotics-er-1.5-preview',
labelOverride: 'Gemini Robotics-ER 1.5 Preview',
pubDate: '20250925',
isPreview: true,
deprecated: '2026-04-30',
chatPrice: gemini25FlashPricing, // Uses same pricing as 2.5 Flash per pricing page
@@ -481,6 +503,7 @@ const _knownGeminiModels: ({
{
id: 'models/gemini-2.5-flash-image',
labelOverride: 'Nano Banana',
pubDate: '20251002',
deprecated: '2026-10-02',
chatPrice: { input: 0.30, output: undefined }, // Per pricing page: $0.30 text/image input, $0.039 per image output; the text-output price is not stated
interfaces: [LLM_IF_OAI_Chat, LLM_IF_OAI_Vision],
@@ -501,6 +524,7 @@ const _knownGeminiModels: ({
hidden: true, // audio outputs are unavailable
id: 'models/gemini-3.1-flash-tts-preview',
labelOverride: 'Gemini 3.1 Flash TTS Preview',
pubDate: '20260415',
isPreview: true,
chatPrice: gemini31FlashTTSPricing,
interfaces: [
@@ -516,6 +540,7 @@ const _knownGeminiModels: ({
{
hidden: true, // audio outputs are unavailable as of 2025-05-27
id: 'models/gemini-2.5-flash-preview-tts',
pubDate: '20250520',
isPreview: true,
chatPrice: gemini25FlashPreviewTTSPricing,
interfaces: [
@@ -543,6 +568,7 @@ const _knownGeminiModels: ({
{
id: 'models/gemini-2.5-flash-lite',
labelOverride: 'Gemini 2.5 Flash-Lite',
pubDate: '20250722',
deprecated: '2026-07-22',
chatPrice: gemini25FlashLitePricing,
interfaces: IF_25,
@@ -573,14 +599,18 @@ const _knownGeminiModels: ({
// 2.0 Flash - DEPRECATED: shutdown June 1, 2026 (announced Feb 18, 2026)
{
hidden: true, // outclassed by all Flash models in 2.5/3.x series - shutdown in ~5 weeks
id: 'models/gemini-2.0-flash-001',
pubDate: '20250205',
deprecated: '2026-06-01',
chatPrice: gemini20FlashPricing,
interfaces: [LLM_IF_OAI_Chat, LLM_IF_OAI_Vision, LLM_IF_OAI_Fn, LLM_IF_GEM_CodeExecution],
benchmark: { cbaElo: 1360 }, // gemini-2.0-flash-001
},
{
hidden: true, // outclassed by all Flash models in 2.5/3.x series - shutdown in ~5 weeks
id: 'models/gemini-2.0-flash',
pubDate: '20250205',
symLink: 'models/gemini-2.0-flash-001',
deprecated: '2026-06-01',
// copied from symlink
@@ -591,7 +621,9 @@ const _knownGeminiModels: ({
// 2.0 Flash Lite - DEPRECATED: shutdown June 1, 2026 (announced Feb 18, 2026)
{
hidden: true, // outclassed by 2.5/3.1 Flash-Lite - shutdown in ~5 weeks
id: 'models/gemini-2.0-flash-lite',
pubDate: '20250225',
chatPrice: gemini20FlashLitePricing,
symLink: 'models/gemini-2.0-flash-lite-001',
deprecated: '2026-06-01',
@@ -599,7 +631,9 @@ const _knownGeminiModels: ({
benchmark: { cbaElo: 1310 },
},
{
hidden: true, // outclassed by 2.5/3.1 Flash-Lite - shutdown in ~5 weeks
id: 'models/gemini-2.0-flash-lite-001',
pubDate: '20250225',
chatPrice: gemini20FlashLitePricing,
deprecated: '2026-06-01',
interfaces: [LLM_IF_OAI_Chat, LLM_IF_OAI_Vision, LLM_IF_OAI_Fn],
@@ -639,6 +673,7 @@ const _knownGeminiModels: ({
// Gemma 4 Models - Released April 2, 2026
{
id: 'models/gemma-4-31b-it',
pubDate: '20260402',
isPreview: true,
interfaces: [LLM_IF_OAI_Chat, LLM_IF_OAI_Fn, LLM_IF_HOTFIX_StripImages, LLM_IF_HOTFIX_Sys0ToUsr0],
parameterSpecs: [{ paramId: 'llmVndGemEffort', enumValues: ['minimal', 'high'] }],
@@ -648,6 +683,7 @@ const _knownGeminiModels: ({
{
hidden: true, // smaller MoE variant
id: 'models/gemma-4-26b-a4b-it',
pubDate: '20260402',
isPreview: true,
interfaces: [LLM_IF_OAI_Chat, LLM_IF_OAI_Fn, LLM_IF_HOTFIX_StripImages, LLM_IF_HOTFIX_Sys0ToUsr0],
parameterSpecs: [{ paramId: 'llmVndGemEffort', enumValues: ['minimal', 'high'] }],
@@ -658,6 +694,7 @@ const _knownGeminiModels: ({
// Gemma 3n Model (newer than 3, first seen on the May 2025 update)
{
id: 'models/gemma-3n-e4b-it',
pubDate: '20250626',
isPreview: true,
interfaces: [LLM_IF_OAI_Chat, LLM_IF_HOTFIX_StripImages, LLM_IF_HOTFIX_Sys0ToUsr0],
chatPrice: geminiExpFree, // Free tier only according to pricing page
@@ -665,6 +702,7 @@ const _knownGeminiModels: ({
},
{
id: 'models/gemma-3n-e2b-it',
pubDate: '20250626',
isPreview: true,
interfaces: [LLM_IF_OAI_Chat, LLM_IF_HOTFIX_StripImages, LLM_IF_HOTFIX_Sys0ToUsr0],
chatPrice: geminiExpFree, // Free tier only according to pricing page
@@ -676,6 +714,7 @@ const _knownGeminiModels: ({
// - LLM_IF_HOTFIX_Sys0ToUsr0, because: "Developer instruction is not enabled for models/gemma-3-27b-it"
{
id: 'models/gemma-3-27b-it',
pubDate: '20250312',
isPreview: true,
interfaces: [LLM_IF_OAI_Chat, LLM_IF_HOTFIX_StripImages, LLM_IF_HOTFIX_Sys0ToUsr0],
chatPrice: geminiExpFree, // Pricing page indicates free tier only
@@ -685,6 +724,7 @@ const _knownGeminiModels: ({
{
hidden: true, // keep larger model
id: 'models/gemma-3-12b-it',
pubDate: '20250312',
isPreview: true,
interfaces: [LLM_IF_OAI_Chat, LLM_IF_HOTFIX_StripImages, LLM_IF_HOTFIX_Sys0ToUsr0],
chatPrice: geminiExpFree,
@@ -693,6 +733,7 @@ const _knownGeminiModels: ({
{
hidden: true, // keep larger model
id: 'models/gemma-3-4b-it',
pubDate: '20250312',
isPreview: true,
interfaces: [LLM_IF_OAI_Chat, LLM_IF_HOTFIX_StripImages, LLM_IF_HOTFIX_Sys0ToUsr0],
chatPrice: geminiExpFree,
@@ -701,6 +742,7 @@ const _knownGeminiModels: ({
{
hidden: true, // keep larger model
id: 'models/gemma-3-1b-it',
pubDate: '20250312',
isPreview: true,
interfaces: [LLM_IF_OAI_Chat, LLM_IF_HOTFIX_StripImages, LLM_IF_HOTFIX_Sys0ToUsr0],
chatPrice: geminiExpFree,
@@ -939,6 +981,7 @@ export function geminiModelToModelDescription(geminiModel: GeminiWire_API_Models
label: label,
// created: ...
// updated: ...
pubDate: knownModel?.pubDate ?? formatPubDate(), // 0-day fallback; the editorial entry is the source of truth; today's date is a placeholder until editorial catches up
description: descriptionLong,
contextWindow: contextWindow,
maxCompletionTokens: outputTokenLimit,
@@ -1026,5 +1069,5 @@ export function llmOrtGemLookup(orModelName: string): OrtVendorLookupResult | un
?.filter(spec => _ORT_GEM_PARAM_ALLOWLIST.has(spec.paramId))
.map(spec => ({ ...spec }));
return { interfaces, parameterSpecs, initialTemperature: GEMINI_DEFAULT_TEMPERATURE };
return { pubDate: knownModel.pubDate, interfaces, parameterSpecs, initialTemperature: GEMINI_DEFAULT_TEMPERATURE };
}
@@ -94,6 +94,7 @@ const ModelParameterSpec_schema = z.object({
// Bedrock
'llmVndBedrockAPI',
// Gemini
'llmVndGeminiAgentViz',
'llmVndGeminiAspectRatio',
'llmVndGeminiCodeExecution',
'llmVndGeminiComputerUse',
@@ -137,6 +138,7 @@ export const ModelDescription_schema = z.object({
label: z.string(),
created: z.int().optional(),
updated: z.int().optional(),
pubDate: z.string().regex(/^\d{8}$/).optional(), // editorial: model's official public release date 'YYYYMMDD'. Required for editorial entries (KnownModelEditorial) and for 0-day-fillable paths (Anthropic placeholder, Gemini unknown-model fallback). Omitted for dynamic-only vendors and unknown variants where we have no reliable signal.
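// e.g. '20260505' passes the /^\d{8}$/ check; '2026-05-05' and '202605' do not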
description: z.string(),
contextWindow: z.int().nullable(),
interfaces: z.array(z.enum(LLMS_ALL_INTERFACES).or(z.string())), // backward compatibility: don't break client-side interface parsing when the server is newer
@@ -155,6 +157,7 @@ export const ModelDescription_schema = z.object({
// Each vendor's lookup filters to only what works through OpenRouter's OAI-compatible API.
// OpenRouter merges these with its own auto-detected interfaces and params.
export type OrtVendorLookupResult = {
pubDate?: ModelDescriptionSchema['pubDate'];
interfaces?: ModelDescriptionSchema['interfaces'];
parameterSpecs?: ModelDescriptionSchema['parameterSpecs'];
initialTemperature?: number; // vendor-specific default (e.g. Gemini 1.0); undefined = use global fallback (0.5)
@@ -111,6 +111,28 @@ export function llmDevValidateParameterSpecs_DEV(model: ModelDescriptionSchema):
}
// -- pubDate helpers --
/**
* Format an epoch / Date / nothing as 'YYYYMMDD'.
* Accepts either a Unix epoch (seconds), a Date, or undefined (-> today).
*/
export function formatPubDate(input?: number | Date): string {
let date: Date;
if (input instanceof Date && Number.isFinite(input.getTime()))
date = input;
else if (typeof input === 'number' && Number.isFinite(input) && input > 0) {
const candidate = new Date(input * 1000);
date = Number.isFinite(candidate.getTime()) ? candidate : new Date();
} else
date = new Date();
const y = date.getUTCFullYear();
const m = String(date.getUTCMonth() + 1).padStart(2, '0');
const d = String(date.getUTCDate()).padStart(2, '0');
return `${y}${m}${d}`;
}
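// Editorial sketch (not part of this diff): expected outputs of the helper above.
//   formatPubDate(new Date(Date.UTC(2026, 4, 5)))   // -> '20260505' (Date input, UTC)
//   formatPubDate(Date.UTC(2026, 4, 5) / 1000)      // -> '20260505' (Unix epoch in seconds)
//   formatPubDate()                                 // -> today's UTC date as 'YYYYMMDD'
//   formatPubDate(-1)                               // non-positive epoch falls back to today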
// -- Manual model mappings: types and helper --
export type ManualMappings = (KnownModel | KnownLink)[];
@@ -224,6 +246,7 @@ export function fromManualMapping(mappings: (KnownModel | KnownLink)[], upstream
};
// apply optional fields
if (m.pubDate) md.pubDate = m.pubDate;
if (m.parameterSpecs) md.parameterSpecs = m.parameterSpecs;
if (m.maxCompletionTokens) md.maxCompletionTokens = m.maxCompletionTokens;
if (m.benchmark) md.benchmark = m.benchmark;
@@ -1,38 +1,72 @@
import { LLM_IF_HOTFIX_StripImages, LLM_IF_OAI_Chat, LLM_IF_OAI_Fn, LLM_IF_OAI_Json, LLM_IF_OAI_Reasoning } from '~/common/stores/llms/llms.types';
import { LLM_IF_HOTFIX_StripImages, LLM_IF_OAI_Chat, LLM_IF_OAI_Fn, LLM_IF_OAI_Reasoning } from '~/common/stores/llms/llms.types';
import type { ModelDescriptionSchema } from '../../llm.server.types';
import { fromManualMapping, ManualMappings } from '../../models.mappings';
const IF_3 = [LLM_IF_HOTFIX_StripImages, LLM_IF_OAI_Chat, LLM_IF_OAI_Fn, LLM_IF_OAI_Json];
const IF_4 = [LLM_IF_HOTFIX_StripImages, LLM_IF_OAI_Chat, LLM_IF_OAI_Fn];
// [DeepSeek, 2026-04-24] V4 release - https://api-docs.deepseek.com/news/news260424
// - V4-Pro: 1.6T total / 49B active params; V4-Flash: 284B total / 13B active params (Novel Attention: token-wise compression + DSA)
// - Model IDs listed by /models: deepseek-v4-flash, deepseek-v4-pro
// - 1M context is the default across services; text-only (no vision/multimodal)
// - Legacy aliases still accepted until 2026-07-24: deepseek-chat -> v4-flash (thinking disabled), deepseek-reasoner -> v4-flash (thinking enabled)
// - Reasoning control: object `thinking: { type: 'enabled'|'disabled', reasoning_effort?: 'high'|'max' }`
// (the live API also accepts type: 'adaptive', but it is undocumented and empirically behaves the same as 'enabled'
// on current builds -- deliberately not exposed here; add it once docs + semantics stabilize)
// - V3.2 endpoints no longer accessible via direct model ID (API returns only v4-flash/v4-pro)
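// Editorial sketch (assumption, not repo code): per the notes above, the V4 reasoning
// control rides on an otherwise OpenAI-compatible chat body, roughly:
//   { model: 'deepseek-v4-pro',
//     messages: [{ role: 'user', content: '...' }],
//     thinking: { type: 'enabled', reasoning_effort: 'max' } }
// with the legacy 'deepseek-chat' / 'deepseek-reasoner' aliases pre-setting thinking off/on.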
const _knownDeepseekChatModels: ManualMappings = [
// [Models and Pricing](https://api-docs.deepseek.com/quick_start/pricing)
// [List Models](https://api-docs.deepseek.com/api/list-models)
// [Release Notes - V3.2](https://api-docs.deepseek.com/news/news251201) - Released 2025-12-01
{
idPrefix: 'deepseek-v4-pro',
label: 'DeepSeek V4 Pro',
pubDate: '20260424',
description: 'Premium reasoning model with 1M context. Supports extended thinking modes, JSON output, and function calling.',
contextWindow: 1_048_576, // 1M
interfaces: [...IF_4, LLM_IF_OAI_Reasoning],
parameterSpecs: [
{ paramId: 'llmVndMiscEffort', enumValues: ['none', 'high', 'max'] },
],
maxCompletionTokens: 65536, // conservative default; docs advertise up to 384K
chatPrice: { input: 1.74, output: 3.48, cache: { cType: 'oai-ac', read: 0.145 } },
benchmark: { cbaElo: 1463 }, // lmarena: deepseek-v4-pro (thinking variant 1462, near-tied)
},
{
idPrefix: 'deepseek-v4-flash',
label: 'DeepSeek V4 Flash',
pubDate: '20260424',
description: 'Fast general-purpose model with 1M context. Supports extended thinking modes, JSON output, and function calling.',
contextWindow: 1_048_576, // 1M
interfaces: [...IF_4, LLM_IF_OAI_Reasoning],
parameterSpecs: [
{ paramId: 'llmVndMiscEffort', enumValues: ['none', 'high', 'max'] },
],
maxCompletionTokens: 65536, // conservative default; docs advertise up to 384K
chatPrice: { input: 0.14, output: 0.28, cache: { cType: 'oai-ac', read: 0.028 } },
benchmark: { cbaElo: 1439 }, // lmarena: deepseek-v4-flash-thinking (non-thinking variant 1433)
},
// Legacy aliases - API routes both to deepseek-v4-flash with thinking pre-set
{
idPrefix: 'deepseek-reasoner',
label: 'DeepSeek V3.2 (Reasoner)',
description: 'Reasoning model with Chain-of-Thought capabilities, 128K context length. Supports JSON output and function calling.',
contextWindow: 131072, // 128K
interfaces: [...IF_3, LLM_IF_OAI_Reasoning],
// parameterSpecs: [
// { paramId: 'llmVndMiscEffort', enumValues: ['none', 'high'] }, // not supported: this model is reasoning only
// ],
maxCompletionTokens: 32768, // default, max: 65536
chatPrice: { input: 0.28, output: 0.42, cache: { cType: 'oai-ac', read: 0.028 } },
benchmark: { cbaElo: 1425 }, // deepseek-v3.2-exp-thinking
label: 'DeepSeek Reasoner (legacy)',
description: 'Legacy alias: routes to DeepSeek V4 Flash with thinking enabled. Retires 2026-07-24.',
contextWindow: 1_048_576,
interfaces: [...IF_4, LLM_IF_OAI_Reasoning],
maxCompletionTokens: 65536,
chatPrice: { input: 0.14, output: 0.28, cache: { cType: 'oai-ac', read: 0.028 } },
benchmark: { cbaElo: 1439 }, // lmarena: deepseek-v4-flash-thinking
isLegacy: true,
},
{
idPrefix: 'deepseek-chat',
label: 'DeepSeek V3.2',
description: 'General-purpose model with 128K context length. Supports JSON output and function calling.',
contextWindow: 131072, // 128K
interfaces: IF_3,
maxCompletionTokens: 8192, // default is 4096, max is 8192
chatPrice: { input: 0.28, output: 0.42, cache: { cType: 'oai-ac', read: 0.028 } },
benchmark: { cbaElo: 1424 }, // deepseek-v3.2
label: 'DeepSeek Chat (legacy)',
description: 'Legacy alias: routes to DeepSeek V4 Flash with thinking disabled. Retires 2026-07-24.',
contextWindow: 1_048_576,
interfaces: IF_4,
maxCompletionTokens: 65536,
chatPrice: { input: 0.14, output: 0.28, cache: { cType: 'oai-ac', read: 0.028 } },
benchmark: { cbaElo: 1433 }, // lmarena: deepseek-v4-flash (non-thinking)
isLegacy: true,
},
];
@@ -23,6 +23,7 @@ const _knownGroqModels: ManualMappings = [
isPreview: true,
idPrefix: 'meta-llama/llama-4-scout-17b-16e-instruct',
label: 'Llama 4 Scout · 17B × 16E (Preview)',
pubDate: '20250405',
description: 'Llama 4 Scout 17B MoE with 16 experts (109B total params), native multimodal with vision support. 131K context, 8K max output. ~750 t/s on Groq.',
contextWindow: 131072,
maxCompletionTokens: 8192,
@@ -33,6 +34,7 @@ const _knownGroqModels: ManualMappings = [
isPreview: true,
idPrefix: 'qwen/qwen3-32b',
label: 'Qwen 3 · 32B (Preview)',
pubDate: '20250428',
description: 'Qwen3 32B by Alibaba Cloud. Supports thinking/non-thinking modes, 100+ languages. 131K context, 40K max output. ~400 t/s on Groq.',
contextWindow: 131072,
maxCompletionTokens: 40960,
@@ -43,6 +45,7 @@ const _knownGroqModels: ManualMappings = [
isPreview: true,
idPrefix: 'moonshotai/kimi-k2-instruct-0905',
label: 'Kimi K2 Instruct 0905 (Preview)',
pubDate: '20250905',
description: 'Kimi K2 1T MoE model (32B active, 384 experts). Advanced agentic coding. 262K context, 16K max output. ~200 t/s on Groq.',
contextWindow: 262144,
maxCompletionTokens: 16384,
@@ -53,6 +56,7 @@ const _knownGroqModels: ManualMappings = [
{
idPrefix: 'moonshotai/kimi-k2-instruct',
label: 'Kimi K2 Instruct (Deprecated)',
pubDate: '20250711',
symLink: 'moonshotai/kimi-k2-instruct-0905',
contextWindow: 131072, // API returns 131K (vs 262K for the 0905 version)
maxCompletionTokens: 16384,
@@ -69,6 +73,7 @@ const _knownGroqModels: ManualMappings = [
{
idPrefix: 'groq/compound',
label: 'Compound (Agentic System)',
pubDate: '20250904',
description: 'Groq agentic AI with web search, code execution, browser automation. Uses GPT-OSS 120B, Llama 4 Scout, Llama 3.3 70B. Pricing based on underlying model usage.',
contextWindow: 131072,
maxCompletionTokens: 8192,
@@ -78,6 +83,7 @@ const _knownGroqModels: ManualMappings = [
{
idPrefix: 'groq/compound-mini',
label: 'Compound Mini (Agentic System)',
pubDate: '20250904',
description: 'Lighter Groq agentic AI with web search, code execution. Pricing based on underlying model usage.',
contextWindow: 131072,
maxCompletionTokens: 8192,
@@ -89,6 +95,7 @@ const _knownGroqModels: ManualMappings = [
{
idPrefix: 'openai/gpt-oss-120b',
label: 'GPT OSS 120B',
pubDate: '20250805',
description: 'OpenAI flagship open-weight MoE (120B total, 5.1B active). Reasoning, browser search, code execution. 131K context, 65K max output. ~500 t/s on Groq.',
contextWindow: 131072,
maxCompletionTokens: 65536,
@@ -99,6 +106,7 @@ const _knownGroqModels: ManualMappings = [
isPreview: true,
idPrefix: 'openai/gpt-oss-safeguard-20b',
label: 'GPT OSS Safeguard 20B (Preview)',
pubDate: '20251029',
description: 'OpenAI safety classification model (20B MoE). Purpose-built for content moderation with Harmony response format. 131K context, 65K max output. ~1000 t/s on Groq.',
contextWindow: 131072,
maxCompletionTokens: 65536,
@@ -108,6 +116,7 @@ const _knownGroqModels: ManualMappings = [
{
idPrefix: 'openai/gpt-oss-20b',
label: 'GPT OSS 20B',
pubDate: '20250805',
description: 'OpenAI efficient open-weight MoE (20B total, 3.6B active). Tool use, browser search, code execution. 131K context, 65K max output. ~1000 t/s on Groq.',
contextWindow: 131072,
maxCompletionTokens: 65536,
@@ -120,6 +129,7 @@ const _knownGroqModels: ManualMappings = [
{
idPrefix: 'llama-3.3-70b-versatile',
label: 'Llama 3.3 · 70B Versatile',
pubDate: '20241206',
description: 'Meta Llama 3.3 (70B params) with GQA. Strong reasoning, coding, multilingual. 131K context, 32K max output. ~280 t/s on Groq.',
contextWindow: 131072,
maxCompletionTokens: 32768,
@@ -129,6 +139,7 @@ const _knownGroqModels: ManualMappings = [
{
idPrefix: 'llama-3.1-8b-instant',
label: 'Llama 3.1 · 8B Instant',
pubDate: '20240723',
description: 'Meta Llama 3.1 (8B params). Fast, cost-effective for high-volume tasks. 131K context and max output. ~560 t/s on Groq.',
contextWindow: 131072,
maxCompletionTokens: 131072,
@@ -22,6 +22,7 @@ const _knownMiniMaxModels: ModelDescriptionSchema[] = [
{
id: 'MiniMax-M2.7',
label: 'MiniMax M2.7',
pubDate: '20260318',
description: 'Latest flagship with recursive self-improvement and agentic capabilities. 200K context, 131K max output. ~60 t/s.',
contextWindow: 204800,
maxCompletionTokens: 131072,
@@ -31,6 +32,7 @@ const _knownMiniMaxModels: ModelDescriptionSchema[] = [
{
id: 'MiniMax-M2.7-highspeed',
label: 'MiniMax M2.7 (Highspeed)',
pubDate: '20260318',
description: 'Faster M2.7 variant at ~100 t/s. 200K context, 131K max output.',
contextWindow: 204800,
maxCompletionTokens: 131072,
@@ -42,6 +44,7 @@ const _knownMiniMaxModels: ModelDescriptionSchema[] = [
{
id: 'MiniMax-M2.5',
label: 'MiniMax M2.5',
pubDate: '20260212',
description: 'Strong coding and reasoning, best value. 200K context, 65K max output.',
contextWindow: 204800,
maxCompletionTokens: 65536,
@@ -51,6 +54,7 @@ const _knownMiniMaxModels: ModelDescriptionSchema[] = [
{
id: 'MiniMax-M2.5-highspeed',
label: 'MiniMax M2.5 (Highspeed)',
pubDate: '20260212',
description: 'Faster M2.5 variant at ~100 t/s. 200K context, 65K max output.',
contextWindow: 204800,
maxCompletionTokens: 65536,
@@ -62,6 +66,7 @@ const _knownMiniMaxModels: ModelDescriptionSchema[] = [
{
id: 'MiniMax-M2-her',
label: 'MiniMax M2-her',
pubDate: '20260127',
description: 'Dialogue-first model for immersive roleplay, character-driven chat, and expressive multi-turn conversations. 64K context.',
contextWindow: 65536,
maxCompletionTokens: 2048,
@@ -73,6 +78,7 @@ const _knownMiniMaxModels: ModelDescriptionSchema[] = [
{
id: 'MiniMax-M2.1',
label: 'MiniMax M2.1',
pubDate: '20251223',
description: '230B params (10B active), multilingual coding. 200K context, 65K max output.',
contextWindow: 204800,
maxCompletionTokens: 65536,
@@ -83,6 +89,7 @@ const _knownMiniMaxModels: ModelDescriptionSchema[] = [
{
id: 'MiniMax-M2.1-highspeed',
label: 'MiniMax M2.1 (Highspeed)',
pubDate: '20251223',
description: 'Faster M2.1 variant. 200K context, 65K max output.',
contextWindow: 204800,
maxCompletionTokens: 65536,
@@ -95,6 +102,7 @@ const _knownMiniMaxModels: ModelDescriptionSchema[] = [
{
id: 'MiniMax-M2',
label: 'MiniMax M2',
pubDate: '20251027',
description: '230B params (10B active), agentic and reasoning. 200K context, 128K max output.',
contextWindow: 204800,
maxCompletionTokens: 128000,
@@ -107,6 +115,7 @@ const _knownMiniMaxModels: ModelDescriptionSchema[] = [
{
id: 'MiniMax-M1',
label: 'MiniMax M1',
pubDate: '20250616',
description: '456B total / 45.9B active MoE with lightning attention. 1M context, 40K max output.',
contextWindow: 1000000,
maxCompletionTokens: 40000,
@@ -119,6 +128,7 @@ const _knownMiniMaxModels: ModelDescriptionSchema[] = [
{
id: 'MiniMax-01',
label: 'MiniMax 01',
pubDate: '20250114',
description: 'Legacy flagship. 1M context.',
contextWindow: 1000192,
interfaces: [LLM_IF_OAI_Chat, LLM_IF_OAI_Fn],
@@ -19,80 +19,81 @@ const DEV_DEBUG_MISTRAL_MODELS = Release.IsNodeDevBuild; // not in staging to re
const _knownMistralModelDetails: Record<string, {
label?: string; // override the API-provided name
pubDate?: string; // YYYYMMDD - earliest public availability (announcement / La Plateforme / HF upload)
chatPrice?: { input: number; output: number };
benchmark?: { cbaElo: number };
hidden?: boolean;
}> = {
// Premier models - Mistral 3 (Dec 2025)
'mistral-large-2512': { chatPrice: { input: 0.5, output: 1.5 }, benchmark: { cbaElo: 1415 } }, // Mistral Large 3 - MoE 41B active / 675B total
'mistral-large-2411': { chatPrice: { input: 2, output: 6 }, benchmark: { cbaElo: 1305 }, hidden: true }, // older version
'mistral-large-latest': { chatPrice: { input: 0.5, output: 1.5 }, hidden: true }, // → 2512
'mistral-large-2512': { pubDate: '20251202', chatPrice: { input: 0.5, output: 1.5 }, benchmark: { cbaElo: 1415 } }, // Mistral Large 3 - MoE 41B active / 675B total
'mistral-large-2411': { pubDate: '20241118', chatPrice: { input: 2, output: 6 }, benchmark: { cbaElo: 1305 }, hidden: true }, // older version
'mistral-large-latest': { pubDate: '20251202', chatPrice: { input: 0.5, output: 1.5 }, hidden: true }, // → 2512
'mistral-medium-2508': { chatPrice: { input: 0.4, output: 2 }, benchmark: { cbaElo: 1410 } }, // Mistral Medium 3
'mistral-medium-2505': { chatPrice: { input: 0.4, output: 2 }, benchmark: { cbaElo: 1387 }, hidden: true }, // older version
'mistral-medium-latest': { chatPrice: { input: 0.4, output: 2 }, hidden: true }, // → 2508
'mistral-medium': { chatPrice: { input: 0.4, output: 2 }, hidden: true }, // symlink
'mistral-medium-2508': { pubDate: '20250812', chatPrice: { input: 0.4, output: 2 }, benchmark: { cbaElo: 1410 } }, // Mistral Medium 3.1
'mistral-medium-2505': { pubDate: '20250507', chatPrice: { input: 0.4, output: 2 }, benchmark: { cbaElo: 1387 }, hidden: true }, // Mistral Medium 3
'mistral-medium-latest': { pubDate: '20250812', chatPrice: { input: 0.4, output: 2 }, hidden: true }, // → 2508
'mistral-medium': { pubDate: '20231211', chatPrice: { input: 0.4, output: 2 }, hidden: true }, // symlink (legacy: original Mistral Medium prototype on La Plateforme beta)
'magistral-medium-2509': { chatPrice: { input: 2, output: 5 }, benchmark: { cbaElo: 1304 } }, // reasoning (leaderboard: magistral-medium-2506 = 1304)
'magistral-medium-latest': { chatPrice: { input: 2, output: 5 }, hidden: true }, // symlink
'magistral-medium-2509': { pubDate: '20250917', chatPrice: { input: 2, output: 5 }, benchmark: { cbaElo: 1304 } }, // reasoning (leaderboard: magistral-medium-2506 = 1304)
'magistral-medium-latest': { pubDate: '20250917', chatPrice: { input: 2, output: 5 }, hidden: true }, // symlink
'devstral-2512': { label: 'Devstral 2 (2512)', chatPrice: { input: 0.4, output: 2 } }, // Devstral 2 - 123B coding agents (API returns "Mistral Vibe Cli")
'devstral-latest': { label: 'Devstral 2 (latest)', chatPrice: { input: 0.4, output: 2 }, hidden: true }, // symlink
'devstral-medium-latest': { label: 'Devstral 2 (latest)', chatPrice: { input: 0.4, output: 2 }, hidden: true }, // symlink
'mistral-vibe-cli-latest': { label: 'Devstral 2 (latest)', chatPrice: { input: 0.4, output: 2 }, hidden: true }, // alternate ID for devstral-latest
'devstral-medium-2507': { chatPrice: { input: 0.4, output: 2 }, hidden: true }, // older version
'devstral-2512': { label: 'Devstral 2 (2512)', pubDate: '20251209', chatPrice: { input: 0.4, output: 2 } }, // Devstral 2 - 123B coding agents (API returns "Mistral Vibe Cli")
'devstral-latest': { label: 'Devstral 2 (latest)', pubDate: '20251209', chatPrice: { input: 0.4, output: 2 }, hidden: true }, // symlink
'devstral-medium-latest': { label: 'Devstral 2 (latest)', pubDate: '20251209', chatPrice: { input: 0.4, output: 2 }, hidden: true }, // symlink
'mistral-vibe-cli-latest': { label: 'Devstral 2 (latest)', pubDate: '20251209', chatPrice: { input: 0.4, output: 2 }, hidden: true }, // alternate ID for devstral-latest
'devstral-medium-2507': { pubDate: '20250710', chatPrice: { input: 0.4, output: 2 }, hidden: true }, // older version
'mistral-large-pixtral-2411': { chatPrice: { input: 2, output: 6 } }, // Pixtral Large (alternate ID)
'pixtral-large-2411': { chatPrice: { input: 2, output: 6 }, hidden: true }, // symlink
'pixtral-large-latest': { chatPrice: { input: 2, output: 6 }, hidden: true }, // symlink
'mistral-large-pixtral-2411': { pubDate: '20241118', chatPrice: { input: 2, output: 6 } }, // Pixtral Large (alternate ID)
'pixtral-large-2411': { pubDate: '20241118', chatPrice: { input: 2, output: 6 }, hidden: true }, // symlink
'pixtral-large-latest': { pubDate: '20241118', chatPrice: { input: 2, output: 6 }, hidden: true }, // symlink
'codestral-2508': { chatPrice: { input: 0.3, output: 0.9 } }, // code generation
'codestral-latest': { chatPrice: { input: 0.3, output: 0.9 }, hidden: true }, // symlink
'codestral-2508': { pubDate: '20250730', chatPrice: { input: 0.3, output: 0.9 } }, // code generation (Codestral 25.08)
'codestral-latest': { pubDate: '20250730', chatPrice: { input: 0.3, output: 0.9 }, hidden: true }, // symlink
'voxtral-small-2507': { chatPrice: { input: 0.1, output: 0.3 } }, // voice (text tokens)
'voxtral-small-latest': { chatPrice: { input: 0.1, output: 0.3 }, hidden: true }, // symlink
'voxtral-small-2507': { pubDate: '20250715', chatPrice: { input: 0.1, output: 0.3 } }, // voice (text tokens)
'voxtral-small-latest': { pubDate: '20250715', chatPrice: { input: 0.1, output: 0.3 }, hidden: true }, // symlink
'voxtral-mini-2507': { chatPrice: { input: 0.04, output: 0.04 } }, // voice (text tokens)
'voxtral-mini-latest': { chatPrice: { input: 0.04, output: 0.04 }, hidden: true }, // symlink
'voxtral-mini-2507': { pubDate: '20250715', chatPrice: { input: 0.04, output: 0.04 } }, // voice (text tokens)
'voxtral-mini-latest': { pubDate: '20250715', chatPrice: { input: 0.04, output: 0.04 }, hidden: true }, // symlink
// Ministral 3 family (Dec 2025) - multimodal, multilingual, Apache 2.0
'ministral-14b-2512': { chatPrice: { input: 0.2, output: 0.2 } }, // Ministral 3 14B
'ministral-14b-latest': { chatPrice: { input: 0.2, output: 0.2 }, hidden: true }, // symlink
'ministral-14b-2512': { pubDate: '20251202', chatPrice: { input: 0.2, output: 0.2 } }, // Ministral 3 14B
'ministral-14b-latest': { pubDate: '20251202', chatPrice: { input: 0.2, output: 0.2 }, hidden: true }, // symlink
'ministral-8b-2512': { chatPrice: { input: 0.15, output: 0.15 } }, // Ministral 3 8B
'ministral-8b-2410': { chatPrice: { input: 0.1, output: 0.1 }, benchmark: { cbaElo: 1237 }, hidden: true }, // older version
'ministral-8b-latest': { chatPrice: { input: 0.15, output: 0.15 }, hidden: true }, // symlink
'ministral-8b-2512': { pubDate: '20251202', chatPrice: { input: 0.15, output: 0.15 } }, // Ministral 3 8B
'ministral-8b-2410': { pubDate: '20241016', chatPrice: { input: 0.1, output: 0.1 }, benchmark: { cbaElo: 1237 }, hidden: true }, // older version
'ministral-8b-latest': { pubDate: '20251202', chatPrice: { input: 0.15, output: 0.15 }, hidden: true }, // symlink
'ministral-3b-2512': { chatPrice: { input: 0.1, output: 0.1 } }, // Ministral 3 3B
'ministral-3b-2410': { chatPrice: { input: 0.04, output: 0.04 }, hidden: true }, // older version
'ministral-3b-latest': { chatPrice: { input: 0.1, output: 0.1 }, hidden: true }, // symlink
'ministral-3b-2512': { pubDate: '20251202', chatPrice: { input: 0.1, output: 0.1 } }, // Ministral 3 3B
'ministral-3b-2410': { pubDate: '20241016', chatPrice: { input: 0.04, output: 0.04 }, hidden: true }, // older version
'ministral-3b-latest': { pubDate: '20251202', chatPrice: { input: 0.1, output: 0.1 }, hidden: true }, // symlink
// Open models
'mistral-small-2603': { chatPrice: { input: 0.15, output: 0.6 } }, // Mistral Small 4 - 119B hybrid (instruct+reasoning+coding), 256k ctx
'mistral-small-2506': { chatPrice: { input: 0.1, output: 0.3 }, benchmark: { cbaElo: 1357 }, hidden: true }, // Mistral Small 3.2
'mistral-small-latest': { chatPrice: { input: 0.15, output: 0.6 }, hidden: true }, // → 2603
'mistral-small-2603': { pubDate: '20260316', chatPrice: { input: 0.15, output: 0.6 } }, // Mistral Small 4 - 119B hybrid (instruct+reasoning+coding), 256k ctx
'mistral-small-2506': { pubDate: '20250620', chatPrice: { input: 0.1, output: 0.3 }, benchmark: { cbaElo: 1357 }, hidden: true }, // Mistral Small 3.2
'mistral-small-latest': { pubDate: '20260316', chatPrice: { input: 0.15, output: 0.6 }, hidden: true }, // → 2603
'labs-mistral-small-creative': { label: 'Mistral Small Creative', chatPrice: { input: 0.1, output: 0.3 } }, // creative writing, roleplay (Labs)
'labs-mistral-small-creative': { label: 'Mistral Small Creative', pubDate: '20251211', chatPrice: { input: 0.1, output: 0.3 } }, // creative writing, roleplay (Labs)
'labs-leanstral-2603': { label: 'Leanstral (2603)', chatPrice: { input: 0, output: 0 } }, // Lean 4 formal proof engineering (Labs, free for a limited period)
'labs-leanstral-2603': { label: 'Leanstral (2603)', pubDate: '20260316', chatPrice: { input: 0, output: 0 } }, // Lean 4 formal proof engineering (Labs, free for a limited period)
'magistral-small-2509': { chatPrice: { input: 0.5, output: 1.5 } }, // reasoning
'magistral-small-latest': { chatPrice: { input: 0.5, output: 1.5 }, hidden: true }, // symlink
'magistral-small-2509': { pubDate: '20250917', chatPrice: { input: 0.5, output: 1.5 } }, // reasoning
'magistral-small-latest': { pubDate: '20250917', chatPrice: { input: 0.5, output: 1.5 }, hidden: true }, // symlink
'labs-devstral-small-2512': { label: 'Devstral Small 2 (2512)', chatPrice: { input: 0.1, output: 0.3 } }, // Devstral Small 2 - 24B coding agents (Labs)
'devstral-small-2507': { chatPrice: { input: 0.1, output: 0.3 }, hidden: true }, // older version
'devstral-small-latest': { label: 'Devstral Small 2 (latest)', chatPrice: { input: 0.1, output: 0.3 }, hidden: true }, // symlink
'labs-devstral-small-2512': { label: 'Devstral Small 2 (2512)', pubDate: '20251209', chatPrice: { input: 0.1, output: 0.3 } }, // Devstral Small 2 - 24B coding agents (Labs)
'devstral-small-2507': { pubDate: '20250710', chatPrice: { input: 0.1, output: 0.3 }, hidden: true }, // older version (Devstral Small 1.1)
'devstral-small-latest': { label: 'Devstral Small 2 (latest)', pubDate: '20251209', chatPrice: { input: 0.1, output: 0.3 }, hidden: true }, // symlink
'pixtral-12b-2409': { chatPrice: { input: 0.15, output: 0.15 } }, // vision
'pixtral-12b-latest': { chatPrice: { input: 0.15, output: 0.15 }, hidden: true }, // symlink
'pixtral-12b': { chatPrice: { input: 0.15, output: 0.15 }, hidden: true }, // symlink
'pixtral-12b-2409': { pubDate: '20240911', chatPrice: { input: 0.15, output: 0.15 } }, // vision
'pixtral-12b-latest': { pubDate: '20240911', chatPrice: { input: 0.15, output: 0.15 }, hidden: true }, // symlink
'pixtral-12b': { pubDate: '20240911', chatPrice: { input: 0.15, output: 0.15 }, hidden: true }, // symlink
'open-mistral-nemo-2407': { chatPrice: { input: 0.15, output: 0.15 } }, // NeMo
'open-mistral-nemo': { chatPrice: { input: 0.15, output: 0.15 }, hidden: true }, // symlink
'open-mistral-nemo-2407': { pubDate: '20240718', chatPrice: { input: 0.15, output: 0.15 } }, // NeMo
'open-mistral-nemo': { pubDate: '20240718', chatPrice: { input: 0.15, output: 0.15 }, hidden: true }, // symlink
// Legacy (kept for reference, no longer in API)
'open-mistral-7b': { chatPrice: { input: 0.25, output: 0.25 }, hidden: true },
'open-mistral-7b': { pubDate: '20230927', chatPrice: { input: 0.25, output: 0.25 }, hidden: true },
};
@@ -28,7 +28,8 @@ const _PS_Reasoning: ModelDescriptionSchema['parameterSpecs'] = [
* Moonshot AI (Kimi) models.
* - models list and pricing: https://platform.kimi.ai/docs/pricing/chat (was platform.moonshot.ai - now 301 redirect)
* - API docs: https://platform.kimi.ai/docs/api/chat
* - updated: 2026-04-20
* - updated: 2026-05-04
* - NOTE: K2 series (non-2.5/2.6) is scheduled for discontinuation on 2026-05-25 per Moonshot docs.
*/
const _knownMoonshotModels: ManualMappings = [
@@ -36,6 +37,7 @@ const _knownMoonshotModels: ManualMappings = [
{
idPrefix: 'kimi-k2.6',
label: 'Kimi K2.6',
pubDate: '20260420',
description: 'Native multimodal flagship (text, image, video inputs) with thinking and non-thinking modes. Stronger long-form coding, improved instruction compliance and self-correction. 256K context.',
contextWindow: 262144,
maxCompletionTokens: 32768,
@@ -49,6 +51,7 @@ const _knownMoonshotModels: ManualMappings = [
{
idPrefix: 'kimi-k2.5',
label: 'Kimi K2.5',
pubDate: '20260127',
description: 'Supports vision (images/videos), thinking mode, and Agent tasks. 256K context.',
contextWindow: 262144,
maxCompletionTokens: 32768,
@@ -58,12 +61,13 @@ const _knownMoonshotModels: ManualMappings = [
benchmark: { cbaElo: 1451 }, // kimi-k2.5-thinking
},
// Kimi K2 Series - Latest Models
// Kimi K2 Series - scheduled for discontinuation on 2026-05-25
// Fast, Thinking
{
idPrefix: 'kimi-k2-thinking-turbo',
label: 'Kimi K2 Thinking Turbo',
pubDate: '20251106',
description: 'High-speed reasoning model with advanced thinking and tool calling capabilities. Faster inference (~50 tok/s) with optimized performance. 256K context. Temperature 1.0 recommended.',
contextWindow: 262144,
maxCompletionTokens: 65536,
@@ -76,6 +80,7 @@ const _knownMoonshotModels: ManualMappings = [
{
idPrefix: 'kimi-k2-thinking',
label: 'Kimi K2 Thinking',
pubDate: '20251106',
description: 'Advanced reasoning model with multi-step thinking and autonomous tool calling (200-300 sequential calls). Interleaves chain-of-thought with tool use. 256K context. Temperature 1.0 recommended.',
contextWindow: 262144,
maxCompletionTokens: 65536,
@@ -89,6 +94,7 @@ const _knownMoonshotModels: ManualMappings = [
{
idPrefix: 'kimi-k2-0905-preview',
label: 'Kimi K2 0905 (Preview)',
pubDate: '20250905',
description: 'State-of-the-art MoE model (1T total, 32B active) with extended 256K context. Enhanced agentic coding intelligence and improved instruction following.',
contextWindow: 262144,
maxCompletionTokens: 32768,
@@ -102,6 +108,7 @@ const _knownMoonshotModels: ManualMappings = [
hidden: true,
idPrefix: 'kimi-k2-0711-preview',
label: 'Kimi K2 0711 (Preview)',
pubDate: '20250711',
description: 'Earlier preview variant with 128K context. Superseded by 0905 version.',
contextWindow: 131072,
maxCompletionTokens: 16384,
@@ -114,6 +121,7 @@ const _knownMoonshotModels: ManualMappings = [
{
idPrefix: 'kimi-k2-turbo-preview',
label: 'Kimi K2 Turbo (Preview)',
pubDate: '20250801',
description: 'High-speed variant with 60-100 tokens/second output. 256K context. Optimized for real-time applications and agentic tasks.',
contextWindow: 262144,
maxCompletionTokens: 32768,
@@ -127,6 +135,7 @@ const _knownMoonshotModels: ManualMappings = [
{
idPrefix: 'moonshot-v1-128k',
label: 'V1 128K',
pubDate: '20240206',
description: 'Legacy V1 model with 128K context. Deprecated - use Kimi K2 Instruct instead.',
contextWindow: 131072,
interfaces: [LLM_IF_OAI_Chat, LLM_IF_OAI_Fn],
@@ -136,6 +145,7 @@ const _knownMoonshotModels: ManualMappings = [
{
idPrefix: 'moonshot-v1-32k',
label: 'V1 32K',
pubDate: '20240206',
description: 'Legacy V1 model with 32K context. Deprecated - use Kimi K2 Instruct instead.',
contextWindow: 32768,
interfaces: [LLM_IF_OAI_Chat, LLM_IF_OAI_Fn],
@@ -145,6 +155,7 @@ const _knownMoonshotModels: ManualMappings = [
{
idPrefix: 'moonshot-v1-8k',
label: 'V1 8K',
pubDate: '20240206',
description: 'Legacy V1 model with 8K context. Deprecated - use Kimi K2 Instruct instead.',
contextWindow: 8192,
interfaces: [LLM_IF_OAI_Chat, LLM_IF_OAI_Fn],
@@ -157,6 +168,7 @@ const _knownMoonshotModels: ManualMappings = [
// hidden: false - intentionally visible; the only non-hidden vision model for now
idPrefix: 'moonshot-v1-128k-vision-preview',
label: 'V1 128K Vision (Preview)',
pubDate: '20250115',
description: 'Legacy vision model with 128K context. Preview variant - use moonshot-v1-vision for production.',
contextWindow: 131072,
interfaces: [LLM_IF_OAI_Chat, LLM_IF_OAI_Vision],
@@ -166,6 +178,7 @@ const _knownMoonshotModels: ManualMappings = [
{
idPrefix: 'moonshot-v1-32k-vision-preview',
label: 'V1 32K Vision (Preview)',
pubDate: '20250115',
description: 'Legacy vision model with 32K context. Preview variant - use moonshot-v1-vision for production.',
contextWindow: 32768,
interfaces: [LLM_IF_OAI_Chat, LLM_IF_OAI_Vision],
@@ -176,6 +189,7 @@ const _knownMoonshotModels: ManualMappings = [
{
idPrefix: 'moonshot-v1-8k-vision-preview',
label: 'V1 8K Vision (Preview)',
pubDate: '20250115',
description: 'Legacy vision model with 8K context. Preview variant - use moonshot-v1-vision for production.',
contextWindow: 8192,
interfaces: [LLM_IF_OAI_Chat, LLM_IF_OAI_Vision],
@@ -12,6 +12,23 @@ import { fromManualMapping, KnownModel, llmDevCheckModels_DEV, ManualMappings }
// OpenAI Model Variants
export const hardcodedOpenAIVariants: ModelVariantMap = {
// GPT-5.5 with reasoning disabled (non-thinking) - supports temperature control
'gpt-5.5-2026-04-23': {
idVariant: '::thinking-none',
label: 'GPT-5.5 (No-thinking)',
hidden: true, // hidden by default as redundant, user can unhide in settings
description: 'Supports temperature control for creative applications. GPT-5.5 with reasoning disabled (reasoning_effort=none).',
interfaces: [LLM_IF_OAI_Responses, LLM_IF_OAI_Chat, LLM_IF_OAI_Vision, LLM_IF_OAI_Fn, LLM_IF_OAI_PromptCaching], // NO LLM_IF_OAI_Reasoning, NO LLM_IF_HOTFIX_NoTemperature
parameterSpecs: [
{ paramId: 'llmVndOaiEffort', enumValues: ['none', 'low', 'medium', 'high', 'xhigh'], initialValue: 'none', hidden: true }, // factory 'none', not changeable
{ paramId: 'llmVndOaiWebSearchContext' },
{ paramId: 'llmVndOaiVerbosity' },
{ paramId: 'llmVndOaiImageGeneration' },
{ paramId: 'llmVndOaiCodeInterpreter' },
{ paramId: 'llmForceNoStream' },
],
},
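// Note (illustrative assumption, not confirmed by this diff): a variant entry augments the
// base model's description, and the composed id is the base id plus the idVariant suffix,
// e.g. 'gpt-5.5-2026-04-23' + '::thinking-none' -> 'gpt-5.5-2026-04-23::thinking-none',
// letting the no-thinking sibling be listed (and hidden) independently of the base snapshot.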
// GPT-5.4 with reasoning disabled (non-thinking) - supports temperature control
'gpt-5.4-2026-03-05': {
idVariant: '::thinking-none',
@@ -88,12 +105,67 @@ const PS_DEEP_RESEARCH = [{ paramId: 'llmVndOaiWebSearchContext' as const, initi
// https://platform.openai.com/docs/pricing
export const _knownOpenAIChatModels: ManualMappings = [
/// GPT-5.5 series - Released April 23, 2026
// GPT-5.5
{
idPrefix: 'gpt-5.5-2026-04-23',
label: 'GPT-5.5 (2026-04-23)',
pubDate: '20260423',
description: 'New baseline for complex production workflows. Stronger task execution, more precise tool use, more efficient reasoning with fewer tokens. 1M token context.',
contextWindow: 1050000,
maxCompletionTokens: 128000,
interfaces: [LLM_IF_OAI_Responses, ...IFS_CHAT_CACHE_REASON, LLM_IF_HOTFIX_NoTemperature],
parameterSpecs: [
{ paramId: 'llmVndOaiEffort', enumValues: ['none', 'low', 'medium', 'high', 'xhigh'], initialValue: 'medium' }, // medium is the new default for 5.5
{ paramId: 'llmVndOaiWebSearchContext' },
{ paramId: 'llmVndOaiVerbosity' },
{ paramId: 'llmVndOaiImageGeneration' },
{ paramId: 'llmVndOaiCodeInterpreter' },
{ paramId: 'llmForceNoStream' },
],
chatPrice: { input: 5, cache: { cType: 'oai-ac', read: 0.5 }, output: 30 },
// benchmark: TBD - no CBA ELO yet
},
{
idPrefix: 'gpt-5.5',
label: 'GPT-5.5',
symLink: 'gpt-5.5-2026-04-23',
},
// GPT-5.5 Pro
{
idPrefix: 'gpt-5.5-pro-2026-04-23',
label: 'GPT-5.5 Pro (2026-04-23)',
pubDate: '20260423',
description: 'Most capable model for complex tasks. Uses more compute for smarter, more precise responses on the hardest problems.',
contextWindow: 1050000,
maxCompletionTokens: 272000,
interfaces: [LLM_IF_OAI_Responses, ...IFS_CHAT_MIN, LLM_IF_OAI_Reasoning, LLM_IF_HOTFIX_NoTemperature],
parameterSpecs: [
{ paramId: 'llmVndOaiEffort', enumValues: ['medium', 'high', 'xhigh'] }, // Pro: no low/none
{ paramId: 'llmVndOaiWebSearchContext' },
{ paramId: 'llmVndOaiVerbosity' },
{ paramId: 'llmVndOaiImageGeneration' },
{ paramId: 'llmForceNoStream' },
],
chatPrice: { input: 30, output: 180 },
// benchmark: TBD
},
{
idPrefix: 'gpt-5.5-pro',
label: 'GPT-5.5 Pro',
symLink: 'gpt-5.5-pro-2026-04-23',
},
/// GPT-5.4 series - Released March 5, 2026
// GPT-5.4
{
idPrefix: 'gpt-5.4-2026-03-05',
label: 'GPT-5.4 (2026-03-05)',
pubDate: '20260305',
description: 'Most capable and efficient frontier model for professional work. Native computer use, improved reasoning, coding, and agentic workflows with 1M token context.',
contextWindow: 1050000,
maxCompletionTokens: 128000,
@@ -119,6 +191,7 @@ export const _knownOpenAIChatModels: ManualMappings = [
{
idPrefix: 'gpt-5.4-pro-2026-03-05',
label: 'GPT-5.4 Pro (2026-03-05)',
pubDate: '20260305',
description: 'Most capable model for complex tasks. Uses more compute for smarter, more precise responses on difficult problems.',
contextWindow: 1050000,
maxCompletionTokens: 272000,
@@ -143,6 +216,7 @@ export const _knownOpenAIChatModels: ManualMappings = [
{
idPrefix: 'gpt-5.4-mini-2026-03-17',
label: 'GPT-5.4 Mini (2026-03-17)',
pubDate: '20260317',
description: 'Strongest mini model for coding, computer use, and subagents. GPT-5.4-class intelligence at lower cost and latency.',
contextWindow: 400000,
maxCompletionTokens: 128000,
@@ -168,6 +242,7 @@ export const _knownOpenAIChatModels: ManualMappings = [
{
idPrefix: 'gpt-5.4-nano-2026-03-17',
label: 'GPT-5.4 Nano (2026-03-17)',
pubDate: '20260317',
description: 'Cheapest GPT-5.4-class model for simple high-volume tasks like classification and data extraction.',
contextWindow: 400000,
maxCompletionTokens: 128000,
@@ -196,6 +271,7 @@ export const _knownOpenAIChatModels: ManualMappings = [
{
idPrefix: 'gpt-5.3-codex',
label: 'GPT-5.3 Codex',
pubDate: '20260205',
description: 'Most capable agentic coding model. Combines frontier coding performance of GPT-5.2-Codex with reasoning and professional knowledge of GPT-5.2. ~25% faster.',
contextWindow: 400000,
maxCompletionTokens: 128000,
@@ -216,6 +292,7 @@ export const _knownOpenAIChatModels: ManualMappings = [
hidden: true, // Research preview, ChatGPT Pro only - API access limited to design partners
idPrefix: 'gpt-5.3-codex-spark',
label: 'GPT-5.3 Codex Spark',
pubDate: '20260212',
description: 'Text-only research preview optimized for real-time coding iteration. Delivers 1000+ tokens/sec on low-latency hardware.',
contextWindow: 128000,
maxCompletionTokens: 16384,
@@ -228,10 +305,11 @@ export const _knownOpenAIChatModels: ManualMappings = [
// benchmark: TBD
},
// GPT-5.3 Chat Latest - Released March 4, 2026
// GPT-5.3 Chat Latest - Released March 3, 2026
{
idPrefix: 'gpt-5.3-chat-latest',
label: 'GPT-5.3 Instant',
pubDate: '20260303',
description: 'GPT-5.3 model powering ChatGPT. Points to the GPT-5.3 Instant snapshot currently used in ChatGPT.',
contextWindow: 128000,
maxCompletionTokens: 16384,
@@ -250,8 +328,10 @@ export const _knownOpenAIChatModels: ManualMappings = [
// GPT-5.2
{
hidden: true, // superseded by GPT-5.4/5.5
idPrefix: 'gpt-5.2-2025-12-11',
label: 'GPT-5.2 (2025-12-11)',
pubDate: '20251211',
description: 'Most capable model for professional work and long-running agents. Improvements in general intelligence, long-context, agentic tool-calling, and vision.',
contextWindow: 400000,
maxCompletionTokens: 128000,
@@ -268,6 +348,7 @@ export const _knownOpenAIChatModels: ManualMappings = [
benchmark: { cbaElo: 1441 }, // gpt-5.2-high
},
{
hidden: true, // superseded by GPT-5.4/5.5
idPrefix: 'gpt-5.2',
label: 'GPT-5.2',
symLink: 'gpt-5.2-2025-12-11',
@@ -275,8 +356,10 @@ export const _knownOpenAIChatModels: ManualMappings = [
// GPT-5.2 Codex
{
hidden: true, // superseded by GPT-5.3 Codex
idPrefix: 'gpt-5.2-codex',
label: 'GPT-5.2 Codex',
pubDate: '20251211',
description: 'GPT-5.2 optimized for long-horizon, agentic coding tasks in Codex or similar environments. Supports low, medium, high, and xhigh reasoning effort settings.',
contextWindow: 400000,
maxCompletionTokens: 128000,
@@ -293,8 +376,10 @@ export const _knownOpenAIChatModels: ManualMappings = [
// GPT-5.2 Chat Latest
{
hidden: true, // superseded by GPT-5.3 Instant
idPrefix: 'gpt-5.2-chat-latest',
label: 'GPT-5.2 Instant',
pubDate: '20251211',
description: 'GPT-5.2 model powering ChatGPT. Fast, capable for everyday work with clear improvements in info-seeking, how-tos, technical writing.',
contextWindow: 128000,
maxCompletionTokens: 16384,
@@ -311,8 +396,10 @@ export const _knownOpenAIChatModels: ManualMappings = [
// GPT-5.2 Pro
{
hidden: true, // superseded by GPT-5.4/5.5 Pro
idPrefix: 'gpt-5.2-pro-2025-12-11',
label: 'GPT-5.2 Pro (2025-12-11)',
pubDate: '20251211',
description: 'Smartest and most trustworthy option for difficult questions. Uses more compute for harder thinking on complex domains like programming.',
contextWindow: 400000,
maxCompletionTokens: 272000,
@@ -328,6 +415,7 @@ export const _knownOpenAIChatModels: ManualMappings = [
// benchmark: TBD
},
{
hidden: true, // superseded by GPT-5.4/5.5 Pro
idPrefix: 'gpt-5.2-pro',
label: 'GPT-5.2 Pro',
symLink: 'gpt-5.2-pro-2025-12-11',
@@ -338,8 +426,10 @@ export const _knownOpenAIChatModels: ManualMappings = [
// GPT-5.1
{
hidden: true, // superseded by GPT-5.4/5.5
idPrefix: 'gpt-5.1-2025-11-13',
label: 'GPT-5.1 (2025-11-13)',
pubDate: '20251113',
description: 'The best model for coding and agentic tasks with configurable reasoning effort.',
contextWindow: 400000,
maxCompletionTokens: 128000,
@@ -355,6 +445,7 @@ export const _knownOpenAIChatModels: ManualMappings = [
benchmark: { cbaElo: 1455 }, // gpt-5.1-high
},
{
hidden: true, // superseded by GPT-5.4/5.5
idPrefix: 'gpt-5.1',
label: 'GPT-5.1',
symLink: 'gpt-5.1-2025-11-13',
@@ -362,8 +453,10 @@ export const _knownOpenAIChatModels: ManualMappings = [
// GPT-5.1 Chat Latest
{
hidden: true, // superseded by GPT-5.3 Instant
idPrefix: 'gpt-5.1-chat-latest',
label: 'GPT-5.1 Instant',
pubDate: '20251112',
description: 'GPT-5.1 Instant with adaptive reasoning. More conversational with improved instruction following.',
contextWindow: 128000,
maxCompletionTokens: 16384,
@@ -381,8 +474,10 @@ export const _knownOpenAIChatModels: ManualMappings = [
// GPT-5.1 Codex Max
{
hidden: true, // superseded by GPT-5.3 Codex
idPrefix: 'gpt-5.1-codex-max',
label: 'GPT-5.1 Codex Max',
pubDate: '20251119',
description: 'Our most intelligent coding model optimized for long-horizon, agentic coding tasks.',
contextWindow: 400000,
maxCompletionTokens: 128000,
@@ -398,8 +493,10 @@ export const _knownOpenAIChatModels: ManualMappings = [
},
// GPT-5.1 Codex
{
hidden: true, // superseded by GPT-5.3 Codex
idPrefix: 'gpt-5.1-codex',
label: 'GPT-5.1 Codex',
pubDate: '20251113',
description: 'A version of GPT-5.1 optimized for agentic coding tasks in Codex or similar environments.',
contextWindow: 400000,
maxCompletionTokens: 128000,
@@ -415,8 +512,10 @@ export const _knownOpenAIChatModels: ManualMappings = [
},
// GPT-5.1 Codex Mini
{
hidden: true, // superseded by GPT-5.3 Codex
idPrefix: 'gpt-5.1-codex-mini',
label: 'GPT-5.1 Codex Mini',
pubDate: '20251113',
description: 'Smaller, faster version of GPT-5.1 Codex for efficient coding tasks.',
contextWindow: 400000,
maxCompletionTokens: 128000,
@@ -436,8 +535,10 @@ export const _knownOpenAIChatModels: ManualMappings = [
// GPT-5
{
hidden: true, // superseded by GPT-5.4/5.5
idPrefix: 'gpt-5-2025-08-07',
label: 'GPT-5 (2025-08-07)',
pubDate: '20250807',
description: 'The best model for coding and agentic tasks across domains.',
contextWindow: 400000,
maxCompletionTokens: 128000,
@@ -453,6 +554,7 @@ export const _knownOpenAIChatModels: ManualMappings = [
benchmark: { cbaElo: 1433 }, // gpt-5-high
},
{
hidden: true, // superseded by GPT-5.4/5.5
idPrefix: 'gpt-5',
label: 'GPT-5',
symLink: 'gpt-5-2025-08-07',
@@ -460,8 +562,10 @@ export const _knownOpenAIChatModels: ManualMappings = [
// GPT-5 Pro
{
hidden: true, // superseded by GPT-5.4/5.5 Pro
idPrefix: 'gpt-5-pro-2025-10-06',
label: 'GPT-5 Pro (2025-10-06)',
pubDate: '20251006',
description: 'Version of GPT-5 that uses more compute to produce smarter and more precise responses. Designed for tough problems.',
contextWindow: 400000,
maxCompletionTokens: 272000,
@@ -471,6 +575,7 @@ export const _knownOpenAIChatModels: ManualMappings = [
// benchmark: has not been measured yet
},
{
hidden: true, // superseded by GPT-5.4/5.5 Pro
idPrefix: 'gpt-5-pro',
label: 'GPT-5 Pro',
symLink: 'gpt-5-pro-2025-10-06',
@@ -481,6 +586,7 @@ export const _knownOpenAIChatModels: ManualMappings = [
hidden: true, // deprecated per OpenAI docs (2026-04)
idPrefix: 'gpt-5-chat-latest',
label: 'GPT-5 ChatGPT (Non-Thinking)',
pubDate: '20250807',
description: 'GPT-5 model used in ChatGPT. Points to the GPT-5 snapshot currently used in ChatGPT.',
contextWindow: 128000,
maxCompletionTokens: 16384,
@@ -495,6 +601,7 @@ export const _knownOpenAIChatModels: ManualMappings = [
hidden: true, // deprecated per OpenAI docs (2026-04), superseded by gpt-5.1-codex/gpt-5.3-codex
idPrefix: 'gpt-5-codex',
label: 'GPT-5 Codex',
pubDate: '20250915',
description: 'A version of GPT-5 optimized for agentic coding in Codex.',
contextWindow: 400000,
maxCompletionTokens: 128000,
@@ -511,8 +618,10 @@ export const _knownOpenAIChatModels: ManualMappings = [
// GPT-5 Search API
{
hidden: true, // poor quality - use llmVndOaiWebSearchContext on regular models instead
idPrefix: 'gpt-5-search-api-2025-10-14',
label: 'GPT-5 Search API (2025-10-14)',
pubDate: '20251014',
description: 'Updated web search model in Chat Completions API. 60% cheaper with domain filtering support.',
contextWindow: 400000,
maxCompletionTokens: 100000,
@@ -522,6 +631,7 @@ export const _knownOpenAIChatModels: ManualMappings = [
// benchmark: TBD
},
{
hidden: true, // poor quality - use llmVndOaiWebSearchContext on regular models instead
idPrefix: 'gpt-5-search-api',
label: 'GPT-5 Search API',
symLink: 'gpt-5-search-api-2025-10-14',
@@ -529,8 +639,10 @@ export const _knownOpenAIChatModels: ManualMappings = [
// GPT-5 mini
{
hidden: true, // superseded by GPT-5.4 Mini
idPrefix: 'gpt-5-mini-2025-08-07',
label: 'GPT-5 Mini (2025-08-07)',
pubDate: '20250807',
description: 'A faster, more cost-efficient version of GPT-5 for well-defined tasks.',
contextWindow: 400000,
maxCompletionTokens: 128000,
@@ -540,6 +652,7 @@ export const _knownOpenAIChatModels: ManualMappings = [
benchmark: { cbaElo: 1390 }, // gpt-5-mini-high
},
{
hidden: true, // superseded by GPT-5.4 Mini
idPrefix: 'gpt-5-mini',
label: 'GPT-5 Mini',
symLink: 'gpt-5-mini-2025-08-07',
@@ -547,8 +660,10 @@ export const _knownOpenAIChatModels: ManualMappings = [
// GPT-5 nano
{
hidden: true, // superseded by GPT-5.4 Nano
idPrefix: 'gpt-5-nano-2025-08-07',
label: 'GPT-5 Nano (2025-08-07)',
pubDate: '20250807',
description: 'Fastest, most cost-efficient version of GPT-5 for summarization and classification tasks.',
contextWindow: 400000,
maxCompletionTokens: 128000,
@@ -558,6 +673,7 @@ export const _knownOpenAIChatModels: ManualMappings = [
benchmark: { cbaElo: 1337 }, // gpt-5-nano-high
},
{
hidden: true, // superseded by GPT-5.4 Nano
idPrefix: 'gpt-5-nano',
label: 'GPT-5 Nano',
symLink: 'gpt-5-nano-2025-08-07',
@@ -588,6 +704,7 @@ export const _knownOpenAIChatModels: ManualMappings = [
hidden: true, // UNSUPPORTED YET
idPrefix: 'computer-use-preview-2025-03-11',
label: 'Computer Use Preview (2025-03-11)',
pubDate: '20250311',
description: 'Specialized model for computer use tool. Optimized for computer interaction capabilities.',
contextWindow: 8192,
maxCompletionTokens: 1024,
@@ -608,8 +725,10 @@ export const _knownOpenAIChatModels: ManualMappings = [
// o4-mini-deep-research - (v1/responses API)
{
idPrefix: 'o4-mini-deep-research-2025-06-26',
label: 'o4 Mini Deep Research (2025-06-26)',
description: 'Faster, more affordable deep research model for complex, multi-step research tasks.',
label: 'o4 Mini Deep Research [Deprecated]',
pubDate: '20250626',
isLegacy: true,
description: 'Faster, more affordable deep research model for complex, multi-step research tasks. [Shutdown: 2026-07-23 - migrate to GPT-5.5 with web search.]',
contextWindow: 200000,
maxCompletionTokens: 100000,
interfaces: [LLM_IF_OAI_Responses, ...IFS_CHAT_CACHE_REASON, LLM_IF_HOTFIX_NoTemperature],
@@ -625,8 +744,10 @@ export const _knownOpenAIChatModels: ManualMappings = [
/// o4-mini
{
idPrefix: 'o4-mini-2025-04-16',
label: 'o4 Mini (2025-04-16)',
description: 'Latest o4-mini model. Optimized for fast, effective reasoning with exceptionally efficient performance in coding and visual tasks.',
label: 'o4 Mini [Deprecated]',
pubDate: '20250416',
isLegacy: true,
description: 'Latest o4-mini model. Optimized for fast, effective reasoning with exceptionally efficient performance in coding and visual tasks. [Shutdown: 2026-10-23 - migrate to GPT-5.4 Mini.]',
contextWindow: 200000,
maxCompletionTokens: 100000,
interfaces: IFS_CHAT_CACHE_REASON,
@@ -643,8 +764,10 @@ export const _knownOpenAIChatModels: ManualMappings = [
// o3-deep-research - (v1/responses API)
{
idPrefix: 'o3-deep-research-2025-06-26',
label: 'o3 Deep Research (2025-06-26)',
description: 'Our most powerful deep research model for complex, multi-step research tasks.',
label: 'o3 Deep Research [Deprecated]',
pubDate: '20250626',
isLegacy: true,
description: 'Our most powerful deep research model for complex, multi-step research tasks. [Shutdown: 2026-07-23 - migrate to GPT-5.5 Pro with web search.]',
contextWindow: 200000,
maxCompletionTokens: 100000,
interfaces: [LLM_IF_OAI_Responses, ...IFS_CHAT_CACHE_REASON, LLM_IF_HOTFIX_NoTemperature],
@@ -661,6 +784,7 @@ export const _knownOpenAIChatModels: ManualMappings = [
{
idPrefix: 'o3-pro-2025-06-10',
label: 'o3 Pro (2025-06-10)',
pubDate: '20250610',
description: 'Version of o3 with more compute for better responses. Provides consistently better answers for complex tasks.',
contextWindow: 200000,
maxCompletionTokens: 100000,
@@ -679,6 +803,7 @@ export const _knownOpenAIChatModels: ManualMappings = [
{
idPrefix: 'o3-2025-04-16',
label: 'o3 (2025-04-16)',
pubDate: '20250416',
description: 'A well-rounded and powerful model across domains. Sets a new standard for math, science, coding, and visual reasoning tasks.',
contextWindow: 200000,
maxCompletionTokens: 100000,
@@ -696,8 +821,10 @@ export const _knownOpenAIChatModels: ManualMappings = [
// o3-mini
{
idPrefix: 'o3-mini-2025-01-31',
label: 'o3 Mini (2025-01-31)',
description: 'Latest o3-mini model snapshot. High intelligence at the same cost and latency targets of o1-mini. Excels at science, math, and coding tasks.',
label: 'o3 Mini [Deprecated]',
pubDate: '20250131',
isLegacy: true,
description: 'Latest o3-mini model snapshot. High intelligence at the same cost and latency targets of o1-mini. Excels at science, math, and coding tasks. [Shutdown: 2026-10-23 - migrate to GPT-5.4 Mini.]',
contextWindow: 200000,
maxCompletionTokens: 100000,
interfaces: [LLM_IF_OAI_Chat, LLM_IF_OAI_Fn, LLM_IF_OAI_PromptCaching, LLM_IF_OAI_Reasoning, LLM_IF_HOTFIX_StripImages],
@@ -716,6 +843,7 @@ export const _knownOpenAIChatModels: ManualMappings = [
hidden: true,
idPrefix: 'o1-pro-2025-03-19',
label: 'o1 Pro (2025-03-19)',
pubDate: '20250319',
description: 'A version of o1 with more compute for better responses. Provides consistently better answers for complex tasks.',
contextWindow: 200000,
maxCompletionTokens: 100000,
@@ -733,8 +861,10 @@ export const _knownOpenAIChatModels: ManualMappings = [
// o1
{
idPrefix: 'o1-2024-12-17',
label: 'o1 (2024-12-17)',
description: 'Previous full o-series reasoning model.',
label: 'o1 [Deprecated]',
pubDate: '20241217',
isLegacy: true,
description: 'Previous full o-series reasoning model. [Shutdown: 2026-10-23 - migrate to GPT-5.5 or o3.]',
contextWindow: 200000,
maxCompletionTokens: 100000,
interfaces: IFS_CHAT_CACHE_REASON,
@@ -755,6 +885,7 @@ export const _knownOpenAIChatModels: ManualMappings = [
{
idPrefix: 'gpt-4.1-2025-04-14',
label: 'GPT-4.1 (2025-04-14)',
pubDate: '20250414',
description: 'Flagship GPT model for complex tasks. Major improvements on coding, instruction following, and long context with 1M token context window.',
contextWindow: 1047576,
maxCompletionTokens: 32768,
@@ -772,6 +903,7 @@ export const _knownOpenAIChatModels: ManualMappings = [
{
idPrefix: 'gpt-4.1-mini-2025-04-14',
label: 'GPT-4.1 Mini (2025-04-14)',
pubDate: '20250414',
description: 'Balanced for intelligence, speed, and cost. Matches or exceeds GPT-4o in intelligence while reducing latency by nearly half and cost by 83%.',
contextWindow: 1047576,
maxCompletionTokens: 32768,
@@ -788,8 +920,10 @@ export const _knownOpenAIChatModels: ManualMappings = [
// GPT-4.1 nano
{
idPrefix: 'gpt-4.1-nano-2025-04-14',
label: 'GPT-4.1 Nano (2025-04-14)',
description: 'Fastest, most cost-effective GPT 4.1 model. Delivers exceptional performance with low latency, ideal for tasks like classification or autocompletion.',
label: 'GPT-4.1 Nano [Deprecated]',
pubDate: '20250414',
isLegacy: true,
description: 'Fastest, most cost-effective GPT-4.1 model. Delivers exceptional performance with low latency, ideal for tasks like classification or autocompletion. [Shutdown: 2026-10-23 - migrate to GPT-5.4 Nano.]',
contextWindow: 1047576,
maxCompletionTokens: 32768,
interfaces: IFS_CHAT_CACHE,
@@ -809,6 +943,7 @@ export const _knownOpenAIChatModels: ManualMappings = [
{
idPrefix: 'gpt-audio-1.5',
label: 'GPT Audio 1.5',
pubDate: '20260224',
description: 'Best voice model for audio in, audio out with Chat Completions. Accepts audio inputs and outputs.',
contextWindow: 128000,
maxCompletionTokens: 16384,
@@ -819,8 +954,10 @@ export const _knownOpenAIChatModels: ManualMappings = [
// gpt-audio
{
hidden: true, // superseded by GPT Audio 1.5
idPrefix: 'gpt-audio-2025-08-28',
label: 'GPT Audio (2025-08-28)',
pubDate: '20250828',
description: 'First generally available audio model. Accepts audio inputs and outputs, and can be used in the Chat Completions REST API.',
contextWindow: 128000,
maxCompletionTokens: 16384,
@@ -829,6 +966,7 @@ export const _knownOpenAIChatModels: ManualMappings = [
// benchmark: TBD
},
{
hidden: true, // superseded by GPT Audio 1.5
idPrefix: 'gpt-audio',
label: 'GPT Audio',
symLink: 'gpt-audio-2025-08-28',
@@ -836,6 +974,7 @@ export const _knownOpenAIChatModels: ManualMappings = [
{
idPrefix: 'gpt-audio-mini-2025-12-15',
label: 'GPT Audio Mini (2025-12-15)',
pubDate: '20251215',
description: 'Cost-efficient audio model. Accepts audio inputs and outputs via Chat Completions REST API.',
contextWindow: 128000,
maxCompletionTokens: 16384,
@@ -845,6 +984,7 @@ export const _knownOpenAIChatModels: ManualMappings = [
{
idPrefix: 'gpt-audio-mini-2025-10-06',
label: 'GPT Audio Mini (2025-10-06)',
pubDate: '20251006',
hidden: true, // previous version
description: 'Cost-efficient audio model. Accepts audio inputs and outputs via Chat Completions REST API.',
contextWindow: 128000,
@@ -867,6 +1007,7 @@ export const _knownOpenAIChatModels: ManualMappings = [
{
idPrefix: 'gpt-4o-2024-11-20',
label: 'GPT-4o (2024-11-20)',
pubDate: '20241120',
description: 'Snapshot of gpt-4o from November 20th, 2024.',
contextWindow: 128000,
maxCompletionTokens: 16384,
@@ -877,6 +1018,7 @@ export const _knownOpenAIChatModels: ManualMappings = [
{
idPrefix: 'gpt-4o-2024-08-06',
label: 'GPT-4o (2024-08-06)',
pubDate: '20240806',
hidden: true, // previous version
description: 'Snapshot that supports Structured Outputs. gpt-4o currently points to this version.',
contextWindow: 128000,
@@ -888,6 +1030,7 @@ export const _knownOpenAIChatModels: ManualMappings = [
{
idPrefix: 'gpt-4o-2024-05-13',
label: 'GPT-4o (2024-05-13)',
pubDate: '20240513',
hidden: true, // previous version
description: 'Original gpt-4o snapshot from May 13, 2024.',
contextWindow: 128000,
@@ -908,6 +1051,7 @@ export const _knownOpenAIChatModels: ManualMappings = [
hidden: true, // old
idPrefix: 'gpt-4o-search-preview-2025-03-11',
label: 'GPT-4o Search Preview (2025-03-11)',
pubDate: '20250311',
description: 'Latest snapshot of the GPT-4o model optimized for web search capabilities.',
contextWindow: 128000,
maxCompletionTokens: 16384,
@@ -928,6 +1072,7 @@ export const _knownOpenAIChatModels: ManualMappings = [
hidden: true, // old
idPrefix: 'gpt-4o-audio-preview-2025-06-03',
label: 'GPT-4o Audio Preview (2025-06-03)',
pubDate: '20250603',
description: 'Latest snapshot for the Audio API model.',
contextWindow: 128000,
maxCompletionTokens: 16384,
@@ -940,6 +1085,7 @@ export const _knownOpenAIChatModels: ManualMappings = [
hidden: true, // old
idPrefix: 'gpt-4o-audio-preview-2024-12-17',
label: 'GPT-4o Audio Preview (2024-12-17)',
pubDate: '20241217',
description: 'Snapshot for the Audio API model.',
contextWindow: 128000,
maxCompletionTokens: 16384,
@@ -958,6 +1104,7 @@ export const _knownOpenAIChatModels: ManualMappings = [
{
idPrefix: 'gpt-4o-mini-2024-07-18',
label: 'GPT-4o Mini (2024-07-18)',
pubDate: '20240718',
description: 'Affordable model for fast, lightweight tasks. GPT-4o Mini is cheaper and more capable than GPT-3.5 Turbo.',
contextWindow: 128000,
maxCompletionTokens: 16384,
@@ -974,6 +1121,7 @@ export const _knownOpenAIChatModels: ManualMappings = [
hidden: true, // UNSUPPORTED yet (audio output model)
idPrefix: 'gpt-4o-mini-audio-preview-2024-12-17',
label: 'GPT-4o Mini Audio Preview (2024-12-17)',
pubDate: '20241217',
description: 'Snapshot for the Audio API model.',
contextWindow: 128000,
maxCompletionTokens: 16384,
@@ -992,6 +1140,7 @@ export const _knownOpenAIChatModels: ManualMappings = [
hidden: true, // old
idPrefix: 'gpt-4o-mini-search-preview-2025-03-11',
label: 'GPT-4o Mini Search Preview (2025-03-11)',
pubDate: '20250311',
description: 'Latest snapshot of the GPT-4o Mini model optimized for web search capabilities.',
contextWindow: 128000,
maxCompletionTokens: 16384,
@@ -1011,6 +1160,7 @@ export const _knownOpenAIChatModels: ManualMappings = [
{
idPrefix: 'gpt-4-turbo-2024-04-09',
label: 'GPT-4 Turbo (2024-04-09)',
pubDate: '20240409',
hidden: true, // OLD
description: 'GPT-4 Turbo with Vision model. Vision requests can now use JSON mode and function calling. gpt-4-turbo currently points to this version.',
contextWindow: 128000,
@@ -1027,6 +1177,7 @@ export const _knownOpenAIChatModels: ManualMappings = [
{
idPrefix: 'gpt-4-0125-preview',
label: 'GPT-4 Turbo (0125)',
pubDate: '20240125',
hidden: true, // OLD
description: 'GPT-4 Turbo preview model intended to reduce cases of "laziness" where the model doesn\'t complete a task.',
contextWindow: 128000,
@@ -1038,6 +1189,7 @@ export const _knownOpenAIChatModels: ManualMappings = [
{
idPrefix: 'gpt-4-1106-preview', // GPT-4 Turbo preview model
label: 'GPT-4 Turbo (1106)',
pubDate: '20231106',
hidden: true, // OLD
description: 'GPT-4 Turbo preview model featuring improved instruction following, JSON mode, reproducible outputs, parallel function calling, and more.',
contextWindow: 128000,
@@ -1057,6 +1209,7 @@ export const _knownOpenAIChatModels: ManualMappings = [
{
idPrefix: 'gpt-4-0613',
label: 'GPT-4 (0613)',
pubDate: '20230613',
hidden: true, // OLD
description: 'Snapshot of gpt-4 from June 13th 2023 with improved function calling support. Data up to Sep 2021.',
contextWindow: 8192,
@@ -1068,6 +1221,7 @@ export const _knownOpenAIChatModels: ManualMappings = [
{
idPrefix: 'gpt-4-0314',
label: 'GPT-4 (0314)',
pubDate: '20230314',
hidden: true, // OLD
description: 'Snapshot of gpt-4 from March 14th 2023 with function calling data. Data up to Sep 2021.',
contextWindow: 8192,
@@ -1090,6 +1244,7 @@ export const _knownOpenAIChatModels: ManualMappings = [
{
idPrefix: 'gpt-3.5-turbo-0125',
label: '3.5-Turbo (2024-01-25)',
pubDate: '20240125',
hidden: true, // OLD
description: 'The latest GPT-3.5 Turbo model with higher accuracy at responding in requested formats and a fix for a bug which caused a text encoding issue for non-English language function calls.',
contextWindow: 16385,
@@ -1101,6 +1256,7 @@ export const _knownOpenAIChatModels: ManualMappings = [
{
idPrefix: 'gpt-3.5-turbo-1106',
label: '3.5-Turbo (1106)',
pubDate: '20231106',
hidden: true, // OLD
description: 'GPT-3.5 Turbo model with improved instruction following, JSON mode, reproducible outputs, parallel function calling, and more.',
contextWindow: 16385,
@@ -1220,6 +1376,12 @@ export function openAIInjectVariants(acc: ModelDescriptionSchema[], model: Model
const _manualOrderingIdPrefixes = [
// GPT-5.5
'gpt-5.5-20',
'gpt-5.5-pro-20',
'gpt-5.5-pro',
'gpt-5.5-chat-latest',
'gpt-5.5',
// GPT-5.4
'gpt-5.4-20',
'gpt-5.4-pro-20',
@@ -1419,6 +1581,7 @@ export function llmOrtOaiLookup(orModelName: string): OrtVendorLookupResult | un
// typemap to known models
const ortOaiRefMap: Record<string, string | null> = {
// renames
'gpt-5.5-chat': 'gpt-5.5-2026-04-23', // no chat-latest yet, map to snapshot
'gpt-5.4-chat': 'gpt-5.4-2026-03-05', // no chat-latest yet, map to snapshot
'gpt-5.3-chat': 'gpt-5.3-chat-latest',
'gpt-5.2-chat': 'gpt-5.2-chat-latest',
@@ -1453,5 +1616,5 @@ export function llmOrtOaiLookup(orModelName: string): OrtVendorLookupResult | un
// initialTemperature: not set - OpenAI models use the global fallback (0.5);
// NoTemperature models are handled client-side via LLM_IF_HOTFIX_NoTemperature (not propagated to OR)
return { interfaces, parameterSpecs };
return { interfaces, parameterSpecs, pubDate: entry.pubDate };
}
@@ -12,6 +12,7 @@ const _knownOpenPipeChatModels: ModelDescriptionSchema[] = [
{
id: 'gpt-4.1-2025-04-14',
label: '💾➜ GPT-4.1 (2025-04-14)',
pubDate: '20250414',
description: 'Flagship GPT model for complex tasks. Major improvements on coding, instruction following, and long context with 1M token context window.',
contextWindow: 1047576,
maxCompletionTokens: 32768,
@@ -22,6 +23,7 @@ const _knownOpenPipeChatModels: ModelDescriptionSchema[] = [
{
id: 'gpt-4.1-mini-2025-04-14',
label: '💾➜ GPT-4.1 Mini (2025-04-14)',
pubDate: '20250414',
description: 'Balanced for intelligence, speed, and cost. Matches or exceeds GPT-4o in intelligence while reducing latency and cost.',
contextWindow: 1047576,
maxCompletionTokens: 32768,
@@ -32,6 +34,7 @@ const _knownOpenPipeChatModels: ModelDescriptionSchema[] = [
{
id: 'gpt-4o-mini-2024-07-18',
label: '💾➜ GPT-4o Mini (2024-07-18)',
pubDate: '20240718',
description: 'Affordable model for fast, lightweight tasks. GPT-4o mini is cheaper and more capable than GPT-3.5 Turbo.',
contextWindow: 128000,
maxCompletionTokens: 16384,
@@ -41,6 +44,7 @@ const _knownOpenPipeChatModels: ModelDescriptionSchema[] = [
{
id: 'gpt-4o-2024-08-06',
label: '💾➜ GPT-4o (2024-08-06)',
pubDate: '20240806',
description: 'Advanced, multimodal flagship model that\'s cheaper and faster than GPT-4 Turbo.',
contextWindow: 128000,
maxCompletionTokens: 16384,
@@ -51,6 +55,7 @@ const _knownOpenPipeChatModels: ModelDescriptionSchema[] = [
{
id: 'gpt-3.5-turbo-0125',
label: '💾➜ GPT-3.5 Turbo (0125)',
pubDate: '20240125',
description: 'The latest GPT-3.5 Turbo model with higher accuracy at responding in requested formats',
contextWindow: 16385,
maxCompletionTokens: 4096,
@@ -63,6 +68,7 @@ const _knownOpenPipeChatModels: ModelDescriptionSchema[] = [
{
id: 'gemini-1.0-pro-001',
label: '💾➜ Gemini 1.0 Pro',
pubDate: '20240215',
description: 'Google\'s Gemini 1.0 Pro model',
contextWindow: 32768,
interfaces: [LLM_IF_OAI_Chat, LLM_IF_OAI_Fn],
@@ -70,6 +76,7 @@ const _knownOpenPipeChatModels: ModelDescriptionSchema[] = [
{
id: 'gemini-1.5-flash-001',
label: '💾➜ Gemini 1.5 Flash',
pubDate: '20240514',
description: 'Google\'s Gemini 1.5 Flash model - fast and efficient',
contextWindow: 1000000,
interfaces: [LLM_IF_OAI_Chat, LLM_IF_OAI_Vision, LLM_IF_OAI_Fn],
@@ -79,6 +86,7 @@ const _knownOpenPipeChatModels: ModelDescriptionSchema[] = [
{
id: 'meta-llama/Meta-Llama-3.1-8B-Instruct',
label: '💾 Llama 3.1 · 8B Instruct',
pubDate: '20240723',
description: 'Meta Llama 3.1 8B Instruct - hosted inference with per-token pricing',
contextWindow: 128000,
interfaces: [LLM_IF_OAI_Chat, LLM_IF_OAI_Fn, LLM_IF_OAI_Json],
@@ -87,6 +95,7 @@ const _knownOpenPipeChatModels: ModelDescriptionSchema[] = [
{
id: 'meta-llama/Meta-Llama-3.1-70B-Instruct',
label: '💾 Llama 3.1 · 70B Instruct',
pubDate: '20240723',
description: 'Meta Llama 3.1 70B Instruct - hosted inference with per-token pricing',
contextWindow: 128000,
interfaces: [LLM_IF_OAI_Chat, LLM_IF_OAI_Fn, LLM_IF_OAI_Json],
@@ -95,6 +104,7 @@ const _knownOpenPipeChatModels: ModelDescriptionSchema[] = [
{
id: 'meta-llama/Llama-3.1-8B',
label: '💾 Llama 3.1 · 8B Base',
pubDate: '20240723',
description: 'Meta Llama 3.1 8B base model for fine-tuning',
contextWindow: 128000,
interfaces: [LLM_IF_OAI_Chat],
@@ -102,6 +112,7 @@ const _knownOpenPipeChatModels: ModelDescriptionSchema[] = [
{
id: 'meta-llama/Llama-3.1-70B',
label: '💾 Llama 3.1 · 70B Base',
pubDate: '20240723',
description: 'Meta Llama 3.1 70B base model for fine-tuning',
contextWindow: 128000,
interfaces: [LLM_IF_OAI_Chat],
@@ -111,6 +122,7 @@ const _knownOpenPipeChatModels: ModelDescriptionSchema[] = [
{
id: 'meta-llama/Llama-3.2-1B-Instruct',
label: '💾 Llama 3.2 · 1B Instruct',
pubDate: '20240925',
description: 'Meta Llama 3.2 1B Instruct - lightweight model for edge and mobile deployment',
contextWindow: 128000,
interfaces: [LLM_IF_OAI_Chat, LLM_IF_OAI_Fn, LLM_IF_OAI_Json],
@@ -118,6 +130,7 @@ const _knownOpenPipeChatModels: ModelDescriptionSchema[] = [
{
id: 'meta-llama/Llama-3.2-3B-Instruct',
label: '💾 Llama 3.2 · 3B Instruct',
pubDate: '20240925',
description: 'Meta Llama 3.2 3B Instruct - efficient model for edge deployment',
contextWindow: 128000,
interfaces: [LLM_IF_OAI_Chat, LLM_IF_OAI_Fn, LLM_IF_OAI_Json],
@@ -127,6 +140,7 @@ const _knownOpenPipeChatModels: ModelDescriptionSchema[] = [
{
id: 'meta-llama/Llama-3.3-70B-Instruct',
label: '💾 Llama 3.3 · 70B Instruct',
pubDate: '20241206',
description: 'Meta Llama 3.3 70B Instruct - latest 70B model with performance comparable to Llama 3.1 405B',
contextWindow: 128000,
interfaces: [LLM_IF_OAI_Chat, LLM_IF_OAI_Fn, LLM_IF_OAI_Json],
@@ -136,6 +150,7 @@ const _knownOpenPipeChatModels: ModelDescriptionSchema[] = [
{
id: 'Qwen/Qwen2-VL-7B-Instruct',
label: '💾 Qwen 2 · VL 7B Instruct',
pubDate: '20240830',
description: 'Alibaba Qwen 2 Vision-Language 7B Instruct - multimodal model for text and image understanding',
contextWindow: 32768,
interfaces: [LLM_IF_OAI_Chat, LLM_IF_OAI_Vision],
@@ -145,6 +160,7 @@ const _knownOpenPipeChatModels: ModelDescriptionSchema[] = [
{
id: 'Qwen/Qwen2.5-1.5B-Instruct',
label: '💾 Qwen 2.5 · 1.5B Instruct',
pubDate: '20240919',
description: 'Alibaba Qwen 2.5 1.5B Instruct - efficient small model',
contextWindow: 131072,
interfaces: [LLM_IF_OAI_Chat, LLM_IF_OAI_Fn, LLM_IF_OAI_Json],
@@ -152,6 +168,7 @@ const _knownOpenPipeChatModels: ModelDescriptionSchema[] = [
{
id: 'Qwen/Qwen2.5-7B-Instruct',
label: '💾 Qwen 2.5 · 7B Instruct',
pubDate: '20240919',
description: 'Alibaba Qwen 2.5 7B Instruct - balanced performance and efficiency',
contextWindow: 131072,
interfaces: [LLM_IF_OAI_Chat, LLM_IF_OAI_Fn, LLM_IF_OAI_Json],
@@ -159,6 +176,7 @@ const _knownOpenPipeChatModels: ModelDescriptionSchema[] = [
{
id: 'Qwen/Qwen2.5-14B-Instruct',
label: '💾 Qwen 2.5 · 14B Instruct',
pubDate: '20240919',
description: 'Alibaba Qwen 2.5 14B Instruct - hosted inference (hourly compute unit pricing)',
contextWindow: 131072,
interfaces: [LLM_IF_OAI_Chat, LLM_IF_OAI_Fn, LLM_IF_OAI_Json],
@@ -166,6 +184,7 @@ const _knownOpenPipeChatModels: ModelDescriptionSchema[] = [
{
id: 'Qwen/Qwen2.5-72B-Instruct',
label: '💾 Qwen 2.5 · 72B Instruct',
pubDate: '20240919',
description: 'Alibaba Qwen 2.5 72B Instruct - flagship model with performance comparable to Llama 3.1 405B',
contextWindow: 131072,
interfaces: [LLM_IF_OAI_Chat, LLM_IF_OAI_Fn, LLM_IF_OAI_Json],
@@ -173,6 +192,7 @@ const _knownOpenPipeChatModels: ModelDescriptionSchema[] = [
{
id: 'Qwen/Qwen2.5-Coder-7B-Instruct',
label: '💾 Qwen 2.5 · Coder 7B Instruct',
pubDate: '20241112',
description: 'Alibaba Qwen 2.5 Coder 7B Instruct - specialized for code generation and understanding',
contextWindow: 131072,
interfaces: [LLM_IF_OAI_Chat, LLM_IF_OAI_Fn, LLM_IF_OAI_Json],
@@ -180,6 +200,7 @@ const _knownOpenPipeChatModels: ModelDescriptionSchema[] = [
{
id: 'Qwen/Qwen2.5-Coder-32B-Instruct',
label: '💾 Qwen 2.5 · Coder 32B Instruct',
pubDate: '20241112',
description: 'Alibaba Qwen 2.5 Coder 32B Instruct - specialized for code generation and understanding',
contextWindow: 131072,
interfaces: [LLM_IF_OAI_Chat, LLM_IF_OAI_Fn, LLM_IF_OAI_Json],
@@ -189,6 +210,7 @@ const _knownOpenPipeChatModels: ModelDescriptionSchema[] = [
{
id: 'Qwen/Qwen3-8B',
label: '💾 Qwen 3 · 8B Base',
pubDate: '20250429',
description: 'Alibaba Qwen 3 8B base model for fine-tuning - supports thinking and non-thinking modes',
contextWindow: 32768,
interfaces: [LLM_IF_OAI_Chat],
@@ -196,6 +218,7 @@ const _knownOpenPipeChatModels: ModelDescriptionSchema[] = [
{
id: 'Qwen/Qwen3-14B',
label: '💾 Qwen 3 · 14B Base',
pubDate: '20250429',
description: 'Alibaba Qwen 3 14B base model for fine-tuning - supports thinking and non-thinking modes',
contextWindow: 32768,
interfaces: [LLM_IF_OAI_Chat],
@@ -205,6 +228,7 @@ const _knownOpenPipeChatModels: ModelDescriptionSchema[] = [
{
id: 'google/gemma-3-1b-it',
label: '💾 Gemma 3 · 1B IT',
pubDate: '20250312',
description: 'Google Gemma 3 1B instruction-tuned - lightweight text-only model',
contextWindow: 32768,
interfaces: [LLM_IF_OAI_Chat],
@@ -212,6 +236,7 @@ const _knownOpenPipeChatModels: ModelDescriptionSchema[] = [
{
id: 'google/gemma-3-4b-it',
label: '💾 Gemma 3 · 4B IT',
pubDate: '20250312',
description: 'Google Gemma 3 4B instruction-tuned - efficient multimodal model with 128K context',
contextWindow: 128000,
interfaces: [LLM_IF_OAI_Chat, LLM_IF_OAI_Vision],
@@ -219,6 +244,7 @@ const _knownOpenPipeChatModels: ModelDescriptionSchema[] = [
{
id: 'google/gemma-3-12b-it',
label: '💾 Gemma 3 · 12B IT',
pubDate: '20250312',
description: 'Google Gemma 3 12B instruction-tuned - balanced multimodal model with 128K context',
contextWindow: 128000,
interfaces: [LLM_IF_OAI_Chat, LLM_IF_OAI_Vision],
@@ -226,6 +252,7 @@ const _knownOpenPipeChatModels: ModelDescriptionSchema[] = [
{
id: 'google/gemma-3-27b-it',
label: '💾 Gemma 3 · 27B IT',
pubDate: '20250312',
description: 'Google Gemma 3 27B instruction-tuned - largest Gemma 3 multimodal model with 128K context',
contextWindow: 128000,
interfaces: [LLM_IF_OAI_Chat, LLM_IF_OAI_Vision],
@@ -235,6 +262,7 @@ const _knownOpenPipeChatModels: ModelDescriptionSchema[] = [
{
id: 'mistralai/Mistral-Nemo-Base-2407',
label: '💾 Mistral Nemo · Base',
pubDate: '20240718',
description: 'Mistral Nemo 12B base model (July 2024) for fine-tuning',
contextWindow: 128000,
interfaces: [LLM_IF_OAI_Chat],
@@ -242,6 +270,7 @@ const _knownOpenPipeChatModels: ModelDescriptionSchema[] = [
{
id: 'mistralai/Mistral-Small-24B-Base-2501',
label: '💾 Mistral Small · 24B Base',
pubDate: '20250130',
description: 'Mistral Small 24B base model (Jan 2025) - competitive with larger models while faster',
contextWindow: 32768,
interfaces: [LLM_IF_OAI_Chat],
@@ -162,8 +162,11 @@ export function openRouterModelToModelDescription(wireModel: object): ModelDescr
// -- Vendor parameter & interface inheritance --
const llmRef = model.id.replace(/^[^/]+\//, '');
let initialTemperature: number | undefined;
let pubDate: string | undefined;
const _mergeLookup = (lookup: OrtVendorLookupResult | undefined) => {
if (lookup?.pubDate !== undefined)
pubDate = lookup.pubDate;
if (lookup?.interfaces)
for (const iface of lookup.interfaces)
if (!interfaces.includes(iface))
@@ -246,7 +249,10 @@ export function openRouterModelToModelDescription(wireModel: object): ModelDescr
// 0-day: xAI/Grok/Moonshot/Z.ai/DeepSeek models get default reasoning effort if not inherited
if (interfaces.includes(LLM_IF_OAI_Reasoning) && !parameterSpecs.some(p => p.paramId === 'llmVndMiscEffort')) {
// console.log('[DEV] openRouterModelToModelDescription: unexpected xAI/Grok/DeepSeek reasoning model:', model.id);
parameterSpecs.push({ paramId: 'llmVndMiscEffort' }); // binary thinking for these vendors
// Binary thinking only: OpenRouter's unified reasoning API currently rejects 'max' (see openai.chatCompletions.ts).
// We pin enumValues here so the shared llmVndMiscEffort registry (which also includes 'max' for native DeepSeek V4)
// does not surface 'max' in the UI for OR-routed models that can't honor it.
parameterSpecs.push({ paramId: 'llmVndMiscEffort', enumValues: ['none', 'high'] });
}
break;
@@ -267,6 +273,7 @@ export function openRouterModelToModelDescription(wireModel: object): ModelDescr
idPrefix: model.id,
// latest: ...
label,
...(pubDate !== undefined && { pubDate }),
description: model.description?.length > 280 ? model.description.slice(0, 277) + '...' : model.description,
contextWindow,
maxCompletionTokens,
@@ -39,6 +39,7 @@ const _knownPerplexityChatModels: ModelDescriptionSchema[] = [
{
id: 'sonar-deep-research',
label: 'Sonar Deep Research',
pubDate: '20250214',
description: 'Expert-level research model for exhaustive searches and comprehensive reports. 128k context.',
contextWindow: 128000,
interfaces: [LLM_IF_OAI_Chat, LLM_IF_OAI_Reasoning],
@@ -59,6 +60,7 @@ const _knownPerplexityChatModels: ModelDescriptionSchema[] = [
{
id: 'sonar-reasoning-pro',
label: 'Sonar Reasoning Pro',
pubDate: '20250218',
description: 'Premier reasoning model (DeepSeek R1) with Chain of Thought. 128k context.',
contextWindow: 128000,
interfaces: [LLM_IF_OAI_Chat, LLM_IF_OAI_Reasoning],
@@ -78,6 +80,7 @@ const _knownPerplexityChatModels: ModelDescriptionSchema[] = [
{
id: 'sonar-pro',
label: 'Sonar Pro',
pubDate: '20250121',
description: 'Advanced search model for complex queries and deep content understanding. 200k context.',
contextWindow: 200000,
maxCompletionTokens: 8000,
@@ -96,6 +99,7 @@ const _knownPerplexityChatModels: ModelDescriptionSchema[] = [
{
id: 'sonar',
label: 'Sonar',
pubDate: '20250121',
description: 'Lightweight, cost-effective search model for quick, grounded answers. 128k context.',
contextWindow: 128000,
interfaces: [LLM_IF_OAI_Chat],
@@ -16,7 +16,14 @@ const DEV_DEBUG_XAI_MODELS = (Release.TenantSlug as any) === 'staging' /* ALSO I
// Known xAI Models - Manual Mappings
// List on: https://docs.x.ai/docs/models?cluster=us-east-1
// Verified: 2026-04-16
// Verified: 2026-05-03
// Flat pricing for Grok 4.3 flagship (April 2026)
const PRICE_43 = {
input: 1.25,
output: 2.5,
cache: { cType: 'oai-ac' as const, read: 0.2 },
};
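// Quick cost check (illustrative; assumes $/M-token units, as chatPrice uses elsewhere in
// this registry): 1M input + 100K output on PRICE_43 is roughly 1 x $1.25 + 0.1 x $2.50 = $1.50,
// with cached input re-reads billed at $0.20/M instead of $1.25/M.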
// Flat pricing for Grok 4.20 flagship models
const PRICE_420 = {
@@ -82,11 +89,27 @@ const XAI_PAR_Pre4: ModelDescriptionSchema['parameterSpecs'] = [] as const;
const _knownXAIChatModels: ManualMappings = [
// Grok 4.20 (flagship, March 2026) - note: model IDs use dot (4.20), unlike earlier models
// Grok 4.3 (flagship, April 2026) - always-on reasoning, no reasoning_effort support
{
idPrefix: 'grok-4.3',
label: 'Grok 4.3',
pubDate: '20260417',
description: 'xAI\'s latest flagship model with always-on reasoning and a 1M token context window. Supports text, image, and video inputs with improved agentic performance at lower cost.',
contextWindow: 1000000,
maxCompletionTokens: undefined,
interfaces: [...XAI_IF_Vision, LLM_IF_OAI_Reasoning],
parameterSpecs: XAI_PAR, // no reasoning_effort - always-on reasoning
chatPrice: PRICE_43,
benchmark: { cbaElo: 1456 }, // grok-4.3
},
// Grok 4.20 (flagship, March 2026) - superseded by 4.3
{
hidden: true, // yield to 4.3
idPrefix: 'grok-4.20-0309-reasoning',
label: 'Grok 4.20 Reasoning',
description: 'xAI\'s most advanced flagship reasoning model with a 2M token context window. Deep reasoning and problem-solving capabilities with text and image inputs.',
pubDate: '20260309',
description: 'xAI\'s previous flagship reasoning model with a 2M token context window. Deep reasoning and problem-solving capabilities with text and image inputs.',
contextWindow: 2000000,
maxCompletionTokens: undefined,
interfaces: [...XAI_IF_Vision, LLM_IF_OAI_Reasoning],
@@ -95,9 +118,11 @@ const _knownXAIChatModels: ManualMappings = [
benchmark: { cbaElo: 1480 }, // grok-4.20-beta-0309-reasoning (CBA name)
},
{
hidden: true, // yield to 4.3
idPrefix: 'grok-4.20-0309-non-reasoning',
label: 'Grok 4.20',
description: 'xAI\'s most advanced flagship model with a 2M token context window. Non-reasoning variant for fast, high-quality responses with text and image inputs.',
pubDate: '20260309',
description: 'xAI\'s previous flagship model with a 2M token context window. Non-reasoning variant for fast, high-quality responses with text and image inputs.',
contextWindow: 2000000,
maxCompletionTokens: undefined,
interfaces: XAI_IF_Vision,
@@ -108,6 +133,7 @@ const _knownXAIChatModels: ManualMappings = [
{
idPrefix: 'grok-4.20-multi-agent-0309',
label: 'Grok 4.20 Multi-Agent',
pubDate: '20260309',
description: 'Multi-agent reasoning model that runs 4 specialized agents in parallel (coordinator, fact-checker, analyst, challenger) for collaborative verification with reduced hallucination.',
contextWindow: 2000000,
maxCompletionTokens: undefined,
@@ -125,6 +151,7 @@ const _knownXAIChatModels: ManualMappings = [
{
idPrefix: 'grok-4-1-fast-reasoning',
label: 'Grok 4.1 Fast Reasoning',
pubDate: '20251119',
description: 'Next generation frontier multimodal model optimized for high-performance agentic tool calling with a 2M token context window. Trained specifically for real-world enterprise use cases with exceptional performance on agentic workflows.',
contextWindow: 2000000,
maxCompletionTokens: undefined,
@@ -136,6 +163,7 @@ const _knownXAIChatModels: ManualMappings = [
{
idPrefix: 'grok-4-1-fast-non-reasoning',
label: 'Grok 4.1 Fast', // 'Grok 4.1 Fast Non-Reasoning'
pubDate: '20251119',
description: 'Next generation frontier multimodal model optimized for high-performance agentic tool calling with a 2M token context window. Non-reasoning variant for instant responses.',
contextWindow: 2000000,
maxCompletionTokens: undefined,
@@ -150,6 +178,7 @@ const _knownXAIChatModels: ManualMappings = [
hidden: true, // yield to 4.1
idPrefix: 'grok-4-fast-reasoning',
label: 'Grok 4 Fast Reasoning',
pubDate: '20250919',
description: 'Cost-efficient reasoning model with a 2M token context window. Optimized for fast reasoning in agentic workflows. 98% cost reduction vs Grok 4 with comparable performance.',
contextWindow: 2000000,
maxCompletionTokens: undefined,
@@ -162,6 +191,7 @@ const _knownXAIChatModels: ManualMappings = [
hidden: true, // yield to 4.1
idPrefix: 'grok-4-fast-non-reasoning',
label: 'Grok 4 Fast', // 'Grok 4 Fast Non-Reasoning'
pubDate: '20250919',
description: 'Cost-efficient non-reasoning model with a 2M token context window. Same weights as grok-4-fast-reasoning but constrained by non-reasoning system prompt for quick responses.',
contextWindow: 2000000,
maxCompletionTokens: undefined,
@@ -174,6 +204,7 @@ const _knownXAIChatModels: ManualMappings = [
hidden: true, // yield to 4.20
idPrefix: 'grok-4-0709',
label: 'Grok 4 (0709)',
pubDate: '20250709',
description: 'xAI\'s most advanced model, offering state-of-the-art reasoning and problem-solving capabilities over a massive 256k context window. Supports text and image inputs.',
contextWindow: 256000,
maxCompletionTokens: undefined,
@@ -187,6 +218,7 @@ const _knownXAIChatModels: ManualMappings = [
{
idPrefix: 'grok-3',
label: 'Grok 3',
pubDate: '20250217',
description: 'xAI flagship model that excels at enterprise use cases like data extraction, coding, and text summarization. Possesses deep domain knowledge in finance, healthcare, law, and science.',
contextWindow: 131072,
maxCompletionTokens: undefined,
@@ -198,6 +230,7 @@ const _knownXAIChatModels: ManualMappings = [
{
idPrefix: 'grok-3-mini',
label: 'Grok 3 Mini',
pubDate: '20250217',
description: 'A lightweight model that is fast and smart for logic-based tasks. Supports function calling and structured outputs.',
contextWindow: 131072,
maxCompletionTokens: undefined,
@@ -214,6 +247,7 @@ const _knownXAIChatModels: ManualMappings = [
{
idPrefix: 'grok-code-fast-1',
label: 'Grok Code Fast 1',
pubDate: '20250828',
description: 'Specialized reasoning model for agentic coding workflows. Fast, economical, and optimized for code generation, debugging, and software development tasks.',
contextWindow: 256000,
maxCompletionTokens: undefined,
@@ -227,6 +261,7 @@ const _knownXAIChatModels: ManualMappings = [
{
idPrefix: 'grok-2-vision-1212',
label: 'Grok 2 Vision (1212)',
pubDate: '20241212',
description: 'xAI model grok-2-vision-1212 with image and text input capabilities. Supports text generation with a 32,768 token context window.',
contextWindow: 32768,
maxCompletionTokens: undefined,
@@ -320,6 +355,7 @@ export async function xaiFetchModelDescriptions(access: OpenAIAccessSchema): Pro
// manual sort order - desired display order
const _xaiIdStartsWithOrder = [
'grok-4.3',
'grok-4.20-0309-reasoning',
'grok-4.20-0309-non-reasoning',
'grok-4.20-multi-agent-0309',
@@ -32,6 +32,7 @@ const _knownZAIModels: ManualMappings = [
{
idPrefix: 'glm-5',
label: 'GLM-5',
pubDate: '20260211',
description: 'Z.ai flagship foundation model (744B MoE, 40B activated). Designed for Agentic Engineering with SOTA coding and agent capabilities. 200K context, thinking mode.',
contextWindow: 204800, // 200K
interfaces: _IF_Reasoning,
@@ -43,6 +44,7 @@ const _knownZAIModels: ManualMappings = [
{
idPrefix: 'glm-5-code',
label: 'GLM-5 Code',
// pubDate: UNCONFIRMED - 'glm-5-code' not in Z.ai pricing table or release-notes; Z.ai's coding plan documents GLM-5.1 / GLM-5-Turbo / GLM-4.7 / GLM-4.5-Air, no 'glm-5-code'
description: 'GLM-5 optimized for coding tasks. Uses the dedicated Coding endpoint. 200K context, thinking mode.',
contextWindow: 204800, // 200K
interfaces: _IF_Reasoning,
@@ -58,6 +60,7 @@ const _knownZAIModels: ManualMappings = [
{
idPrefix: 'glm-4.7',
label: 'GLM-4.7',
pubDate: '20251222',
description: 'Latest-gen GLM model with 128K context. Thinking mode activated by default.',
contextWindow: 131072, // 128K
interfaces: _IF_Reasoning,
@@ -69,6 +72,7 @@ const _knownZAIModels: ManualMappings = [
{
idPrefix: 'glm-4.7-flashx',
label: 'GLM-4.7 FlashX', // fast, low cost
pubDate: '20260119',
description: 'Fast GLM-4.7 variant with priority routing and higher concurrency. Same model as Flash, better infrastructure.',
contextWindow: 131072,
interfaces: _IF_Reasoning,
@@ -80,6 +84,7 @@ const _knownZAIModels: ManualMappings = [
{
idPrefix: 'glm-4.7-flash',
label: 'GLM-4.7 Flash (Free)',
pubDate: '20260119',
description: 'Free GLM-4.7 variant. Same model as FlashX but with limited concurrency (1 concurrent request) and lower priority.',
contextWindow: 131072,
interfaces: _IF_Reasoning,
@@ -94,6 +99,7 @@ const _knownZAIModels: ManualMappings = [
{
idPrefix: 'glm-4.6v-flashx',
label: 'GLM-4.6 V FlashX',
pubDate: '20251208',
description: 'Fast vision GLM-4.6 with priority routing and higher concurrency. Image/video/file inputs, 32K output.',
contextWindow: 131072,
interfaces: _IF_Vision_Reasoning,
@@ -106,6 +112,7 @@ const _knownZAIModels: ManualMappings = [
{
idPrefix: 'glm-4.6v-flash',
label: 'GLM-4.6 V Flash (Free)',
pubDate: '20251208',
description: 'Free vision GLM-4.6. Same model as FlashX but with limited concurrency (1 concurrent request). Image/video/file inputs, 32K output.',
contextWindow: 131072,
interfaces: _IF_Vision_Reasoning,
@@ -117,6 +124,7 @@ const _knownZAIModels: ManualMappings = [
{
idPrefix: 'glm-4.6v',
label: 'GLM-4.6 V',
pubDate: '20251208',
description: 'Vision-enabled GLM-4.6 model. Supports image/video/file inputs, 32K output, hybrid thinking.',
contextWindow: 131072,
interfaces: _IF_Vision_Reasoning,
@@ -131,6 +139,7 @@ const _knownZAIModels: ManualMappings = [
{
idPrefix: 'glm-4.6',
label: 'GLM-4.6',
pubDate: '20250930',
description: 'GLM-4.6 model with 128K context/output. Hybrid thinking: auto-determines whether to engage deep reasoning.',
contextWindow: 131072,
interfaces: _IF_Reasoning,
@@ -144,6 +153,7 @@ const _knownZAIModels: ManualMappings = [
{
idPrefix: 'glm-ocr',
label: 'GLM-OCR (Vision, OCR)',
pubDate: '20260203',
description: 'Specialized OCR model for text extraction from images and documents.',
contextWindow: 131072,
interfaces: [LLM_IF_OAI_Chat, LLM_IF_OAI_Vision, LLM_IF_HOTFIX_NoWebP],
@@ -158,6 +168,7 @@ const _knownZAIModels: ManualMappings = [
{
idPrefix: 'glm-4.5v',
label: 'GLM-4.5 V',
pubDate: '20250811',
description: 'Vision-enabled GLM-4.5 model. 96K context, 16K output, interleaved thinking.',
contextWindow: 98304, // 96K
interfaces: _IF_Vision_Reasoning,
@@ -173,6 +184,7 @@ const _knownZAIModels: ManualMappings = [
{
idPrefix: 'glm-4.5-flash',
label: 'GLM-4.5 Flash (Free)',
pubDate: '20250728',
description: 'Free GLM-4.5 variant with limited concurrency. Prior-gen, superseded by GLM-4.7 Flash.',
contextWindow: 98304,
interfaces: _IF_Reasoning,
@@ -185,6 +197,7 @@ const _knownZAIModels: ManualMappings = [
{
idPrefix: 'glm-4.5-airx',
label: 'GLM-4.5 AirX',
pubDate: '20250728',
description: 'Extended lightweight GLM-4.5 variant. Interleaved thinking.',
contextWindow: 98304,
interfaces: _IF_Reasoning,
@@ -197,6 +210,7 @@ const _knownZAIModels: ManualMappings = [
{
idPrefix: 'glm-4.5-air',
label: 'GLM-4.5 Air',
pubDate: '20250728',
description: 'Lightweight GLM-4.5 variant. Interleaved thinking.',
contextWindow: 98304,
interfaces: _IF_Reasoning,
@@ -209,6 +223,7 @@ const _knownZAIModels: ManualMappings = [
{
idPrefix: 'glm-4.5-x',
label: 'GLM-4.5 X',
pubDate: '20250728',
description: 'Extended GLM-4.5 model. Interleaved thinking.',
contextWindow: 98304,
interfaces: _IF_Reasoning,
@@ -221,6 +236,7 @@ const _knownZAIModels: ManualMappings = [
{
idPrefix: 'glm-4.5',
label: 'GLM-4.5',
pubDate: '20250728',
description: 'Prior-gen GLM-4.5 model with 96K context/output. Interleaved thinking.',
contextWindow: 98304,
interfaces: _IF_Reasoning,
@@ -234,6 +250,7 @@ const _knownZAIModels: ManualMappings = [
{
idPrefix: 'glm-4-32b-0414-128k',
label: 'GLM-4 32B (0414) 128K',
pubDate: '20250414',
description: 'GLM-4 32B model with 128K context, 16K output.',
contextWindow: 131072,
interfaces: _IF_Chat,
@@ -6,4 +6,7 @@ cd "$(dirname "$0")/../../.."
# Run with npx tsx (will download on-demand if needed)
# Uses npx cache, lightweight and no local install required
exec npx -y tsx tools/data/llms/llm-registry-sync.ts "$@"
npx -y tsx tools/data/llms/llm-registry-sync.ts "$@"
# Then dump a fresh JSON snapshot next to the DB.
exec npx -y tsx tools/data/llms/llm-registry-sync.ts --export-db tools/data/llms/llm-registry.json
@@ -41,6 +41,7 @@ interface CliOptions {
discordWebhook?: string;
notifyFilters?: string;
validate?: boolean;
exportDbPath?: string; // --export-db <path>: read-only DB dump (no API calls, no sync)
}
interface StoredModel {
@@ -53,6 +54,7 @@ interface StoredModel {
deleted_at: string | null;
created: number | null;
updated: number | null;
pub_date: string | null;
context_window: number | null;
max_completion_tokens: number | null;
interfaces: string | null;
@@ -90,6 +92,13 @@ function extractSimplePrice(price: any): number | null {
return null;
}
/** Idempotent schema migration: adds a column if it doesn't already exist. Safe to call on every run. */
function ensureColumn(db: DatabaseSync, table: string, column: string, columnDef: string): void {
const cols = db.prepare(`PRAGMA table_info(${table})`).all() as Array<{ name: string }>;
if (!cols.some((c) => c.name === column))
db.exec(`ALTER TABLE ${table} ADD COLUMN ${column} ${columnDef}`);
}
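// Usage (see initDatabase below): ensureColumn(db, 'models', 'pub_date', 'TEXT');
// once the column exists the PRAGMA check short-circuits, so calling this on every
// startup is a safe no-op rather than a failing ALTER TABLE.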
function initDatabase(): DatabaseSync {
const db = new DatabaseSync(DB_PATH);
@@ -105,6 +114,7 @@ function initDatabase(): DatabaseSync {
deleted_at TEXT,
created INTEGER,
updated INTEGER,
pub_date TEXT,
context_window INTEGER,
max_completion_tokens INTEGER,
interfaces TEXT,
@@ -131,6 +141,9 @@ function initDatabase(): DatabaseSync {
)
`);
// Migrations for existing DBs (safe no-ops on fresh DBs that already have the column from CREATE TABLE).
ensureColumn(db, 'models', 'pub_date', 'TEXT');
return db;
}
@@ -157,15 +170,16 @@ function saveChanges(
): void {
if (changes.new.length > 0) {
const stmt = db.prepare(`
-INSERT INTO models (id, vendor, service, label, first_seen, last_seen, created, updated,
+INSERT INTO models (id, vendor, service, label, first_seen, last_seen, created, updated, pub_date,
context_window, max_completion_tokens, interfaces, description,
benchmark_elo, benchmark_mmlu, price_input, price_output, original_json, deleted_at)
-VALUES (?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, NULL)
+VALUES (?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, NULL)
ON CONFLICT (id, vendor, service) DO UPDATE SET
label = excluded.label,
last_seen = excluded.last_seen,
created = excluded.created,
updated = excluded.updated,
pub_date = excluded.pub_date,
context_window = excluded.context_window,
max_completion_tokens = excluded.max_completion_tokens,
interfaces = excluded.interfaces,
@@ -188,6 +202,7 @@ function saveChanges(
timestamp,
model.created ?? null,
model.updated ?? null,
model.pubDate ?? null,
model.contextWindow ?? null,
model.maxCompletionTokens ?? null,
model.interfaces ? JSON.stringify(model.interfaces) : null,
@@ -208,6 +223,7 @@ function saveChanges(
last_seen = ?,
created = ?,
updated = ?,
pub_date = ?,
context_window = ?,
max_completion_tokens = ?,
interfaces = ?,
@@ -229,6 +245,7 @@ function saveChanges(
timestamp,
model.created ?? null,
model.updated ?? null,
model.pubDate ?? null,
model.contextWindow ?? null,
model.maxCompletionTokens ?? null,
model.interfaces ? JSON.stringify(model.interfaces) : null,
@@ -247,11 +264,13 @@ function saveChanges(
if (changes.unchanged.length > 0) {
const stmt = db.prepare(`
-INSERT INTO models (id, vendor, service, label, first_seen, last_seen, created, updated,
+INSERT INTO models (id, vendor, service, label, first_seen, last_seen, created, updated, pub_date,
context_window, max_completion_tokens, interfaces, description,
benchmark_elo, benchmark_mmlu, price_input, price_output, original_json, deleted_at)
-VALUES (?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, NULL)
-ON CONFLICT (id, vendor, service) DO UPDATE SET last_seen = excluded.last_seen
+VALUES (?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, NULL)
+ON CONFLICT (id, vendor, service) DO UPDATE SET
+  last_seen = excluded.last_seen,
+  pub_date = excluded.pub_date
`);
for (const model of changes.unchanged) {
@@ -264,6 +283,7 @@ function saveChanges(
timestamp,
model.created ?? null,
model.updated ?? null,
model.pubDate ?? null,
model.contextWindow ?? null,
model.maxCompletionTokens ?? null,
model.interfaces ? JSON.stringify(model.interfaces) : null,
@@ -310,6 +330,114 @@ function saveSyncHistory(
);
}
// ============================================================================
// Snapshot Export
// ============================================================================
interface CatalogModel {
id: string;
vendor: string;
service: string;
label: string;
pubDate: string | null;
firstSeen: string;
lastSeen: string;
deletedAt: string | null;
created: number | null;
updated: number | null;
contextWindow: number | null;
maxCompletionTokens: number | null;
interfaces: string[] | null;
description: string | null;
benchmarkElo: number | null;
priceInput: number | null;
priceOutput: number | null;
}
interface CatalogSnapshot {
schemaVersion: number;
exportedAt: string;
totalCount: number;
activeCount: number;
deletedCount: number;
byVendor: Record<string, number>;
models: CatalogModel[];
}
/** Dump the entire registry (active + soft-deleted) to a JSON file. Read-only on the DB. */
function exportSnapshot(db: DatabaseSync, outPath: string): void {
const rows = db.prepare(`
SELECT id, vendor, service, label, pub_date, first_seen, last_seen, deleted_at,
created, updated, context_window, max_completion_tokens, interfaces, description,
benchmark_elo, price_input, price_output
FROM models
ORDER BY vendor, service, id
`).all() as unknown as Array<StoredModel & { interfaces: string | null }>;
const byVendor: Record<string, number> = {};
let activeCount = 0;
let deletedCount = 0;
const models: CatalogModel[] = rows.map((r) => {
byVendor[r.vendor] = (byVendor[r.vendor] || 0) + 1;
if (r.deleted_at) deletedCount++;
else activeCount++;
let parsedInterfaces: string[] | null = null;
if (r.interfaces) {
try {
const parsed = JSON.parse(r.interfaces);
if (Array.isArray(parsed)) parsedInterfaces = parsed;
} catch {
// leave null on parse failure
}
}
return {
id: r.id,
vendor: r.vendor,
service: r.service,
label: r.label,
pubDate: r.pub_date,
firstSeen: r.first_seen,
lastSeen: r.last_seen,
deletedAt: r.deleted_at,
created: r.created,
updated: r.updated,
contextWindow: r.context_window,
maxCompletionTokens: r.max_completion_tokens,
interfaces: parsedInterfaces,
description: r.description,
benchmarkElo: r.benchmark_elo,
priceInput: r.price_input,
priceOutput: r.price_output,
};
});
const snapshot: CatalogSnapshot = {
schemaVersion: 1,
exportedAt: new Date().toISOString(),
totalCount: rows.length,
activeCount,
deletedCount,
byVendor,
models,
};
// Write atomically: write to temp, then rename. Avoids partial reads if a consumer is watching.
const dir = path.dirname(path.resolve(outPath));
if (!fs.existsSync(dir)) fs.mkdirSync(dir, { recursive: true });
const tmpPath = `${outPath}.tmp`;
fs.writeFileSync(tmpPath, JSON.stringify(snapshot, null, 2));
fs.renameSync(tmpPath, outPath);
console.log(
`${COLORS.green}✓ Exported${COLORS.reset} ${rows.length} models ` +
`(${activeCount} active, ${deletedCount} deleted) ` +
`${COLORS.dim}-> ${path.resolve(outPath)}${COLORS.reset}`,
);
}
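For orientation, a snapshot written by exportSnapshot would look roughly like this (shape from CatalogSnapshot above; all values invented):

    const example: CatalogSnapshot = {
      schemaVersion: 1,
      exportedAt: '2026-05-05T08:33:06.000Z',
      totalCount: 2,
      activeCount: 1,
      deletedCount: 1,
      byVendor: { openai: 1, zai: 1 },
      models: [/* CatalogModel entries, ordered by vendor, service, id */],
    };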
// ============================================================================
// Change Detection
// ============================================================================
@@ -353,6 +481,9 @@ function detectChanges(
existingModel.context_window !== (model.contextWindow ?? null) ||
existingModel.max_completion_tokens !== (model.maxCompletionTokens ?? null) ||
existingModel.interfaces !== modelInterfaces;
// NOTE: pub_date intentionally EXCLUDED from change detection. On first run after upgrade,
// all rows go from NULL -> editorial value, which would fire ~hundreds of spurious "updated"
// notifications. The unchanged-touch path below silently backfills pub_date instead.
if (hasChanged) {
changes.updated.push(model);
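To make the silent backfill concrete, a sketch of the predicate above with invented values — pub_date differs (NULL in the DB vs. a new editorial date) but none of the compared fields do, so the model lands in changes.unchanged and the unchanged-path upsert fills in pub_date without firing a notification:

    const existingModel = { context_window: 98304, max_completion_tokens: 98304, interfaces: '["oai-chat"]' };
    const model = { contextWindow: 98304, maxCompletionTokens: 98304 };
    const modelInterfaces = '["oai-chat"]'; // invented value
    const hasChanged =
      existingModel.context_window !== (model.contextWindow ?? null) ||
      existingModel.max_completion_tokens !== (model.maxCompletionTokens ?? null) ||
      existingModel.interfaces !== modelInterfaces;
    console.log(hasChanged); // false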
@@ -542,6 +673,10 @@ function parseArgs(): CliOptions {
case '--validate':
options.validate = true;
break;
case '--export-db':
options.exportDbPath = nextArg;
i++;
break;
}
}
@@ -566,6 +701,7 @@ ${COLORS.bright}Options:${COLORS.reset}
--posthog-key <key> PostHog API key for analytics
--discord-webhook <url> Discord webhook URL
--notify-filters <list> Comma-separated vendor list (e.g., openai,anthropic)
--export-db <path> Read-only DB dump to JSON (no API calls, no sync). Run separately from sync.
--help Show this help
${COLORS.bright}Examples:${COLORS.reset}
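With the new flag, the snapshot step shown in the wrapper script above amounts to: npx -y tsx tools/data/llms/llm-registry-sync.ts --export-db tools/data/llms/llm-registry.json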
@@ -961,6 +1097,17 @@ async function main() {
try {
const options = parseArgs();
// --export-db: read-only DB dump. No config, no sync, no API calls.
if (options.exportDbPath) {
const db = initDatabase();
try {
exportSnapshot(db, options.exportDbPath);
} finally {
db.close();
}
return;
}
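The try/finally guarantees the handle closes even when the export throws (an unwritable path, say). The same resource pattern as a generic sketch — withDb is illustrative, not in the script:

    function withDb<T>(fn: (db: DatabaseSync) => T): T {
      const db = initDatabase();
      try {
        return fn(db); // finally runs whether fn returns or throws
      } finally {
        db.close();
      }
    }

    // e.g. withDb((db) => exportSnapshot(db, options.exportDbPath!));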
let servicesConfig: Record<string, AixAPI_Access>;
if (options.config) {