Compare commits

...

19 Commits

Author SHA1 Message Date
claude[bot] 04a83247ee feat: Implement TTS vendor abstraction system
Adds support for multiple TTS providers (OpenAI, ElevenLabs) with vendor abstraction pattern similar to LLM vendors.

Core changes:
- Created /src/modules/tts/ module with vendor abstraction
- Implemented ITTSVendor interface for unified TTS API
- Added vendor implementations for ElevenLabs and OpenAI TTS
- Created store-tts.ts for service and voice configuration
- Implemented unified tts.client.ts for vendor-agnostic speech
- Added OpenAI TTS tRPC router with streaming support
- Updated PersonaChatMessageSpeak to use new TTS client
- Added migration logic for existing ElevenLabs configs
- Updated data.ts to support new voice configuration format

Technical details:
- Service-scoped pattern: activeServiceId + activeVoiceId
- Backward compatible with existing elevenLabs voice configs
- Auto-import capability from LLM configurations
- Supports streaming and non-streaming TTS
- Vendor-specific features handled gracefully

Relates to #858

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-authored-by: Enrico Ros <enricoros@users.noreply.github.com>
2025-10-29 13:31:32 +00:00
Enrico Ros 98bd3d6da0 LLMs: Ollama: Update models 2025-10-28 16:36:43 -07:00
Enrico Ros cd5ec8d295 LLMs: Perplexity: Update models 2025-10-28 16:34:24 -07:00
Enrico Ros f91c6456bd LLMs: xAI: Update models 2025-10-28 16:31:53 -07:00
Enrico Ros 67af87968e workflows: CC: ollama update 2025-10-28 16:30:48 -07:00
Enrico Ros 58ea3e1b35 workflows: CC: permissions 2025-10-28 16:27:15 -07:00
Enrico Ros a9435c10e8 LLMs: OpenPipe: Update models 2025-10-28 16:23:40 -07:00
Enrico Ros a86860fe76 LLMs: Groq: Update models 2025-10-28 16:19:40 -07:00
Enrico Ros a3d707f78a LLMs: Mistral: Update models 2025-10-28 16:17:34 -07:00
Enrico Ros c502426249 LLMs: Anthropic: Update models 2025-10-28 16:17:06 -07:00
Enrico Ros 2fb5ffcecf LLMs: Anthropic: remove retired Claude 2 models 2025-10-28 16:09:36 -07:00
Enrico Ros 6d995c1253 LLMs: Anthropic: remove retired Sonnet 3 models 2025-10-28 16:08:39 -07:00
Enrico Ros a860c1c490 LLMs: Anthropic: remove retired Sonnet 3.5 models - So long and thanks!! 2025-10-28 16:06:42 -07:00
Enrico Ros 481d9cc745 LLMs: Anthropic: only display 'obsoleted models' in dev builds 2025-10-28 16:03:02 -07:00
Enrico Ros 7e53a7bc2b Server: tRPC: Retriers: carve-out 429 quota 2025-10-28 15:59:05 -07:00
Enrico Ros 4df10e3782 Lint 2025-10-28 15:59:05 -07:00
Enrico Ros 396da65178 AIX: OpenRouter: don't display processing messages 2025-10-28 15:49:37 -07:00
Enrico Ros 87e8faf383 workflows: docker: limit to 1hr 2025-10-28 13:11:28 -07:00
Enrico Ros 9eb3e6d398 workflows: CC: raise to 30min 2025-10-28 13:11:21 -07:00
29 changed files with 1161 additions and 314 deletions
@@ -29,6 +29,7 @@ The parser outputs: `modelName|pulls|capabilities|sizes`
**Important:**
- Skip models below 50,000 pulls (parser does this automatically)
- Skip embedding models (parser does not do this automatically)
- Sort them in the EXACT same order as the source (featured models)
- Extract tags: 'tools' → hasTools, 'vision' → hasVision, 'embedding' → isEmbeddings (note the 's'), 'thinking' → tags only
- Extract 'b' tags (1.5b, 7b, 32b) to tags field
+2
View File
@@ -3,6 +3,7 @@
"allow": [
"Bash(cat:*)",
"Bash(cp:*)",
"Bash(curl:*)",
"Bash(find:*)",
"Bash(git branch:*)",
"Bash(git describe:*)",
@@ -20,6 +21,7 @@
"Bash(rg:*)",
"Bash(rm:*)",
"Bash(sed:*)",
"Read(//tmp/**)"
"WebFetch",
"WebFetch(domain:big-agi.com)",
"WebSearch",
+1 -1
View File
@@ -19,7 +19,7 @@ jobs:
(github.event_name == 'pull_request_review_comment' && contains(github.event.comment.body, '@claude'))
runs-on: ubuntu-latest
timeout-minutes: 20
timeout-minutes: 30
permissions:
contents: read
+1 -1
View File
@@ -12,7 +12,7 @@ jobs:
!contains(github.event.issue.body, '@claude')
runs-on: ubuntu-latest
timeout-minutes: 20
timeout-minutes: 30
permissions:
contents: read
+1
View File
@@ -23,6 +23,7 @@ env:
jobs:
build-and-push-image:
runs-on: ubuntu-latest
timeout-minutes: 60 # Max 1 hour (expected: ~25min)
permissions:
contents: read
packages: write
+1 -1
View File
@@ -60,7 +60,7 @@ Shows only parameters that are:
The AIX client transforms DLLM parameters to wire protocol format. This layer handles parameter precedence rules and name transformations:
```typescript
```
// Parameter precedence: newer 4-value version takes priority over 3-value
...((llmVndOaiReasoningEffort4 || llmVndOaiReasoningEffort) ?
{ vndOaiReasoningEffort: llmVndOaiReasoningEffort4 || llmVndOaiReasoningEffort } : {})
@@ -1,4 +1,4 @@
import { elevenLabsSpeakText } from '~/modules/elevenlabs/elevenlabs.client';
import { speakText } from '~/modules/tts/tts.client';
import { isTextContentFragment } from '~/common/stores/chat/chat.fragments';
@@ -59,6 +59,6 @@ export class PersonaChatMessageSpeak implements PersonaProcessorInterface {
console.log('📢 TTS:', text);
this.spokenLine = true;
// fire/forget: we don't want to stall this loop
void elevenLabsSpeakText(text, undefined, false, true);
void speakText(text, { streaming: false, turbo: true });
}
}
+5 -1
View File
@@ -14,7 +14,11 @@ export type SystemPurposeData = {
examples?: SystemPurposeExample[];
highlighted?: boolean;
call?: { starters?: string[] };
voices?: { elevenLabs?: { voiceId: string } };
voices?: {
tts?: { voiceId?: string };
// Legacy support for existing configs
elevenLabs?: { voiceId: string };
};
};
export type SystemPurposeExample = string | { prompt: string, action?: 'require-data-attachment' };
@@ -141,7 +141,8 @@ export function createFastEventSourceDemuxer(): AixDemuxers.StreamDemuxer {
// if the line starts with a colon, ignore
const colonIndex = line.indexOf(':');
if (colonIndex === 0) {
if (AIX_SECURITY_ONLY_IN_DEV_BUILDS)
// [OpenRouter, 2025-10-28] sends many processing strings that we may ignore here
if (AIX_SECURITY_ONLY_IN_DEV_BUILDS && line !== ': OPENROUTER PROCESSING')
console.log('[DEV] fast-sse-demuxer: SSE Comment (may ignore):', line.slice(line.startsWith(': ') ? 2 : 1));
continue;
}
@@ -94,7 +94,7 @@ export const hardcodedAnthropicModels: (ModelDescriptionSchema & { isLegacy?: bo
description: 'Best model for complex agents and coding, with the highest intelligence across most tasks',
contextWindow: 200000,
maxCompletionTokens: 64000,
trainingDataCutoff: 'Jul 2025',
trainingDataCutoff: 'Jan 2025',
interfaces: [LLM_IF_OAI_Chat, LLM_IF_OAI_Vision, LLM_IF_OAI_Fn, LLM_IF_ANT_PromptCaching],
parameterSpecs: [...ANT_PAR_WEB, { paramId: 'llmVndAnt1MContext' }, { paramId: 'llmVndAntSkills' }],
// Note: Tiered pricing - ≤200K: $3/$15, >200K: $6/$22.50 (with 1M context enabled)
@@ -117,7 +117,7 @@ export const hardcodedAnthropicModels: (ModelDescriptionSchema & { isLegacy?: bo
description: 'Fastest model with exceptional speed and performance',
contextWindow: 200000,
maxCompletionTokens: 64000,
trainingDataCutoff: 'Jul 2025',
trainingDataCutoff: 'Feb 2025',
interfaces: [LLM_IF_OAI_Chat, LLM_IF_OAI_Vision, LLM_IF_OAI_Fn, LLM_IF_ANT_PromptCaching],
parameterSpecs: [...ANT_PAR_WEB, { paramId: 'llmVndAntSkills' }],
chatPrice: { input: 1, output: 5, cache: { cType: 'ant-bp', read: 0.10, write: 1.25, duration: 300 } },
@@ -130,7 +130,7 @@ export const hardcodedAnthropicModels: (ModelDescriptionSchema & { isLegacy?: bo
description: 'Exceptional model for specialized complex tasks requiring advanced reasoning',
contextWindow: 200000,
maxCompletionTokens: 32000,
trainingDataCutoff: 'Mar 2025',
trainingDataCutoff: 'Jan 2025',
interfaces: [LLM_IF_OAI_Chat, LLM_IF_OAI_Vision, LLM_IF_OAI_Fn, LLM_IF_ANT_PromptCaching],
parameterSpecs: ANT_PAR_WEB,
chatPrice: { input: 15, output: 75, cache: { cType: 'ant-bp', read: 1.50, write: 18.75, duration: 300 } },
@@ -177,9 +177,9 @@ export const hardcodedAnthropicModels: (ModelDescriptionSchema & { isLegacy?: bo
// Claude 3.7 models
{
id: 'claude-3-7-sonnet-20250219', // Active | Guaranteed Until: February 2026
label: 'Claude Sonnet 3.7',
description: 'High-performance model with early extended thinking',
id: 'claude-3-7-sonnet-20250219', // Deprecated | Deprecated: October 28, 2025 | Retiring: February 19, 2026
label: 'Claude Sonnet 3.7 [Deprecated]',
description: 'High-performance model with early extended thinking. Deprecated October 28, 2025, retiring February 19, 2026.',
contextWindow: 200000,
maxCompletionTokens: 64000,
trainingDataCutoff: 'Nov 2024',
@@ -187,35 +187,13 @@ export const hardcodedAnthropicModels: (ModelDescriptionSchema & { isLegacy?: bo
parameterSpecs: ANT_PAR_WEB,
chatPrice: { input: 3, output: 15, cache: { cType: 'ant-bp', read: 0.30, write: 3.75, duration: 300 } },
benchmark: { cbaElo: 1369 }, // claude-3-7-sonnet-20250219
},
// Claude 3.5 models
{
id: 'claude-3-5-sonnet-20241022', // Deprecated | Deprecated: August 13, 2025 | Retiring: October 22, 2025
label: 'Claude Sonnet 3.5 [Deprecated]',
description: 'High level of intelligence and capability. Deprecated August 13, 2025, retiring October 22, 2025.',
contextWindow: 200000,
maxCompletionTokens: 8192,
trainingDataCutoff: 'Jul 2024',
interfaces: [LLM_IF_OAI_Chat, LLM_IF_OAI_Vision, LLM_IF_OAI_Fn, LLM_IF_ANT_PromptCaching],
chatPrice: { input: 3, output: 15, cache: { cType: 'ant-bp', read: 0.30, write: 3.75, duration: 300 } },
benchmark: { cbaElo: 1368, cbaMmlu: 88.7 }, // Claude 3.5 Sonnet (10/22)
hidden: true, // deprecated
isLegacy: true,
},
{
id: 'claude-3-5-sonnet-20240620', // Deprecated | Deprecated: August 13, 2025 | Retiring: October 22, 2025
label: 'Claude Sonnet 3.5 (previous) [Deprecated]',
description: 'Previous version of Claude Sonnet 3.5. Deprecated August 13, 2025, retiring October 22, 2025.',
contextWindow: 200000,
maxCompletionTokens: 8192,
trainingDataCutoff: 'Apr 2024',
interfaces: [LLM_IF_OAI_Chat, LLM_IF_OAI_Vision, LLM_IF_OAI_Fn, LLM_IF_ANT_PromptCaching],
chatPrice: { input: 3, output: 15, cache: { cType: 'ant-bp', read: 0.30, write: 3.75, duration: 300 } },
benchmark: { cbaElo: 1340, cbaMmlu: 88.6 },
hidden: true,
isLegacy: true,
},
// Claude 3.5 models
// retired: 'claude-3-5-sonnet-20241022'
// retired: 'claude-3-5-sonnet-20240620'
{
id: 'claude-3-5-haiku-20241022', // Active | Guaranteed Until: October 2025
label: 'Claude Haiku 3.5',
@@ -244,6 +222,7 @@ export const hardcodedAnthropicModels: (ModelDescriptionSchema & { isLegacy?: bo
isLegacy: true,
},
{
hidden: true, // yield to successors
id: 'claude-3-haiku-20240307', // Active
label: 'Claude Haiku 3',
description: 'Fast and compact model for near-instant responsiveness',
@@ -256,43 +235,7 @@ export const hardcodedAnthropicModels: (ModelDescriptionSchema & { isLegacy?: bo
},
// Legacy/Retired models
{
id: 'claude-3-sonnet-20240229', // Retired | Retired: July 21, 2025
label: 'Claude Sonnet 3 [Retired]',
description: 'Balance of intelligence and speed. Retired July 21, 2025.',
contextWindow: 200000,
maxCompletionTokens: 4096,
trainingDataCutoff: 'Aug 2023',
interfaces: [LLM_IF_OAI_Chat, LLM_IF_OAI_Vision],
chatPrice: { input: 3, output: 15 },
benchmark: { cbaElo: 1283, cbaMmlu: 79 },
hidden: true,
isLegacy: true,
},
{
id: 'claude-2.1', // Retired | Retired: July 21, 2025
label: 'Claude 2.1 [Retired]',
description: 'Updated version of Claude 2 with improved accuracy. Retired July 21, 2025.',
contextWindow: 200000,
maxCompletionTokens: 4096,
trainingDataCutoff: 'Early 2023',
interfaces: [LLM_IF_OAI_Chat],
chatPrice: { input: 8, output: 24 },
benchmark: { cbaElo: 1118 },
hidden: true,
isLegacy: true,
},
{
id: 'claude-2.0', // Retired | Retired: July 21, 2025
label: 'Claude 2 [Retired]',
description: 'Predecessor to Claude 3, offering strong all-round performance. Retired July 21, 2025.',
contextWindow: 100000,
maxCompletionTokens: 4096,
trainingDataCutoff: 'Early 2023',
interfaces: [LLM_IF_OAI_Chat],
chatPrice: { input: 8, output: 24 },
benchmark: { cbaElo: 1132, cbaMmlu: 78.5 },
hidden: true,
isLegacy: true,
},
// retired: 'claude-3-sonnet-20240229'
// retired: 'claude-2.1'
// retired: 'claude-2.0'
];
@@ -6,6 +6,7 @@ import { env } from '~/server/env';
import { fetchJsonOrTRPCThrow } from '~/server/trpc/trpc.router.fetchers';
import { LLM_IF_ANT_PromptCaching, LLM_IF_OAI_Chat, LLM_IF_OAI_Fn, LLM_IF_OAI_Vision, LLM_IF_Tools_WebSearch } from '~/common/stores/llms/llms.types';
import { Release } from '~/common/app.release';
import { ListModelsResponse_schema, ModelDescriptionSchema } from '../llm.server.types';
@@ -17,6 +18,8 @@ import { fixupHost } from '~/modules/llms/server/openai/openai.router';
const DEFAULT_ANTHROPIC_HOST = 'api.anthropic.com';
const DEFAULT_HELICONE_ANTHROPIC_HOST = 'anthropic.hconeai.com';
const DEV_DEBUG_ANTHROPIC_MODELS = Release.IsNodeDevBuild;
const DEFAULT_ANTHROPIC_HEADERS = {
// Latest version hasn't changed (as of Feb 2025)
'anthropic-version': '2023-06-01',
@@ -65,18 +68,6 @@ const PER_MODEL_BETA_FEATURES: { [modelId: string]: string[] } = {
'computer-use-2025-01-24',
] as const,
'claude-3-5-sonnet-20241022': [
/** computer Tools for Sonnet 3.5 v2 [computer_20241022, text_editor_20241022, bash_20241022] */
'computer-use-2024-10-22',
] as const,
'claude-3-5-sonnet-20240620': [
/** to use the 8192 tokens limit for the FIRST 3.5 Sonnet model */
'max-tokens-3-5-sonnet-2024-07-15',
] as const,
} as const;
type AnthropicHeaderOptions = {
@@ -271,6 +262,7 @@ export const llmAnthropicRouter = createTRPCRouter({
// for day-0 support of new models, create a placeholder model using sensible defaults
const novelModel = _createPlaceholderModel(model);
// if (DEV_DEBUG_ANTHROPIC_MODELS) // kind of important...
console.log('[DEV] anthropic.router: new model found, please configure it:', novelModel.id);
acc.push(novelModel);
@@ -281,10 +273,13 @@ export const llmAnthropicRouter = createTRPCRouter({
.map(_injectWebSearchInterface);
// developers warning for obsoleted models (we have them, but they are not in the API response anymore)
const apiModelIds = new Set(availableModels.map(m => m.id));
const additionalModels = hardcodedAnthropicModels.filter(m => !apiModelIds.has(m.id));
if (additionalModels.length > 0)
console.log('[DEV] anthropic.router: obsoleted models:', additionalModels.map(m => m.id).join(', '));
if (DEV_DEBUG_ANTHROPIC_MODELS) {
const apiModelIds = new Set(availableModels.map(m => m.id));
const additionalModels = hardcodedAnthropicModels.filter(m => !apiModelIds.has(m.id));
if (additionalModels.length > 0)
console.log('[DEV] anthropic.router: obsoleted models:', additionalModels.map(m => m.id).join(', '));
}
// additionalModels.forEach(m => {
// m.label += ' (Removed)';
// m.isLegacy = true;
+145 -144
View File
@@ -12,157 +12,158 @@
>>>
*/
export const OLLAMA_BASE_MODELS: { [key: string]: { pulls: number, contextWindow?: number, hasTools?: true, hasVision?: true, isEmbeddings?: true, tags?: string[], added?: string } } = {
'gpt-oss': { pulls: 3500000, tags: ['thinking', 'cloud', '20b', '120b'], hasTools: true, added: '20251015' },
'deepseek-r1': { pulls: 66200000, tags: ['thinking', '1.5b', '7b', '8b', '14b', '32b', '70b', '671b'], hasTools: true, added: '20250128' },
'gemma3': { pulls: 21000000, tags: ['270m', '1b', '4b', '12b', '27b'], hasVision: true, added: '20251015' },
'qwen3': { pulls: 10100000, tags: ['thinking', '0.6b', '1.7b', '4b', '8b', '14b', '30b', '32b', '235b'], hasTools: true, added: '20251015' },
'deepseek-v3.1': { pulls: 100400, tags: ['thinking', 'cloud', '671b'], hasTools: true, added: '20251015' },
'llama3.1': { pulls: 104300000, tags: ['8b', '70b', '405b'], hasTools: true, added: '20241210' },
'llama3.2': { pulls: 40200000, tags: ['1b', '3b'], hasTools: true, added: '20241210' },
'mistral': { pulls: 20800000, tags: ['7b'], hasTools: true },
'qwen2.5': { pulls: 15100000, tags: ['0.5b', '1.5b', '3b', '7b', '14b', '32b', '72b'], hasTools: true, added: '20241210' },
'phi3': { pulls: 11600000, tags: ['3.8b', '14b'], added: '20240501' },
'llama3': { pulls: 11300000, tags: ['8b', '70b'], added: '20240501' },
'llava': { pulls: 10500000, tags: ['7b', '13b', '34b'], hasVision: true },
'gemma2': { pulls: 8199999, tags: ['2b', '9b', '27b'], added: '20240628' },
'qwen2.5-coder': { pulls: 7600000, tags: ['0.5b', '1.5b', '3b', '7b', '14b', '32b'], hasTools: true, added: '20241210' },
'phi4': { pulls: 5500000, tags: ['14b'], added: '20250128' },
'gpt-oss': { pulls: 3800000, tags: ['thinking', '20b', '120b'], hasTools: true, added: '20251015' },
'deepseek-r1': { pulls: 68300000, tags: ['thinking', '1.5b', '7b', '8b', '14b', '32b', '70b', '671b'], hasTools: true, added: '20250128' },
'qwen3-coder': { pulls: 571000, tags: ['30b', '480b'], hasTools: true, added: '20251015' },
'gemma3': { pulls: 23000000, tags: ['270m', '1b', '4b', '12b', '27b'], hasVision: true, added: '20251015' },
'qwen3': { pulls: 11600000, tags: ['thinking', '0.6b', '1.7b', '4b', '8b', '14b', '30b', '32b', '235b'], hasTools: true, added: '20251015' },
'deepseek-v3.1': { pulls: 119200, tags: ['thinking', '671b'], hasTools: true, added: '20251015' },
'llama3.1': { pulls: 105000000, tags: ['8b', '70b', '405b'], hasTools: true, added: '20241210' },
'llama3.2': { pulls: 42400000, tags: ['1b', '3b'], hasTools: true, added: '20241210' },
'mistral': { pulls: 21300000, tags: ['7b'], hasTools: true },
'qwen2.5': { pulls: 15800000, tags: ['0.5b', '1.5b', '3b', '7b', '14b', '32b', '72b'], hasTools: true, added: '20241210' },
'phi3': { pulls: 12500000, tags: ['3.8b', '14b'], added: '20240501' },
'llama3': { pulls: 11500000, tags: ['8b', '70b'], added: '20240501' },
'llava': { pulls: 11000000, tags: ['7b', '13b', '34b'], hasVision: true },
'gemma2': { pulls: 8600000, tags: ['2b', '9b', '27b'], added: '20240628' },
'qwen2.5-coder': { pulls: 7900000, tags: ['0.5b', '1.5b', '3b', '7b', '14b', '32b'], hasTools: true, added: '20241210' },
'phi4': { pulls: 5800000, tags: ['14b'], added: '20250128' },
'gemma': { pulls: 5400000, tags: ['2b', '7b'], added: '20240501' },
'qwen': { pulls: 5000000, tags: ['0.5b', '1.8b', '4b', '7b', '14b', '32b', '72b', '110b'] },
'qwen2': { pulls: 4400000, tags: ['0.5b', '1.5b', '7b', '72b'], hasTools: true, added: '20240628' },
'llama2': { pulls: 4300000, tags: ['7b', '13b', '70b'] },
'minicpm-v': { pulls: 3700000, tags: ['8b'], hasVision: true, added: '20241210' },
'dolphin3': { pulls: 3200000, tags: ['8b'], added: '20250128' },
'codellama': { pulls: 3200000, tags: ['7b', '13b', '34b', '70b'] },
'olmo2': { pulls: 3100000, tags: ['7b', '13b'], added: '20250312' },
'llama2': { pulls: 4400000, tags: ['7b', '13b', '70b'] },
'minicpm-v': { pulls: 3800000, tags: ['8b'], hasVision: true, added: '20241210' },
'dolphin3': { pulls: 3400000, tags: ['8b'], added: '20250128' },
'codellama': { pulls: 3300000, tags: ['7b', '13b', '34b', '70b'] },
'olmo2': { pulls: 3200000, tags: ['7b', '13b'], added: '20250312' },
'tinyllama': { pulls: 3000000, tags: ['1.1b'] },
'mistral-nemo': { pulls: 2700000, tags: ['12b'], hasTools: true, added: '20241210' },
'llama3.2-vision': { pulls: 2700000, tags: ['11b', '90b'], hasVision: true, added: '20241210' },
'llama3.2-vision': { pulls: 2800000, tags: ['11b', '90b'], hasVision: true, added: '20241210' },
'mistral-nemo': { pulls: 2800000, tags: ['12b'], hasTools: true, added: '20241210' },
'llama3.3': { pulls: 2600000, tags: ['70b'], hasTools: true, added: '20241210' },
'deepseek-v3': { pulls: 2500000, tags: ['671b'], added: '20250128' },
'mistral-small': { pulls: 2000000, tags: ['22b', '24b'], hasTools: true, added: '20250219' },
'smollm2': { pulls: 1900000, tags: ['135m', '360m', '1.7b'], hasTools: true, added: '20241210' },
'llava-llama3': { pulls: 1900000, tags: ['8b'], hasVision: true, added: '20240628' },
'deepseek-v3': { pulls: 2600000, tags: ['671b'], added: '20250128' },
'mistral-small': { pulls: 2100000, tags: ['22b', '24b'], hasTools: true, added: '20250219' },
'smollm2': { pulls: 2000000, tags: ['135m', '360m', '1.7b'], hasTools: true, added: '20241210' },
'llava-llama3': { pulls: 2000000, tags: ['8b'], hasVision: true, added: '20240628' },
'qwq': { pulls: 1700000, tags: ['32b'], hasTools: true, added: '20250312' },
'deepseek-coder': { pulls: 1500000, tags: ['1.3b', '6.7b', '33b'] },
'mixtral': { pulls: 1400000, tags: ['8x7b', '8x22b'], hasTools: true, added: '20250312' },
'deepseek-coder': { pulls: 1600000, tags: ['1.3b', '6.7b', '33b'] },
'starcoder2': { pulls: 1400000, tags: ['3b', '7b', '15b'], added: '20240501' },
'mixtral': { pulls: 1400000, tags: ['8x7b', '8x22b'], hasTools: true, added: '20250312' },
'llama2-uncensored': { pulls: 1300000, tags: ['7b', '70b'] },
'codegemma': { pulls: 1200000, tags: ['2b', '7b'], added: '20240501' },
'deepseek-coder-v2': { pulls: 1100000, tags: ['16b', '236b'], added: '20240628' },
'falcon3': { pulls: 899600, tags: ['1b', '3b', '7b', '10b'], added: '20241210' },
'granite3.1-moe': { pulls: 864300, tags: ['1b', '3b'], hasTools: true, added: '20250128' },
'orca-mini': { pulls: 853700, tags: ['3b', '7b', '13b', '70b'] },
'qwen2.5vl': { pulls: 799500, tags: ['3b', '7b', '32b', '72b'], hasVision: true, added: '20251015' },
'llama4': { pulls: 736800, tags: ['16x17b', '128x17b'], hasVision: true, hasTools: true, added: '20251015' },
'phi': { pulls: 722400, tags: ['2.7b'] },
'dolphin-mixtral': { pulls: 710400, tags: ['8x7b', '8x22b'], added: '20250312' },
'mistral-small3.2': { pulls: 694500, tags: ['24b'], hasVision: true, hasTools: true, added: '20251015' },
'granite3.3': { pulls: 620800, tags: ['2b', '8b'], hasTools: true, added: '20251015' },
'openthinker': { pulls: 602700, tags: ['7b', '32b'], added: '20250219' },
'cogito': { pulls: 586300, tags: ['3b', '8b', '14b', '32b', '70b'], hasTools: true, added: '20251015' },
'gemma3n': { pulls: 582100, tags: ['e2b', 'e4b'], added: '20251015' },
'phi4-reasoning': { pulls: 564600, tags: ['14b'], added: '20251015' },
'magistral': { pulls: 503900, tags: ['thinking', '24b'], hasTools: true, added: '20251015' },
'deepscaler': { pulls: 500000, tags: ['1.5b'], added: '20250219' },
'dolphin-phi': { pulls: 495400, tags: ['2.7b'] },
'qwen3-coder': { pulls: 495100, tags: ['30b', '480b'], hasTools: true, added: '20251015' },
'dolphin-llama3': { pulls: 488000, tags: ['8b', '70b'], added: '20240501' },
'codestral': { pulls: 483500, tags: ['22b'], added: '20240628' },
'smollm': { pulls: 473900, tags: ['135m', '360m', '1.7b'], added: '20241210' },
'wizardlm2': { pulls: 455800, tags: ['7b', '8x22b'], added: '20240501' },
'phi4-mini': { pulls: 432300, tags: ['3.8b'], hasTools: true, added: '20250312' },
'dolphin-mistral': { pulls: 404500, tags: ['7b'] },
'devstral': { pulls: 399900, tags: ['24b'], hasTools: true, added: '20251015' },
'granite3.2-vision': { pulls: 377400, tags: ['2b'], hasVision: true, hasTools: true, added: '20250312' },
'command-r': { pulls: 355900, tags: ['35b'], hasTools: true, added: '20240501' },
'hermes3': { pulls: 341000, tags: ['3b', '8b', '70b', '405b'], hasTools: true, added: '20241210' },
'phi3.5': { pulls: 334900, tags: ['3.8b'], added: '20241210' },
'deepcoder': { pulls: 333200, tags: ['1.5b', '14b'], added: '20251015' },
'mistral-small3.1': { pulls: 319300, tags: ['24b'], hasVision: true, hasTools: true, added: '20251015' },
'yi': { pulls: 305700, tags: ['6b', '9b', '34b'] },
'zephyr': { pulls: 292400, tags: ['7b', '141b'] },
'moondream': { pulls: 274400, tags: ['1.8b'], hasVision: true, added: '20240501' },
'granite-code': { pulls: 272500, tags: ['3b', '8b', '20b', '34b'], added: '20240628' },
'mistral-large': { pulls: 262900, tags: ['123b'], hasTools: true, added: '20241210' },
'wizard-vicuna-uncensored': { pulls: 251300, tags: ['7b', '13b', '30b'] },
'starcoder': { pulls: 227100, tags: ['1b', '3b', '7b', '15b'] },
'deepseek-llm': { pulls: 209300, tags: ['7b', '67b'] },
'nous-hermes': { pulls: 208000, tags: ['7b', '13b'] },
'exaone-deep': { pulls: 207300, tags: ['2.4b', '7.8b', '32b'], added: '20251015' },
'vicuna': { pulls: 204800, tags: ['7b', '13b', '33b'] },
'openchat': { pulls: 202100, tags: ['7b'] },
'falcon': { pulls: 197500, tags: ['7b', '40b', '180b'] },
'deepseek-v2': { pulls: 192600, tags: ['16b', '236b'], added: '20240628' },
'mistral-openorca': { pulls: 188800, tags: ['7b'] },
'codegeex4': { pulls: 187800, tags: ['9b'], added: '20241210' },
'openhermes': { pulls: 187300, tags: [] },
'codeqwen': { pulls: 179700, tags: ['7b'], added: '20240501' },
'opencoder': { pulls: 177900, tags: ['1.5b', '8b'], added: '20241210' },
'qwen2-math': { pulls: 174100, tags: ['1.5b', '7b', '72b'], added: '20241210' },
'llama2-chinese': { pulls: 167900, tags: ['7b', '13b'] },
'aya': { pulls: 165700, tags: ['8b', '35b'], added: '20240628' },
'tinydolphin': { pulls: 163400, tags: ['1.1b'] },
'glm4': { pulls: 160100, tags: ['9b'], added: '20241210' },
'granite3.2': { pulls: 159400, tags: ['2b', '8b'], hasTools: true, added: '20250312' },
'stable-code': { pulls: 156600, tags: ['3b'] },
'nous-hermes2': { pulls: 151400, tags: ['10.7b', '34b'] },
'neural-chat': { pulls: 148900, tags: ['7b'] },
'wizardcoder': { pulls: 147300, tags: ['33b'] },
'command-r-plus': { pulls: 144600, contextWindow: 128000, tags: ['104b'], hasTools: true, added: '20240501' },
'bakllava': { pulls: 142900, tags: ['7b'], hasVision: true },
'sqlcoder': { pulls: 138300, tags: ['7b', '15b'] },
'stablelm2': { pulls: 133800, tags: ['1.6b', '12b'] },
'yi-coder': { pulls: 133100, tags: ['1.5b', '9b'], added: '20241210' },
'llama3-chatqa': { pulls: 129000, tags: ['8b', '70b'], added: '20240628' },
'llava-phi3': { pulls: 126800, tags: ['3.8b'], hasVision: true, added: '20240628' },
'granite3-dense': { pulls: 126100, tags: ['2b', '8b'], hasTools: true, added: '20241210' },
'granite3.1-dense': { pulls: 122300, tags: ['2b', '8b'], hasTools: true, added: '20250128' },
'wizard-math': { pulls: 121100, tags: ['7b', '13b', '70b'] },
'exaone3.5': { pulls: 119500, tags: ['2.4b', '7.8b', '32b'], added: '20241210' },
'reflection': { pulls: 119400, tags: ['70b'], added: '20241210' },
'llama3-gradient': { pulls: 117700, tags: ['8b', '70b'], added: '20240501' },
'r1-1776': { pulls: 117600, tags: ['70b', '671b'], added: '20250312' },
'dbrx': { pulls: 111600, tags: ['132b'], added: '20241210' },
'dolphincoder': { pulls: 111300, tags: ['7b', '15b'], added: '20240501' },
'samantha-mistral': { pulls: 110100, tags: ['7b'] },
'nemotron-mini': { pulls: 108300, tags: ['4b'], hasTools: true, added: '20241210' },
'tulu3': { pulls: 106700, tags: ['8b', '70b'], added: '20241210' },
'starling-lm': { pulls: 103700, tags: ['7b'] },
'phind-codellama': { pulls: 102100, tags: ['34b'] },
'internlm2': { pulls: 102000, tags: ['1m', '1.8b', '7b', '20b'], added: '20241210' },
'solar': { pulls: 101500, tags: ['10.7b'] },
'xwinlm': { pulls: 101000, tags: ['7b', '13b'] },
'athene-v2': { pulls: 99500, tags: ['72b'], hasTools: true, added: '20251015' },
'llama3-groq-tool-use': { pulls: 97900, tags: ['8b', '70b'], hasTools: true, added: '20251015' },
'nemotron': { pulls: 96400, tags: ['70b'], hasTools: true, added: '20251015' },
'yarn-llama2': { pulls: 94400, tags: ['7b', '13b'], added: '20251015' },
'meditron': { pulls: 92700, tags: ['7b', '70b'], added: '20251015' },
'granite3-moe': { pulls: 88400, tags: ['1b', '3b'], hasTools: true, added: '20251015' },
'wizardlm-uncensored': { pulls: 87800, tags: ['13b'], added: '20251015' },
'llama-guard3': { pulls: 87200, tags: ['1b', '8b'], added: '20251015' },
'aya-expanse': { pulls: 86600, tags: ['8b', '32b'], hasTools: true, added: '20251015' },
'smallthinker': { pulls: 84000, tags: ['3b'], added: '20251015' },
'orca2': { pulls: 80700, tags: ['7b', '13b'], added: '20251015' },
'wizardlm': { pulls: 80000, tags: [], added: '20251015' },
'medllama2': { pulls: 78100, tags: ['7b'], added: '20251015' },
'nous-hermes2-mixtral': { pulls: 75100, tags: ['8x7b'], added: '20251015' },
'stable-beluga': { pulls: 74100, tags: ['7b', '13b', '70b'], added: '20251015' },
'deepseek-v2.5': { pulls: 69600, tags: ['236b'], added: '20251015' },
'reader-lm': { pulls: 66100, tags: ['0.5b', '1.5b'], added: '20251015' },
'command-r7b': { pulls: 65099, tags: ['7b'], hasTools: true, added: '20251015' },
'phi4-mini-reasoning': { pulls: 64200, tags: ['3.8b'], added: '20251015' },
'llama-pro': { pulls: 60600, tags: [], added: '20251015' },
'shieldgemma': { pulls: 59800, tags: ['2b', '9b', '27b'], added: '20251015' },
'yarn-mistral': { pulls: 58800, tags: ['7b'], added: '20251015' },
'command-a': { pulls: 58700, tags: ['111b'], hasTools: true, added: '20251015' },
'mathstral': { pulls: 57300, tags: ['7b'], added: '20251015' },
'nexusraven': { pulls: 55800, tags: ['13b'], added: '20251015' },
'everythinglm': { pulls: 55700, tags: ['13b'], added: '20251015' },
'codeup': { pulls: 54400, tags: ['13b'], added: '20251015' },
'marco-o1': { pulls: 53200, tags: ['7b'], added: '20251015' },
'stablelm-zephyr': { pulls: 53000, tags: ['3b'], added: '20251015' },
'solar-pro': { pulls: 50500, tags: ['22b'], added: '20251015' },
'codegemma': { pulls: 1300000, tags: ['2b', '7b'], added: '20240501' },
'deepseek-coder-v2': { pulls: 1200000, tags: ['16b', '236b'], added: '20240628' },
'falcon3': { pulls: 1000000, tags: ['1b', '3b', '7b', '10b'], added: '20241210' },
'granite3.1-moe': { pulls: 1000000, tags: ['1b', '3b'], hasTools: true, added: '20250128' },
'qwen2.5vl': { pulls: 979400, tags: ['3b', '7b', '32b', '72b'], hasVision: true, added: '20251015' },
'orca-mini': { pulls: 925700, tags: ['3b', '7b', '13b', '70b'] },
'llama4': { pulls: 751000, tags: ['16x17b', '128x17b'], hasVision: true, hasTools: true, added: '20251015' },
'phi': { pulls: 730400, tags: ['2.7b'] },
'mistral-small3.2': { pulls: 716800, tags: ['24b'], hasVision: true, hasTools: true, added: '20251015' },
'dolphin-mixtral': { pulls: 716400, tags: ['8x7b', '8x22b'], added: '20250312' },
'gemma3n': { pulls: 664800, tags: ['e2b', 'e4b'], added: '20251015' },
'granite3.3': { pulls: 658300, tags: ['2b', '8b'], hasTools: true, added: '20251015' },
'cogito': { pulls: 654900, tags: ['3b', '8b', '14b', '32b', '70b'], hasTools: true, added: '20251015' },
'phi4-reasoning': { pulls: 621900, tags: ['14b'], added: '20251015' },
'openthinker': { pulls: 604300, tags: ['7b', '32b'], added: '20250219' },
'magistral': { pulls: 578500, tags: ['thinking', '24b'], hasTools: true, added: '20251015' },
'deepscaler': { pulls: 570500, tags: ['1.5b'], added: '20250219' },
'dolphin-phi': { pulls: 566500, tags: ['2.7b'] },
'dolphin-llama3': { pulls: 517500, tags: ['8b', '70b'], added: '20240501' },
'codestral': { pulls: 502700, tags: ['22b'], added: '20240628' },
'smollm': { pulls: 483800, tags: ['135m', '360m', '1.7b'], added: '20241210' },
'wizardlm2': { pulls: 457300, tags: ['7b', '8x22b'], added: '20240501' },
'phi4-mini': { pulls: 457200, tags: ['3.8b'], hasTools: true, added: '20250312' },
'devstral': { pulls: 422700, tags: ['24b'], hasTools: true, added: '20251015' },
'granite3.2-vision': { pulls: 410700, tags: ['2b'], hasVision: true, hasTools: true, added: '20250312' },
'dolphin-mistral': { pulls: 408900, tags: ['7b'] },
'command-r': { pulls: 361200, tags: ['35b'], hasTools: true, added: '20240501' },
'moondream': { pulls: 352200, tags: ['1.8b'], hasVision: true, added: '20240501' },
'deepcoder': { pulls: 351200, tags: ['1.5b', '14b'], added: '20251015' },
'hermes3': { pulls: 343200, tags: ['3b', '8b', '70b', '405b'], hasTools: true, added: '20241210' },
'mistral-small3.1': { pulls: 342900, tags: ['24b'], hasVision: true, hasTools: true, added: '20251015' },
'phi3.5': { pulls: 341100, tags: ['3.8b'], added: '20241210' },
'yi': { pulls: 307200, tags: ['6b', '9b', '34b'] },
'zephyr': { pulls: 293800, tags: ['7b', '141b'] },
'granite-code': { pulls: 293600, tags: ['3b', '8b', '20b', '34b'], added: '20240628' },
'mistral-large': { pulls: 265700, tags: ['123b'], hasTools: true, added: '20241210' },
'wizard-vicuna-uncensored': { pulls: 254900, tags: ['7b', '13b', '30b'] },
'exaone-deep': { pulls: 232400, tags: ['2.4b', '7.8b', '32b'], added: '20251015' },
'starcoder': { pulls: 228600, tags: ['1b', '3b', '7b', '15b'] },
'nous-hermes': { pulls: 224500, tags: ['7b', '13b'] },
'falcon': { pulls: 214400, tags: ['7b', '40b', '180b'] },
'deepseek-llm': { pulls: 211600, tags: ['7b', '67b'] },
'vicuna': { pulls: 206000, tags: ['7b', '13b', '33b'] },
'openchat': { pulls: 204600, tags: ['7b'] },
'deepseek-v2': { pulls: 196500, tags: ['16b', '236b'], added: '20240628' },
'opencoder': { pulls: 193400, tags: ['1.5b', '8b'], added: '20241210' },
'mistral-openorca': { pulls: 189900, tags: ['7b'] },
'codegeex4': { pulls: 188800, tags: ['9b'], added: '20241210' },
'openhermes': { pulls: 188800, tags: [] },
'codeqwen': { pulls: 181900, tags: ['7b'], added: '20240501' },
'qwen2-math': { pulls: 175400, tags: ['1.5b', '7b', '72b'], added: '20241210' },
'llama2-chinese': { pulls: 168800, tags: ['7b', '13b'] },
'aya': { pulls: 167200, tags: ['8b', '35b'], added: '20240628' },
'tinydolphin': { pulls: 165600, tags: ['1.1b'] },
'granite3.2': { pulls: 162200, tags: ['2b', '8b'], hasTools: true, added: '20250312' },
'glm4': { pulls: 161900, tags: ['9b'], added: '20241210' },
'stable-code': { pulls: 157900, tags: ['3b'] },
'nous-hermes2': { pulls: 152700, tags: ['10.7b', '34b'] },
'neural-chat': { pulls: 150400, tags: ['7b'] },
'wizardcoder': { pulls: 148600, tags: ['33b'] },
'command-r-plus': { pulls: 145700, contextWindow: 128000, tags: ['104b'], hasTools: true, added: '20240501' },
'bakllava': { pulls: 145400, tags: ['7b'], hasVision: true },
'sqlcoder': { pulls: 140300, tags: ['7b', '15b'] },
'stablelm2': { pulls: 135000, tags: ['1.6b', '12b'] },
'yi-coder': { pulls: 134100, tags: ['1.5b', '9b'], added: '20241210' },
'llama3-chatqa': { pulls: 132200, tags: ['8b', '70b'], added: '20240628' },
'llava-phi3': { pulls: 128600, tags: ['3.8b'], hasVision: true, added: '20240628' },
'granite3-dense': { pulls: 127700, tags: ['2b', '8b'], hasTools: true, added: '20241210' },
'granite3.1-dense': { pulls: 123900, tags: ['2b', '8b'], hasTools: true, added: '20250128' },
'wizard-math': { pulls: 122000, tags: ['7b', '13b', '70b'] },
'r1-1776': { pulls: 121900, tags: ['70b', '671b'], added: '20250312' },
'exaone3.5': { pulls: 121100, tags: ['2.4b', '7.8b', '32b'], added: '20241210' },
'reflection': { pulls: 120800, tags: ['70b'], added: '20241210' },
'llama3-gradient': { pulls: 118800, tags: ['8b', '70b'], added: '20240501' },
'dbrx': { pulls: 112200, tags: ['132b'], added: '20241210' },
'samantha-mistral': { pulls: 111100, tags: ['7b'] },
'nemotron-mini': { pulls: 109700, tags: ['4b'], hasTools: true, added: '20241210' },
'tulu3': { pulls: 107600, tags: ['8b', '70b'], added: '20241210' },
'starling-lm': { pulls: 104600, tags: ['7b'] },
'internlm2': { pulls: 103100, tags: ['1m', '1.8b', '7b', '20b'], added: '20241210' },
'phind-codellama': { pulls: 103000, tags: ['34b'] },
'solar': { pulls: 102400, tags: ['10.7b'] },
'xwinlm': { pulls: 101700, tags: ['7b', '13b'] },
'athene-v2': { pulls: 100500, tags: ['72b'], hasTools: true, added: '20251015' },
'llama3-groq-tool-use': { pulls: 100100, tags: ['8b', '70b'], hasTools: true, added: '20251015' },
'nemotron': { pulls: 97500, tags: ['70b'], hasTools: true, added: '20251015' },
'yarn-llama2': { pulls: 95100, tags: ['7b', '13b'], added: '20251015' },
'meditron': { pulls: 94000, tags: ['7b', '70b'], added: '20251015' },
'granite3-moe': { pulls: 91700, tags: ['1b', '3b'], hasTools: true, added: '20251015' },
'llama-guard3': { pulls: 90300, tags: ['1b', '8b'], added: '20251015' },
'wizardlm-uncensored': { pulls: 89500, tags: ['13b'], added: '20251015' },
'aya-expanse': { pulls: 88000, tags: ['8b', '32b'], hasTools: true, added: '20251015' },
'smallthinker': { pulls: 85200, tags: ['3b'], added: '20251015' },
'orca2': { pulls: 81500, tags: ['7b', '13b'], added: '20251015' },
'wizardlm': { pulls: 80100, tags: [], added: '20251015' },
'medllama2': { pulls: 79400, tags: ['7b'], added: '20251015' },
'nous-hermes2-mixtral': { pulls: 76000, tags: ['8x7b'], added: '20251015' },
'stable-beluga': { pulls: 74800, tags: ['7b', '13b', '70b'], added: '20251015' },
'deepseek-v2.5': { pulls: 70400, tags: ['236b'], added: '20251015' },
'command-r7b': { pulls: 69200, tags: ['7b'], hasTools: true, added: '20251015' },
'phi4-mini-reasoning': { pulls: 67100, tags: ['3.8b'], added: '20251015' },
'reader-lm': { pulls: 67100, tags: ['0.5b', '1.5b'], added: '20251015' },
'granite4': { pulls: 64400, tags: [], hasTools: true, added: '20251028' },
'llama-pro': { pulls: 61400, tags: [], added: '20251015' },
'shieldgemma': { pulls: 60900, tags: ['2b', '9b', '27b'], added: '20251015' },
'command-a': { pulls: 60200, tags: ['111b'], hasTools: true, added: '20251015' },
'yarn-mistral': { pulls: 59500, tags: ['7b'], added: '20251015' },
'mathstral': { pulls: 58400, tags: ['7b'], added: '20251015' },
'everythinglm': { pulls: 56600, tags: ['13b'], added: '20251015' },
'nexusraven': { pulls: 56500, tags: ['13b'], added: '20251015' },
'codeup': { pulls: 55200, tags: ['13b'], added: '20251015' },
'marco-o1': { pulls: 54000, tags: ['7b'], added: '20251015' },
'stablelm-zephyr': { pulls: 53800, tags: ['3b'], added: '20251015' },
'solar-pro': { pulls: 51300, tags: ['22b'], added: '20251015' },
'falcon2': { pulls: 50200, tags: ['11b'], added: '20251028' },
};
export const OLLAMA_LAST_UPDATE: string = '20251015';
export const OLLAMA_LAST_UPDATE: string = '20251028';
export const OLLAMA_PREV_UPDATE: string = '20251015';
@@ -1,12 +1,8 @@
// here for reference only - for future mapping of CBA scores to the model IDs
// const modelIdToPrefixMap: { [key: string]: string } = {
// // Anthropic models
// 'Claude 3.5 Sonnet': 'claude-3-5-sonnet-20240620',
// 'Claude 3 Opus': 'claude-3-opus-20240229',
// 'Claude 3 Sonnet': 'claude-3-sonnet-20240229',
// 'Claude 3 Haiku': 'claude-3-haiku-20240307',
// 'Claude-2.1': 'claude-2.1',
// 'Claude-2.0': 'claude-2.0',
// 'Claude-1': '', // No exact match
// 'Claude-Instant-1': 'claude-instant-1.2', // Closest match
//
@@ -9,7 +9,7 @@ import { wireGroqModelsListOutputSchema } from '../groq.wiretypes';
* Groq models.
* - models list: https://console.groq.com/docs/models
* - pricing: https://groq.com/pricing/
* - updated: 2025-01-15
* - updated: 2025-10-28
*/
const _knownGroqModels: ManualMappings = [
@@ -93,7 +93,7 @@ const _knownGroqModels: ManualMappings = [
contextWindow: 131072,
maxCompletionTokens: 65536,
interfaces: [LLM_IF_OAI_Chat, LLM_IF_OAI_Fn],
chatPrice: { input: 0.15, output: 0.75 },
chatPrice: { input: 0.15, output: 0.60 },
},
{
idPrefix: 'openai/gpt-oss-20b',
@@ -102,7 +102,7 @@ const _knownGroqModels: ManualMappings = [
contextWindow: 131072,
maxCompletionTokens: 65536,
interfaces: [LLM_IF_OAI_Chat, LLM_IF_OAI_Fn],
chatPrice: { input: 0.10, output: 0.50 },
chatPrice: { input: 0.075, output: 0.30 },
},
// Production Models - SDAIA
@@ -11,7 +11,7 @@ const MISTRAL_DEV_SHOW_GAPS = Release.IsNodeDevBuild;
// [Mistral]
// Updated 2025-10-15
// Updated 2025-10-28
// - models on: https://docs.mistral.ai/getting-started/models/models_overview/
// - pricing on: https://mistral.ai/pricing#api-pricing
// - benchmark elo on CBA
@@ -64,7 +64,8 @@ const _knownMistralModelDetails: Record<string, {
'mistral-small-latest': { chatPrice: { input: 0.1, output: 0.3 }, hidden: true }, // symlink
'mistral-small': { chatPrice: { input: 0.1, output: 0.3 }, hidden: true }, // symlink
'magistral-small-2506': { chatPrice: { input: 0.5, output: 1.5 } },
'magistral-small-2509': { chatPrice: { input: 0.5, output: 1.5 } }, // v25.09
'magistral-small-2506': { chatPrice: { input: 0.5, output: 1.5 }, hidden: true }, // older version
'magistral-small-latest': { chatPrice: { input: 0.5, output: 1.5 }, hidden: true }, // symlink
'devstral-small-2507': { chatPrice: { input: 0.1, output: 0.3 } }, // v25.07
@@ -8,7 +8,7 @@ const _knownOpenPipeChatModels: ModelDescriptionSchema[] = [
/* OpenPipe models - by default it's OpenAI models, through the proxy service. */
// OpenAI models: these work
// OpenAI models: pass-through at standard OpenAI rates
{
id: 'gpt-4o-mini-2024-07-18',
label: '💾➜ GPT-4o Mini (2024-07-18)',
@@ -21,27 +21,16 @@ const _knownOpenPipeChatModels: ModelDescriptionSchema[] = [
benchmark: { cbaMmlu: 82.0 },
},
{
id: 'gpt-4o-2024-05-13',
label: '💾➜ GPT-4o (2024-05-13)',
id: 'gpt-4o-2024-08-06',
label: '💾➜ GPT-4o (2024-08-06)',
description: 'Advanced, multimodal flagship model that\'s cheaper and faster than GPT-4 Turbo.',
contextWindow: 128000,
maxCompletionTokens: 4096,
maxCompletionTokens: 16384,
trainingDataCutoff: 'Oct 2023',
interfaces: [LLM_IF_OAI_Chat, LLM_IF_OAI_Vision, LLM_IF_OAI_Fn, LLM_IF_OAI_Json],
chatPrice: _knownOpenAIChatModels.find(m => m.idPrefix === 'gpt-4o-2024-05-13')?.chatPrice,
chatPrice: _knownOpenAIChatModels.find(m => m.idPrefix === 'gpt-4o-2024-08-06')?.chatPrice,
benchmark: { cbaElo: 1287 },
},
{
id: 'gpt-3.5-turbo-1106',
label: '💾➜ GPT-3.5 Turbo (1106)',
description: 'GPT-3.5 Turbo model from November 2023',
contextWindow: 16385,
maxCompletionTokens: 4096,
trainingDataCutoff: 'Sep 2021',
interfaces: [LLM_IF_OAI_Chat, LLM_IF_OAI_Fn],
chatPrice: _knownOpenAIChatModels.find(m => m.idPrefix === 'gpt-3.5-turbo-1106')?.chatPrice,
benchmark: { cbaElo: 1072 },
},
{
id: 'gpt-3.5-turbo-0125',
label: '💾➜ GPT-3.5 Turbo (0125)',
@@ -54,55 +43,51 @@ const _knownOpenPipeChatModels: ModelDescriptionSchema[] = [
benchmark: { cbaElo: 1105 },
},
// Not supported yet "We don't support streaming responses for chat completions with Anthropic yet. Please email us at support@openpipe.ai if this is a feature you need!"
// {
// id: 'claude-3-5-sonnet-20240620',
// label: '💾➜ Claude 3.5 Sonnet',
// description: 'The most intelligent Claude model',
// contextWindow: 200000, // Characters
// maxCompletionTokens: 8192,
// trainingDataCutoff: 'Apr 2024',
// interfaces: [LLM_IF_OAI_Chat, LLM_IF_OAI_Vision],
// pricing: { input: 3, output: 15 },
// },
// Google Gemini models: pass-through at standard Google rates
{
id: 'gemini-1.0-pro-001',
label: '💾➜ Gemini 1.0 Pro',
description: 'Google\'s Gemini 1.0 Pro model',
contextWindow: 32768,
interfaces: [LLM_IF_OAI_Chat, LLM_IF_OAI_Fn],
},
{
id: 'gemini-1.5-flash-001',
label: '💾➜ Gemini 1.5 Flash',
description: 'Google\'s Gemini 1.5 Flash model - fast and efficient',
contextWindow: 1000000,
interfaces: [LLM_IF_OAI_Chat, LLM_IF_OAI_Vision, LLM_IF_OAI_Fn],
},
// Default finetune, not available at the onset
// {
// id: 'mistral-ft-optimized-1227',
// label: 'OpenPipe · Mistral FT Optimized',
// description: 'OpenPipe optimized Mistral fine-tuned model',
// contextWindow: 32768, // Assuming similar to Mixtral, as it's Mistral-based
// interfaces: [LLM_IF_OAI_Chat, LLM_IF_OAI_Fn], // Assuming similar to Mixtral
// },
// Finetune-able models, but not present
// {
// id: 'meta-llama/Meta-Llama-3.1-8B-Instruct',
// label: 'Meta-Llama 3.1 · 8B Instruct',
// description: 'Meta-Llama 3.1 8B Instruct model',
// contextWindow: 128000, // Inferred from Llama 3 models in the original code
// maxCompletionTokens: 4096, // Inferred from Llama 3 models in the original code
// interfaces: [LLM_IF_OAI_Chat, LLM_IF_OAI_Fn, LLM_IF_OAI_Json], // Inferred from Llama 3 models
// },
// {
// id: 'meta-llama/Meta-Llama-3.1-70B-Instruct',
// label: 'Meta-Llama 3.1 · 70B Instruct',
// description: 'Meta-Llama 3.1 70B Instruct model',
// contextWindow: 128000, // Inferred from Llama 3 models in the original code
// maxCompletionTokens: 4096, // Inferred from Llama 3 models in the original code
// interfaces: [LLM_IF_OAI_Chat, LLM_IF_OAI_Fn, LLM_IF_OAI_Json], // Inferred from Llama 3 models
// },
// {
// id: 'mistralai/Mixtral-8x7B-Instruct-v0.1',
// label: 'Mixtral · 8x7B Instruct v0.1',
// description: 'Mixtral 8x7B Instruct v0.1 model',
// contextWindow: 32768, // Inferred from Mixtral model in the original code
// interfaces: [LLM_IF_OAI_Chat, LLM_IF_OAI_Fn], // Inferred from Mixtral model
// },
// Hosted inference models with OpenPipe pricing
{
id: 'meta-llama/Meta-Llama-3.1-8B-Instruct',
label: '💾 Llama 3.1 · 8B Instruct',
description: 'Meta Llama 3.1 8B Instruct - hosted inference with per-token pricing',
contextWindow: 128000,
interfaces: [LLM_IF_OAI_Chat, LLM_IF_OAI_Fn, LLM_IF_OAI_Json],
chatPrice: { input: 0.30, output: 0.45 },
},
{
id: 'meta-llama/Meta-Llama-3.1-70B-Instruct',
label: '💾 Llama 3.1 · 70B Instruct',
description: 'Meta Llama 3.1 70B Instruct - hosted inference with per-token pricing',
contextWindow: 128000,
interfaces: [LLM_IF_OAI_Chat, LLM_IF_OAI_Fn, LLM_IF_OAI_Json],
chatPrice: { input: 1.80, output: 2.00 },
},
{
id: 'Qwen/Qwen2.5-7B-Instruct',
label: '💾 Qwen 2.5 · 7B Instruct',
description: 'Alibaba Qwen 2.5 7B Instruct - hosted inference with per-token pricing',
contextWindow: 131072,
interfaces: [LLM_IF_OAI_Chat, LLM_IF_OAI_Fn, LLM_IF_OAI_Json],
chatPrice: { input: 1.00, output: 1.50 },
},
];
const openPipeModelFamilyOrder = [
'gpt-4o', 'gpt-3.5-turbo', 'mistral-ft', 'meta-llama', 'mistralai', '',
'gpt-4o', 'gpt-3.5-turbo', 'gemini', 'meta-llama', 'Qwen', 'mistralai', '',
];
export function openPipeModelDescriptions() {
@@ -25,8 +25,7 @@ const _knownPerplexityChatModels: ModelDescriptionSchema[] = [
chatPrice: {
input: 2,
output: 8,
// Full pricing: $2/1M input, $8/1M output, $5/1k searches, $3/1M reasoning tokens
// Note: Citation tokens no longer charged (removed April 2025)
// Full pricing: $2/1M input, $8/1M output, $2/1M citations, $5/1k searches, $3/1M reasoning tokens
},
},
@@ -11,7 +11,7 @@ import { openAIAccess, OpenAIAccessSchema } from '../openai.router';
// Known xAI Models - Manual Mappings
// List on: https://docs.x.ai/docs/models?cluster=us-east-1
// Verified: 2025-10-15
// Verified: 2025-10-28
const _knownXAIChatModels: ManualMappings = [
// Grok 4
@@ -140,9 +140,10 @@ const _knownXAIChatModels: ManualMappings = [
interfaces: [],
},
{
hidden: true, // Not listed in official docs as of 2025-10-28
idPrefix: 'grok-2-1212',
label: 'Grok 2 (1212)',
description: 'xAI model grok-2-1212 with text input capabilities. Supports text generation with a 131,072 token context window.',
description: 'xAI model grok-2-1212 with text input capabilities. Supports text generation with a 131,072 token context window. (Not available as of October 2025)',
contextWindow: 131072,
maxCompletionTokens: undefined,
interfaces: [LLM_IF_OAI_Chat, LLM_IF_OAI_Fn, LLM_IF_OAI_Json],
+57
View File
@@ -0,0 +1,57 @@
import type { BackendCapabilities } from '~/modules/backend/store-backend-capabilities';
import type { DTTSService, TTSGenerationOptions, TTSSpeakResult, TTSVendorId, TTSVoice } from './tts.types';
/**
 * TTS Vendor Interface - abstraction for all TTS providers
 * Similar to IModelVendor but adapted for TTS services
 *
 * A vendor is a stateless singleton describing one provider (ElevenLabs,
 * OpenAI, ...): identity/UI metadata, capability flags, and the transport
 * functions used to speak text or list voices.
 *
 * @typeParam TServiceSettings - shape of the per-service settings persisted in the TTS store
 * @typeParam TAccess - transport access object passed to the rpc* functions
 */
export interface ITTSVendor<TServiceSettings extends Record<string, any> = {}, TAccess = unknown> {
  readonly id: TTSVendorId;
  readonly name: string;
  readonly displayRank: number; // Display order in UI (ascending - see findAllTTSVendors)
  readonly location: 'local' | 'cloud';
  readonly brandColor?: string; // optional UI accent color

  // Server configuration detection
  // key into BackendCapabilities that is true when the server holds credentials for this vendor
  readonly hasServerConfigKey?: keyof BackendCapabilities;

  // Capability flags
  readonly capabilities: {
    streaming: boolean; // can yield incremental audio chunks
    voiceCloning?: boolean;
    speedControl?: boolean;
    listVoices: boolean; // supports rpcListVoices
  };

  /// Abstraction interface ///

  /**
   * Initialize default settings for a new service
   */
  initializeSetup?(): TServiceSettings;

  /**
   * Validate service setup (client-side); return false to mark the setup as unusable
   */
  validateSetup?(setup: TServiceSettings): boolean;

  /**
   * Get transport access configuration from setup
   */
  getTransportAccess(setup?: Partial<TServiceSettings>): TAccess;

  /**
   * RPC: Speak text using this vendor's TTS service.
   * Returns an async iterable of stream pieces - shape is vendor-specific
   * (audio / audioChunk / control / errorMessage; see tts.client.ts consumption).
   */
  rpcSpeak(
    access: TAccess,
    options: TTSGenerationOptions,
  ): Promise<AsyncIterable<any>>;

  /**
   * RPC: List available voices (if supported)
   */
  rpcListVoices?(access: TAccess): Promise<{ voices: TTSVoice[] }>;
}
+195
View File
@@ -0,0 +1,195 @@
import * as z from 'zod/v4';
import { createTRPCRouter, publicProcedure } from '~/server/trpc/trpc.server';
import { env } from '~/server/env';
import { fetchResponseOrTRPCThrow } from '~/server/trpc/trpc.router.fetchers';
// Configuration
const SAFETY_TEXT_LENGTH = 4096; // OpenAI limit: max input characters per request
const MIN_CHUNK_SIZE = 4096; // Minimum chunk size in bytes accumulated before yielding a streaming piece

// Schema definitions

/**
 * Input for the OpenAI TTS speech procedure.
 * `access` fields are optional; empty values fall back to server environment
 * configuration inside openaiTTSAccess().
 */
export const openaiTTSSpeechInputSchema = z.object({
  access: z.object({
    oaiKey: z.string().optional(),
    oaiHost: z.string().optional(),
    oaiOrgId: z.string().optional(),
  }),
  text: z.string(),
  voice: z.enum(['alloy', 'echo', 'fable', 'onyx', 'nova', 'shimmer']).default('alloy'),
  model: z.enum(['tts-1', 'tts-1-hd']).default('tts-1'),
  speed: z.number().min(0.25).max(4.0).optional(), // range accepted by the OpenAI API
  format: z.enum(['mp3', 'opus', 'aac', 'flac', 'wav', 'pcm']).optional(),
  streaming: z.boolean().default(false), // true: yield audioChunk pieces; false: yield one audio piece
});

export type OpenAITTSSpeechInputSchema = z.infer<typeof openaiTTSSpeechInputSchema>;
export const openaiTTSRouter = createTRPCRouter({

  /**
   * Speech synthesis procedure using OpenAI TTS API.
   *
   * Async-generator mutation; yields a sequence of pieces:
   *   { control: 'start' }                 - always first
   *   { warningMessage: string }           - non-fatal (e.g. input truncation)
   *   { errorMessage: string }             - fatal; the generator returns right after
   *   { audio: { base64, contentType } }   - whole clip (streaming: false)
   *   { audioChunk: { base64 } }           - incremental chunk (streaming: true)
   *   { control: 'end' }                   - normal completion
   */
  speech: publicProcedure
    .input(openaiTTSSpeechInputSchema)
    .mutation(async function* ({ input, ctx }) {

      // Start streaming back
      yield { control: 'start' };

      let text = input.text;

      // Safety check: trim text that's too long (OpenAI rejects over-length input)
      if (text.length > SAFETY_TEXT_LENGTH) {
        text = text.slice(0, SAFETY_TEXT_LENGTH);
        yield { warningMessage: 'text was truncated to maximum length' };
      }

      let response: Response;
      try {
        // Prepare the upstream request
        const { headers, url } = openaiTTSAccess(input.access);
        const body: OpenAITTSWire.TTSRequest = {
          input: text,
          voice: input.voice,
          model: input.model,
          response_format: input.format || 'mp3',
          // omit `speed` entirely when unset so the API default applies
          ...(input.speed ? { speed: input.speed } : {}),
        };

        // Blocking fetch - aborted when the client request is cancelled (ctx.reqSignal)
        response = await fetchResponseOrTRPCThrow({
          url,
          method: 'POST',
          headers,
          body,
          signal: ctx.reqSignal,
          name: 'OpenAI TTS',
        });
      } catch (error: any) {
        yield { errorMessage: `fetch issue: ${error.message || 'Unknown error'}` };
        return;
      }

      // If not streaming, return the entire audio in a single piece
      if (!input.streaming) {
        const audioArrayBuffer = await response.arrayBuffer();
        yield {
          audio: {
            base64: Buffer.from(audioArrayBuffer).toString('base64'),
            contentType: response.headers.get('content-type') || 'audio/mpeg',
          },
        };
        yield { control: 'end' };
        return;
      }

      const reader = response.body?.getReader();
      if (!reader) {
        yield { errorMessage: 'stream issue: No reader' };
        return;
      }

      // STREAM the audio chunks back to the client
      try {

        // Initialize a buffer to accumulate chunks - batches small network
        // reads so we don't yield many tiny base64 pieces
        const accumulatedChunks: Uint8Array[] = [];
        let accumulatedSize = 0;

        // Read loop
        while (true) {
          const { value, done: readerDone } = await reader.read();
          if (readerDone) break;
          if (!value) continue;

          // Accumulate chunks
          accumulatedChunks.push(value);
          accumulatedSize += value.length;

          // When accumulated size reaches or exceeds MIN_CHUNK_SIZE, yield the chunk
          if (accumulatedSize >= MIN_CHUNK_SIZE) {
            yield {
              audioChunk: {
                base64: Buffer.concat(accumulatedChunks).toString('base64'),
              },
            };
            // Reset the accumulation
            accumulatedChunks.length = 0;
            accumulatedSize = 0;
          }
        }

        // If there's any remaining data, yield it as well
        if (accumulatedSize) {
          yield {
            audioChunk: {
              base64: Buffer.concat(accumulatedChunks).toString('base64'),
            },
          };
        }

      } catch (error: any) {
        yield { errorMessage: `stream issue: ${error.message || 'Unknown error'}` };
        return;
      }

      // End streaming
      yield { control: 'end' };
    }),

});
/**
 * Helper function to construct OpenAI TTS API access details.
 *
 * Resolution: request-provided key/host win, then server environment
 * (OPENAI_API_KEY / OPENAI_API_HOST), then the official endpoint.
 * The host is normalized to an https scheme and stripped of trailing slashes.
 *
 * @throws Error when no API key is available from either source
 * @returns headers (Authorization, optional OpenAI-Organization) and the full speech endpoint URL
 */
export function openaiTTSAccess(access: OpenAITTSSpeechInputSchema['access']): { headers: HeadersInit; url: string } {
  // API key: request value wins, then the server environment
  const apiKey = (access.oaiKey || env.OPENAI_API_KEY || '').trim();
  if (!apiKey) {
    throw new Error('Missing OpenAI API key.');
  }

  // API host: add a scheme if missing, drop any trailing slashes
  let host = (access.oaiHost || env.OPENAI_API_HOST || 'api.openai.com').trim();
  if (!host.startsWith('http')) {
    host = `https://${host}`;
  }
  host = host.replace(/\/+$/, '');

  // Build headers as a string record so optional headers can be added safely
  // (indexing into the HeadersInit union type is a type error)
  const headers: Record<string, string> = {
    'Accept': 'audio/*',
    'Content-Type': 'application/json',
    'Authorization': `Bearer ${apiKey}`,
  };

  // Add org ID if provided
  if (access.oaiOrgId) {
    headers['OpenAI-Organization'] = access.oaiOrgId;
  }

  return {
    headers,
    url: `${host}/v1/audio/speech`,
  };
}
/// OpenAI TTS API Wire Types
export namespace OpenAITTSWire {
  /** Request body for POST /v1/audio/speech (OpenAI Audio API) */
  export interface TTSRequest {
    input: string; // text to synthesize (capped at 4096 chars upstream)
    voice: 'alloy' | 'echo' | 'fable' | 'onyx' | 'nova' | 'shimmer';
    model: 'tts-1' | 'tts-1-hd';
    response_format?: 'mp3' | 'opus' | 'aac' | 'flac' | 'wav' | 'pcm'; // defaults to mp3 when omitted
    speed?: number; // 0.25 to 4.0
  }
}
+110
View File
@@ -0,0 +1,110 @@
//
// WARNING: Everything here is data at rest. Know what you're doing.
//
import { create } from 'zustand';
import { persist } from 'zustand/middleware';
import type { ITTSVendor } from './ITTSVendor';
import type { DTTSService, TTSServiceId, TTSVendorId } from './tts.types';
/// TTSStore - a store for configured TTS services and settings

/** Persisted state: configured service instances plus the global active selection. */
export interface TTSStoreState {

  // TTS services (configured instances of TTS vendors)
  services: DTTSService<any>[];

  // Global active service and voice (null until picked by the user or migration)
  activeServiceId: TTSServiceId | null;
  activeVoiceId: string | null;

}

/** Store actions: service CRUD and active-selection setters. */
interface TTSStoreActions {

  // Service management
  createService: (vendor: ITTSVendor) => DTTSService;
  removeService: (id: TTSServiceId) => void;
  updateServiceSettings: <TServiceSettings>(id: TTSServiceId, partialSettings: Partial<TServiceSettings>) => void;

  // Active selection
  setActiveServiceId: (id: TTSServiceId | null) => void;
  setActiveVoiceId: (voiceId: string | null) => void;

}

type TTSStore = TTSStoreState & TTSStoreActions;
export const useTTSStore = create<TTSStore>()(persist(
(set, get) => ({
// Initial state
services: [],
activeServiceId: null,
activeVoiceId: null,
// Actions
createService: (vendor: ITTSVendor) => {
const service: DTTSService = {
id: `${vendor.id}-${Date.now()}`,
label: vendor.name,
vId: vendor.id,
setup: vendor.initializeSetup?.() || {},
};
set(state => ({
services: [...state.services, service],
}));
return service;
},
removeService: (id: TTSServiceId) =>
set(state => {
const newServices = state.services.filter(s => s.id !== id);
return {
services: newServices,
// Clear active service if it was removed
activeServiceId: state.activeServiceId === id ? null : state.activeServiceId,
};
}),
updateServiceSettings: <TServiceSettings>(id: TTSServiceId, partialSettings: Partial<TServiceSettings>) =>
set(state => ({
services: state.services.map(service =>
service.id === id
? { ...service, setup: { ...service.setup, ...partialSettings } }
: service,
),
})),
setActiveServiceId: (id: TTSServiceId | null) =>
set({ activeServiceId: id }),
setActiveVoiceId: (voiceId: string | null) =>
set({ activeVoiceId: voiceId }),
}),
{
name: 'app-tts',
}),
));
// Helper functions for accessing TTS store

/** Non-reactive snapshot of the current TTS store state. */
export function getTTSStoreState(): TTSStoreState {
  const state: TTSStoreState = useTTSStore.getState();
  return state;
}
/** Look up a configured TTS service by id; null when not found. */
export function getTTSService(serviceId: TTSServiceId): DTTSService | null {
  const match = useTTSStore.getState().services.find(({ id }) => id === serviceId);
  return match ?? null;
}
/** Resolve the globally-active TTS service; null when none is selected or it no longer exists. */
export function getActiveTTSService(): DTTSService | null {
  const { services, activeServiceId } = useTTSStore.getState();
  return !activeServiceId ? null : (services.find(({ id }) => id === activeServiceId) ?? null);
}
+195
View File
@@ -0,0 +1,195 @@
import { getBackendCapabilities } from '~/modules/backend/store-backend-capabilities';
import { AudioLivePlayer } from '~/common/util/audio/AudioLivePlayer';
import { AudioPlayer } from '~/common/util/audio/AudioPlayer';
import { convert_Base64_To_UInt8Array } from '~/common/util/blobUtils';
import { useUIPreferencesStore } from '~/common/stores/store-ui';
import { SystemPurposes, type SystemPurposeId } from '~/data';
import { findTTSVendor } from './vendors.registry';
import { getActiveTTSService, getTTSService, useTTSStore } from './store-tts.ts';
import type { TTSGenerationOptions, TTSSpeakResult, TTSServiceId } from './tts.types';
/**
 * Persona-specific TTS voice preference, if any.
 *
 * Reads the persona's new `voices.tts.voiceId` first, then falls back to the
 * legacy `voices.elevenLabs.voiceId`. Only `voiceId` is ever populated today;
 * `serviceId` is part of the return shape for future per-persona overrides.
 */
function getPersonaTTSConfig(personaId?: SystemPurposeId): { serviceId?: TTSServiceId; voiceId?: string } | null {
  if (!personaId)
    return null;

  const voices = SystemPurposes[personaId]?.voices;
  if (!voices)
    return null;

  // prefer the new unified tts config, then the legacy ElevenLabs config
  const voiceId = voices.tts?.voiceId || voices.elevenLabs?.voiceId;
  return voiceId ? { voiceId } : null;
}
/**
 * Main TTS invocation function - vendor-agnostic.
 * Speaks text using the configured TTS service.
 *
 * Service/voice resolution order (highest wins): persona preference (when the
 * persona defines one) -> explicit options -> global store defaults.
 *
 * Playback: streamed vendor pieces ({ audioChunk }) are fed to an
 * AudioLivePlayer; a single full buffer ({ audio }) plays via AudioPlayer.
 *
 * @param text - text to synthesize; empty/whitespace-only input fails fast
 * @param options - overrides for service, voice, persona, streaming, turbo, speed
 * @returns success=true when playback of at least one piece started;
 *          audioBase64 carries the full clip only in non-streaming mode
 */
export async function speakText(
  text: string,
  options?: {
    serviceId?: TTSServiceId; // Override global service
    voiceId?: string; // Override global voice
    personaId?: SystemPurposeId; // Use persona's voice preference
    streaming?: boolean;
    turbo?: boolean;
    speed?: number;
  },
): Promise<TTSSpeakResult> {

  // Early validation: nothing to speak
  if (!text?.trim()) {
    return { success: false };
  }

  // 1. Resolve service (removed the unused `services` destructure)
  const { activeServiceId, activeVoiceId } = useTTSStore.getState();

  let serviceId = options?.serviceId;
  let voiceId = options?.voiceId;

  // Check persona configuration - persona values, when set, take precedence
  if (options?.personaId) {
    const personaConfig = getPersonaTTSConfig(options.personaId);
    if (personaConfig) {
      serviceId = personaConfig.serviceId || serviceId;
      voiceId = personaConfig.voiceId || voiceId;
    }
  }

  // Fall back to global defaults
  serviceId = serviceId || activeServiceId || undefined;
  voiceId = voiceId || activeVoiceId || undefined;

  if (!serviceId) {
    console.warn('TTS: No service configured');
    return { success: false };
  }

  const service = getTTSService(serviceId);
  if (!service) {
    console.warn('TTS: Service not found:', serviceId);
    return { success: false };
  }

  // 2. Get vendor implementation
  const vendor = findTTSVendor(service.vId);
  if (!vendor) {
    console.warn('TTS: Vendor not found:', service.vId);
    return { success: false };
  }

  // 3. Get transport access
  const access = vendor.getTransportAccess(service.setup);

  // 4. Prepare generation options
  const { preferredLanguage } = useUIPreferencesStore.getState();
  const nonEnglish = !(preferredLanguage?.toLowerCase()?.startsWith('en'));

  const generationOptions: TTSGenerationOptions = {
    text,
    voiceId,
    streaming: options?.streaming ?? false,
    turbo: options?.turbo ?? false,
    speed: options?.speed,
    nonEnglish,
  };

  // 5. Execute TTS and play the resulting audio
  try {
    const stream = await vendor.rpcSpeak(access, generationOptions);

    let liveAudioPlayer: AudioLivePlayer | undefined;
    let playbackStarted = false;
    let audioBase64: string | undefined;

    for await (const piece of stream) {

      // Streaming audio chunk
      if (piece.audioChunk) {
        try {
          if (!liveAudioPlayer) {
            liveAudioPlayer = new AudioLivePlayer();
          }
          const chunkArray = convert_Base64_To_UInt8Array(piece.audioChunk.base64, 'tts.client (chunk)');
          // NOTE(review): passes the array's underlying .buffer - assumes the
          // conversion returns an exactly-sized, unshared buffer; confirm in blobUtils
          liveAudioPlayer.enqueueChunk(chunkArray.buffer);
          playbackStarted = true;
        } catch (audioError) {
          console.error('TTS audio chunk error:', audioError);
          return { success: false };
        }
      }

      // Full audio buffer
      else if (piece.audio) {
        try {
          if (!options?.streaming) {
            audioBase64 = piece.audio.base64; // keep the full clip for the caller
          }
          const audioArray = convert_Base64_To_UInt8Array(piece.audio.base64, 'tts.client');
          void AudioPlayer.playBuffer(audioArray.buffer); // fire-and-forget playback
          playbackStarted = true;
        } catch (audioError) {
          console.error('TTS audio buffer error:', audioError);
          return { success: false };
        }
      }

      // Errors, warnings, and control messages from the vendor stream
      else if (piece.errorMessage) {
        console.error('TTS error:', piece.errorMessage);
        return { success: false, error: piece.errorMessage };
      } else if (piece.warningMessage) {
        console.warn('TTS warning:', piece.warningMessage);
      } else if (piece.control === 'start' || piece.control === 'end') {
        // Control messages - nothing to do
      }
    }

    return { success: playbackStarted, audioBase64 };
  } catch (error) {
    console.error('TTS playback error:', error);
    return { success: false, error: String(error) };
  }
}
/**
 * Whether TTS can be used right now: either the active client-side service
 * passes vendor validation (a missing vendor or validator counts as valid -
 * only an explicit `false` rejects), or the backend exposes server-configured
 * ElevenLabs credentials.
 */
export function isTTSAvailable(): boolean {
  const { services, activeServiceId } = useTTSStore.getState();

  // 1. an active, configured service whose setup isn't explicitly invalid
  const activeService = !activeServiceId ? undefined : services.find(({ id }) => id === activeServiceId);
  if (activeService && findTTSVendor(activeService.vId)?.validateSetup?.(activeService.setup) !== false)
    return true;

  // 2. server-side TTS capability
  return getBackendCapabilities().hasVoiceElevenLabs;
}
+87
View File
@@ -0,0 +1,87 @@
import { getBackendCapabilities } from '~/modules/backend/store-backend-capabilities';
import { getElevenLabsData } from '~/modules/elevenlabs/store-module-elevenlabs';
import { useModelsStore } from '~/common/stores/llms/store-llms';
import { findTTSVendor } from './vendors.registry';
import { useTTSStore } from './store-tts';
import type { TTSVendorId } from './tts.types';
/**
 * Migrates existing TTS configurations to the new TTS store.
 * This should be called once on app initialization.
 *
 * Idempotent: bails out when any TTS service already exists (migration or
 * user configuration already happened). Also removed an unused
 * `activeServiceId` destructure from the original.
 */
export function migrateTTSServices() {

  // Skip if already migrated (has existing services)
  if (useTTSStore.getState().services.length > 0)
    return;

  // 1. Migrate from existing ElevenLabs configuration (client key or server-side capability)
  const { elevenLabsApiKey, elevenLabsVoiceId } = getElevenLabsData();
  const { hasVoiceElevenLabs } = getBackendCapabilities();

  if (elevenLabsApiKey || hasVoiceElevenLabs) {
    const elevenLabsVendor = findTTSVendor('elevenlabs');
    if (elevenLabsVendor) {
      const service = useTTSStore.getState().createService(elevenLabsVendor);

      // Carry over the client-side API key, if one was configured
      if (elevenLabsApiKey) {
        useTTSStore.getState().updateServiceSettings(service.id, {
          elevenKey: elevenLabsApiKey,
        });
      }

      // Make the migrated service the active one
      useTTSStore.getState().setActiveServiceId(service.id);

      // Carry over the previously selected voice
      if (elevenLabsVoiceId) {
        useTTSStore.getState().setActiveVoiceId(elevenLabsVoiceId);
      }

      console.log('TTS: Migrated ElevenLabs configuration to new TTS store');
    }
  }

  // 2. Auto-import from OpenAI LLM configuration
  autoImportTTSFromLLMs();
}
/**
 * Auto-imports TTS services from configured LLM services.
 *
 * Currently: when an OpenAI LLM service with an API key exists and no OpenAI
 * TTS service already uses that same key, creates one and copies the
 * key/host/org credentials over.
 */
export function autoImportTTSFromLLMs() {
  const { sources } = useModelsStore.getState();
  const { services } = useTTSStore.getState();

  // need an OpenAI LLM service that has an API key configured
  const openaiLLMService = sources.find(s => s.vId === 'openai');
  const llmKey = openaiLLMService?.setup?.oaiKey;
  if (!openaiLLMService || !llmKey)
    return;

  // already imported? (an OpenAI TTS service with the same key)
  const alreadyImported = services.some(s => s.vId === 'openai' && s.setup.oaiKey === llmKey);
  if (alreadyImported)
    return;

  const openaiTTSVendor = findTTSVendor('openai');
  if (!openaiTTSVendor)
    return;

  // create the TTS service and copy the credentials from the LLM service
  const service = useTTSStore.getState().createService(openaiTTSVendor);
  useTTSStore.getState().updateServiceSettings(service.id, {
    oaiKey: llmKey,
    oaiHost: openaiLLMService.setup.oaiHost,
    oaiOrgId: openaiLLMService.setup.oaiOrgId,
  });
  console.log('TTS: Auto-imported OpenAI TTS service from LLM configuration');
}
+65
View File
@@ -0,0 +1,65 @@
//
// TTS Core Types
//

// Unique id of a configured TTS service instance (assigned by the TTS store)
export type TTSServiceId = string;

// Known TTS provider implementations
export type TTSVendorId = 'elevenlabs' | 'openai';

/**
 * Audio formats supported by TTS services
 */
export type TTSAudioFormat = 'mp3' | 'opus' | 'aac' | 'flac' | 'wav' | 'pcm';

/**
 * Voice representation (unified across all vendors)
 */
export interface TTSVoice {
  id: string;           // vendor-scoped voice identifier
  name: string;         // human-readable display name
  description?: string;
  previewUrl?: string;  // sample audio clip, when the vendor provides one
  language?: string;
  category?: string;
}

/**
 * Options for TTS generation (superset of all vendor capabilities)
 */
export interface TTSGenerationOptions {

  // Core parameters (all vendors)
  text: string;
  voiceId?: string; // vendor/service default applies when omitted

  // Common optional parameters
  speed?: number; // 0.25-4.0 (OpenAI TTS)
  format?: TTSAudioFormat; // Output audio format
  streaming?: boolean; // Enable streaming

  // Advanced parameters (vendor-specific, optional)
  turbo?: boolean; // ElevenLabs: use turbo model
  nonEnglish?: boolean; // ElevenLabs: use multilingual model

}

/**
 * Result of TTS generation
 */
export interface TTSSpeakResult {
  success: boolean;
  audioBase64?: string; // Available when not streaming
  error?: string;
}

/**
 * TTS Service - configured instance of a TTS vendor
 */
export interface DTTSService<TServiceSettings extends object = {}> {
  id: TTSServiceId;
  label: string;

  // service -> vendor of that service
  vId: TTSVendorId;

  // service-specific settings
  setup: Partial<TServiceSettings>;
}
+25
View File
@@ -0,0 +1,25 @@
import { TTSVendorElevenLabs } from './vendors/elevenlabs/elevenlabs.vendor';
import { TTSVendorOpenAI } from './vendors/openai/openai-tts.vendor';
import type { ITTSVendor } from './ITTSVendor';
import type { TTSVendorId } from './tts.types';
/** Global: TTS Vendor Instances Registry **/
// Annotated (not cast) as Record<TTSVendorId, ITTSVendor> so the compiler verifies that
// every declared TTSVendorId has a registered implementation. The previous trailing
// `as Record<string, ITTSVendor>` widening silently disabled that exhaustiveness check:
// adding a new id to TTSVendorId without registering a vendor would have compiled.
const TTS_VENDOR_REGISTRY: Record<TTSVendorId, ITTSVendor> = {
  elevenlabs: TTSVendorElevenLabs,
  openai: TTSVendorOpenAI,
};
/**
 * Returns all registered TTS vendors, ordered by ascending displayRank.
 */
export function findAllTTSVendors(): ITTSVendor[] {
  // Object.values yields a fresh array, so sorting in place cannot affect the registry
  return Object.values(TTS_VENDOR_REGISTRY).sort((left, right) => left.displayRank - right.displayRank);
}
/**
 * Resolves a vendor implementation by id.
 * Returns null for a missing/undefined id or an unregistered vendor.
 */
export function findTTSVendor<TServiceSettings extends object = {}, TAccess = unknown>(
  vendorId?: TTSVendorId,
): ITTSVendor<TServiceSettings, TAccess> | null {
  if (!vendorId)
    return null;
  const vendor = TTS_VENDOR_REGISTRY[vendorId] as ITTSVendor<TServiceSettings, TAccess>;
  return vendor ?? null;
}
+82
View File
@@ -0,0 +1,82 @@
import type { BackendCapabilities } from '~/modules/backend/store-backend-capabilities';
import { apiStream } from '~/common/util/trpc.client';
import type { ITTSVendor } from '../../ITTSVendor';
import type { TTSGenerationOptions, TTSVoice } from '../../tts.types';
// ElevenLabs Service Settings
export interface ElevenLabsServiceSettings {
  /** ElevenLabs API key; validateSetup accepts empty (server-side key) or >= 32 chars after trim. */
  elevenKey?: string;
  /** Optional API host override; '' by default — presumably the server falls back to the default host (confirm server-side). */
  elevenHost?: string;
}
// ElevenLabs Access (for RPC calls)
// Same shape as the settings: getTransportAccess copies the fields verbatim.
export interface ElevenLabsAccess {
  elevenKey?: string;
  elevenHost?: string;
}
/**
 * ElevenLabs TTS vendor: speech and voice listing via the `elevenlabs` tRPC router.
 */
export const TTSVendorElevenLabs: ITTSVendor<ElevenLabsServiceSettings, ElevenLabsAccess> = {
  id: 'elevenlabs',
  name: 'ElevenLabs',
  displayRank: 10, // sorts before OpenAI TTS (rank 20) in vendor listings
  location: 'cloud',
  brandColor: undefined,
  hasServerConfigKey: 'hasVoiceElevenLabs',
  capabilities: {
    streaming: true,
    voiceCloning: true,
    speedControl: false, // no speed parameter is forwarded by rpcSpeak below
    listVoices: true,
  },
  // Fresh, empty settings for a newly-created service
  initializeSetup(): ElevenLabsServiceSettings {
    return {
      elevenKey: '',
      elevenHost: '',
    };
  },
  // Valid when no key is entered (may rely on server-side config) or the key is plausibly long
  validateSetup(setup: ElevenLabsServiceSettings): boolean {
    return !setup.elevenKey || setup.elevenKey.trim().length >= 32;
  },
  // Settings map 1:1 onto the transport access object
  getTransportAccess(setup?: Partial<ElevenLabsServiceSettings>): ElevenLabsAccess {
    return {
      elevenKey: setup?.elevenKey,
      elevenHost: setup?.elevenHost,
    };
  },
  // Synthesize speech. Note: options.speed and options.format are not forwarded (unsupported here).
  // NOTE(review): access.elevenHost is not passed to the mutation — confirm whether the router
  // reads the host override elsewhere or it is silently dropped.
  async rpcSpeak(access: ElevenLabsAccess, options: TTSGenerationOptions): Promise<AsyncIterable<any>> {
    return apiStream.elevenlabs.speech.mutate({
      xiKey: access.elevenKey,
      voiceId: options.voiceId,
      text: options.text,
      nonEnglish: options.nonEnglish ?? false,
      audioStreaming: options.streaming ?? false,
      audioTurbo: options.turbo ?? false,
    });
  },
  // Fetch the account's voice list and normalize each entry to the unified TTSVoice shape.
  // NOTE(review): the `as any` cast suggests `listVoices` is missing from the typed client
  // router — add it to the router's type surface instead of casting.
  async rpcListVoices(access: ElevenLabsAccess): Promise<{ voices: TTSVoice[] }> {
    const result = await (apiStream as any).elevenlabs.listVoices.query({
      elevenKey: access.elevenKey,
    });
    return {
      voices: result.voices.map((v: any) => ({
        id: v.id,
        name: v.name,
        description: v.description || undefined,
        previewUrl: v.previewUrl || undefined,
        category: v.category,
      })),
    };
  },
};
+86
View File
@@ -0,0 +1,86 @@
import type { BackendCapabilities } from '~/modules/backend/store-backend-capabilities';
import { apiStream } from '~/common/util/trpc.client';
import type { ITTSVendor } from '../../ITTSVendor';
import type { TTSGenerationOptions, TTSVoice } from '../../tts.types';
// OpenAI TTS Service Settings
export interface OpenAITTSServiceSettings {
  /** OpenAI API key; validateSetup accepts empty (server-side key) or requires an 'sk-' prefix. */
  oaiKey?: string;
  /** Optional API host override. */
  oaiHost?: string;
  /** Optional OpenAI organization id. */
  oaiOrgId?: string;
}
// OpenAI TTS Access (for RPC calls)
// Same shape as the settings: getTransportAccess copies the fields verbatim.
export interface OpenAITTSAccess {
  oaiKey?: string;
  oaiHost?: string;
  oaiOrgId?: string;
}
// OpenAI TTS voices (fixed list)
// OpenAI exposes a small built-in voice catalog rather than a per-account list,
// so rpcListVoices returns this constant as-is.
export const OPENAI_TTS_VOICES: TTSVoice[] = [
  { id: 'alloy', name: 'Alloy', description: 'Neutral and balanced' },
  { id: 'echo', name: 'Echo', description: 'Clear and articulate' },
  { id: 'fable', name: 'Fable', description: 'Expressive and warm' },
  { id: 'onyx', name: 'Onyx', description: 'Deep and authoritative' },
  { id: 'nova', name: 'Nova', description: 'Friendly and conversational' },
  { id: 'shimmer', name: 'Shimmer', description: 'Soft and gentle' },
];
/**
 * OpenAI TTS vendor: synthesizes speech via the `tts.openai` tRPC router.
 */
export const TTSVendorOpenAI: ITTSVendor<OpenAITTSServiceSettings, OpenAITTSAccess> = {
  id: 'openai',
  name: 'OpenAI TTS',
  displayRank: 20, // sorts after ElevenLabs (rank 10) in vendor listings
  location: 'cloud',
  brandColor: '#10a37f',
  // Reuses the OpenAI *LLM* server-config flag: a server configured for OpenAI LLMs
  // is treated as also able to serve OpenAI TTS.
  hasServerConfigKey: 'hasLlmOpenAI',
  capabilities: {
    streaming: true,
    voiceCloning: false,
    speedControl: true, // options.speed is forwarded by rpcSpeak below
    listVoices: true,
  },
  // Fresh, empty settings for a newly-created service
  initializeSetup(): OpenAITTSServiceSettings {
    return {
      oaiKey: '',
      oaiHost: '',
      oaiOrgId: '',
    };
  },
  // Valid when no key is entered (may rely on server-side config) or the key has the usual prefix.
  // NOTE(review): keys issued by proxies/gateways may not start with 'sk-' — confirm this check
  // is not overly strict when oaiHost points at a non-OpenAI endpoint.
  validateSetup(setup: OpenAITTSServiceSettings): boolean {
    return !setup.oaiKey || setup.oaiKey.trim().startsWith('sk-');
  },
  // Settings map 1:1 onto the transport access object
  getTransportAccess(setup?: Partial<OpenAITTSServiceSettings>): OpenAITTSAccess {
    return {
      oaiKey: setup?.oaiKey,
      oaiHost: setup?.oaiHost,
      oaiOrgId: setup?.oaiOrgId,
    };
  },
  // Synthesize speech; the model is currently pinned to 'tts-1'
  async rpcSpeak(access: OpenAITTSAccess, options: TTSGenerationOptions): Promise<AsyncIterable<any>> {
    return apiStream.tts.openai.speech.mutate({
      access,
      text: options.text,
      voice: options.voiceId || 'alloy', // default voice when none is configured
      model: 'tts-1',
      speed: options.speed,
      format: options.format,
      streaming: options.streaming ?? false,
    });
  },
  // `access` is unused: the catalog is a client-side constant (no server round-trip)
  async rpcListVoices(access: OpenAITTSAccess): Promise<{ voices: TTSVoice[] }> {
    // OpenAI has a fixed set of voices
    return { voices: OPENAI_TTS_VOICES };
  },
};
+13 -2
View File
@@ -35,10 +35,21 @@ function selectRetryProfile(error: TRPCFetcherError | unknown): RetryProfile | n
return RETRY_PROFILES.network; // DNS, TCP, timeouts, ... doesn't connect
if (error.category === 'http' && error.httpStatus) {
// 429 Too Many Requests: distinguish quota errors (don't retry) from rate limits (retry)
if (error.httpStatus === 429) {
const isQuotaError = /quota|billing/i.test(error.message);
if (isQuotaError) {
if (AIX_DEBUG_SERVER_RETRY)
console.log(`[fetchers.retrier] Detected quota/billing error - will not retry`);
return null; // Don't retry quota/billing errors - user needs to upgrade plan
}
return RETRY_PROFILES.server; // Retry temporary rate limits
}
// retriable server errors
const retryCodes = [
429, // Too Many Requests
503, // Service Unavailable <- main one to retry
502, // Bad Gateway
503, // Service Unavailable
];
if (retryCodes.includes(error.httpStatus))
return RETRY_PROFILES.server;
+4
View File
@@ -8,6 +8,7 @@ import { llmAnthropicRouter } from '~/modules/llms/server/anthropic/anthropic.ro
import { llmGeminiRouter } from '~/modules/llms/server/gemini/gemini.router';
import { llmOllamaRouter } from '~/modules/llms/server/ollama/ollama.router';
import { llmOpenAIRouter } from '~/modules/llms/server/openai/openai.router';
import { openaiTTSRouter } from '~/modules/tts/server/openai-tts.router';
import { youtubeRouter } from '~/modules/youtube/youtube.router';
/**
@@ -22,6 +23,9 @@ export const appRouterEdge = createTRPCRouter({
llmGemini: llmGeminiRouter,
llmOllama: llmOllamaRouter,
llmOpenAI: llmOpenAIRouter,
tts: createTRPCRouter({
openai: openaiTTSRouter,
}),
youtube: youtubeRouter,
});