mirror of https://github.com/enricoros/big-AGI.git (synced 2026-05-10 21:50:14 -07:00)
Compare commits
70 Commits
| SHA1 |
|---|
| 9bac46ea75 |
| 2af4ee7dbe |
| 590fc0d021 |
| 746b0dad40 |
| b327da3ded |
| 7a818bdcd0 |
| c92ee2e22a |
| 632a4a565f |
| d712c275a0 |
| 1adff7481b |
| 393e19dda9 |
| 39c5c7c9ba |
| e64a5e59ef |
| 574c2cf0e3 |
| 1d3321b336 |
| de25e5822d |
| 6a904c9f37 |
| 30c3283572 |
| 10bba19079 |
| 713079f2f2 |
| 6e16e989ac |
| 4e89e0b1e4 |
| 6067c289ab |
| 32ebfea9cb |
| dec280d54d |
| 4823e97783 |
| 6a5685995f |
| 3b4d5691d7 |
| 45c09d021a |
| 8ef759fe0f |
| c06735fdd2 |
| cf4297a1af |
| 5d458d68bd |
| c3db077ae8 |
| 779b265b20 |
| 7d6d7e619b |
| 34caa16e39 |
| 976426dbd3 |
| b4d8e39d56 |
| 11c41e7381 |
| 358d8a54ff |
| 3c8fedce68 |
| 1744b5b9d0 |
| 0c15476dd2 |
| 94ef76c67e |
| bd5bf6f94f |
| 1fbf454c3c |
| 07b62fe5c1 |
| 7fbf6ee2e8 |
| ba66fc30c5 |
| 45b7ed3220 |
| 20f1c4c0ae |
| 97b6fc5e2b |
| 44d8c30187 |
| e3957bf08b |
| acfe0aba21 |
| 6247b5411b |
| 5cc0b0a011 |
| 1fed2fb18c |
| 8a0e7a4e3d |
| 29a784c6c6 |
| 409a3ee194 |
| 54caa3e01a |
| e1a723a39f |
| 463ea35d7c |
| f751c91c68 |
| ad24c8771a |
| 6f82e2c3ed |
| f4b39071f0 |
| 621c968f3f |
@@ -51,8 +51,7 @@ jobs:
       with:
         images: ${{ env.REGISTRY }}/${{ env.IMAGE_NAME }}
         tags: |
           type=raw,value=development,enable=${{ github.ref == 'refs/heads/main' }}
-          type=raw,value=stable,enable=${{ github.ref == 'refs/heads/main-stable' }}
+          type=raw,value=stable,enable=${{ github.ref == 'refs/heads/v1-stable' }}
           type=ref,event=tag # Use the tag name as a tag for tag builds
           type=semver,pattern={{version}} # Generate semantic versioning tags for tag builds
           type=sha # Just in case none of the above applies
@@ -11,19 +11,42 @@ Stay ahead of the curve with big-AGI. 🚀 Pros & Devs love big-AGI. 🤖

 [](https://big-agi.com)

+> 🚀 Big-AGI 2 is launching Q4 2024. Be the first to experience it before the public release.
+>
+> 👉 [Apply for Early Access](https://y2rjg0zillz.typeform.com/to/ZSADpr5u?utm_source=gh-stable&utm_medium=readme&utm_campaign=ea2)

 Or fork & run on Vercel

 [](https://vercel.com/new/clone?repository-url=https%3A%2F%2Fgithub.com%2Fenricoros%2Fbig-AGI&env=OPENAI_API_KEY&envDescription=Backend%20API%20keys%2C%20optional%20and%20may%20be%20overridden%20by%20the%20UI.&envLink=https%3A%2F%2Fgithub.com%2Fenricoros%2Fbig-AGI%2Fblob%2Fmain%2Fdocs%2Fenvironment-variables.md&project-name=big-AGI)

-## 👉 [roadmap](https://github.com/users/enricoros/projects/4/views/2) 👉 [installation](docs/installation.md) 👉 [documentation](docs/README.md)
+### New Version
+
+> Note: bigger better features (incl. Beam-2) are being cooked outside of `main`.
+
+This repository contains two main versions:
+
 [//]: # (big-AGI is an open book; see the **[ready-to-ship and future ideas](https://github.com/users/enricoros/projects/4/views/2)** in our open roadmap)
+- Big-AGI 2: next-generation, bringing the most advanced AI experience
+  - `v2-dev`: V2 development branch, the exciting one, future default
+- Big-AGI Stable: as deployed on big-agi.com
+  - `v1-stable`: Current stable version & Docker 'latest' tag

-### What's New in 1.16.1 · May 13, 2024 (minor release, models support)
+Note: After the V2 release in Q4, `v2-dev` will become the default branch and `v1-dev` will reach EOL.

-- Support for the new OpenAI GPT-4o 2024-05-13 model
+### Quick links: 👉 [roadmap](https://github.com/users/enricoros/projects/4/views/2) 👉 [installation](docs/installation.md) 👉 [documentation](docs/README.md)

+### What's New in 1.16.1...1.16.9 · Jan 21, 2025 (patch releases)
+
+- 1.16.9: Docker Gemini fix (R1 models are supported in Big-AGI 2)
+- 1.16.8: OpenAI ChatGPT-4o Latest (o1 models are supported in Big-AGI 2)
+- 1.16.7: OpenAI support for GPT-4o 2024-08-06
+- 1.16.6: Groq support for Llama 3.1 models
+- 1.16.5: GPT-4o Mini support
+- 1.16.4: 8192 tokens support for Claude 3.5 Sonnet
+- 1.16.3: Anthropic Claude 3.5 Sonnet model support
+- 1.16.2: Improve web downloads, as text, markdown, or HTML
+- 1.16.2: Proper support for Gemini models
+- 1.16.2: Added the latest Mistral model
+- 1.16.2: Tokenizer support for gpt-4o
+- 1.16.2: Updates to Beam

 ### What's New in 1.16.0 · May 9, 2024 · Crystal Clear
+14 -2
@@ -10,9 +10,21 @@ by release.

 - milestone: [1.17.0](https://github.com/enricoros/big-agi/milestone/17)
 - work in progress: [big-AGI open roadmap](https://github.com/users/enricoros/projects/4/views/2), [help here](https://github.com/users/enricoros/projects/4/views/4)

-### What's New in 1.16.1 · May 13, 2024 (minor release, models support)
+### What's New in 1.16.1...1.16.9 · Jan 21, 2025 (patch releases)

-- Support for the new OpenAI GPT-4o 2024-05-13 model
+- 1.16.9: Docker Gemini fix (R1 models are supported in Big-AGI 2)
+- 1.16.8: OpenAI ChatGPT-4o Latest (o1 models are supported in Big-AGI 2)
+- 1.16.7: OpenAI support for GPT-4o 2024-08-06
+- 1.16.6: Groq support for Llama 3.1 models
+- 1.16.5: GPT-4o Mini support
+- 1.16.4: 8192 tokens support for Claude 3.5 Sonnet
+- 1.16.3: Anthropic Claude 3.5 Sonnet model support
+- 1.16.2: Improve web downloads, as text, markdown, or HTML
+- 1.16.2: Proper support for Gemini models
+- 1.16.2: Added the latest Mistral model
+- 1.16.2: Tokenizer support for gpt-4o
+- 1.16.2: Updates to Beam
+- 1.16.1: Support for the new OpenAI GPT-4o 2024-05-13 model

 ### What's New in 1.16.0 · May 9, 2024 · Crystal Clear
Generated +186 -5
@@ -29,6 +29,7 @@
       "@vercel/analytics": "^1.2.2",
       "@vercel/speed-insights": "^1.0.10",
       "browser-fs-access": "^0.35.0",
+      "cheerio": "^1.0.0-rc.12",
       "eventsource-parser": "^1.1.2",
       "idb-keyval": "^6.2.1",
       "next": "~14.1.4",
@@ -51,7 +52,8 @@
       "sharp": "^0.33.3",
       "superjson": "^2.2.1",
       "tesseract.js": "^5.1.0",
-      "tiktoken": "^1.0.14",
+      "tiktoken": "^1.0.15",
+      "turndown": "^7.2.0",
       "uuid": "^9.0.1",
       "zod": "^3.23.8",
       "zustand": "^4.5.2"
@@ -68,6 +70,7 @@
       "@types/react-dom": "^18.3.0",
       "@types/react-katex": "^3.0.4",
       "@types/react-timeago": "^4.1.7",
+      "@types/turndown": "^5.0.4",
       "@types/uuid": "^9.0.8",
       "eslint": "^8.57.0",
       "eslint-config-next": "^14.2.3",
@@ -76,7 +79,7 @@
       "typescript": "^5.4.5"
     },
     "engines": {
-      "node": "^20.0.0 || ^18.0.0"
+      "node": "^22.0.0 || ^20.0.0 || ^18.0.0"
     }
   },
   "node_modules/@babel/code-frame": {
@@ -1024,6 +1027,11 @@
         "node-pre-gyp": "bin/node-pre-gyp"
       }
     },
+    "node_modules/@mixmark-io/domino": {
+      "version": "2.2.0",
+      "resolved": "https://registry.npmjs.org/@mixmark-io/domino/-/domino-2.2.0.tgz",
+      "integrity": "sha512-Y28PR25bHXUg88kCV7nivXrP2Nj2RueZ3/l/jdx6J9f8J4nsEGcgX0Qe6lt7Pa+J79+kPiJU3LguR6O/6zrLOw=="
+    },
     "node_modules/@mui/base": {
       "version": "5.0.0-beta.42",
       "resolved": "https://registry.npmjs.org/@mui/base/-/base-5.0.0-beta.42.tgz",
@@ -2082,6 +2090,12 @@
         "@types/react": "*"
       }
     },
+    "node_modules/@types/turndown": {
+      "version": "5.0.4",
+      "resolved": "https://registry.npmjs.org/@types/turndown/-/turndown-5.0.4.tgz",
+      "integrity": "sha512-28GI33lCCkU4SGH1GvjDhFgOVr+Tym4PXGBIU1buJUa6xQolniPArtUT+kv42RR2N9MsMLInkr904Aq+ESHBJg==",
+      "dev": true
+    },
     "node_modules/@types/unist": {
       "version": "3.0.2",
       "resolved": "https://registry.npmjs.org/@types/unist/-/unist-3.0.2.tgz",
@@ -2663,6 +2677,11 @@
      "resolved": "https://registry.npmjs.org/bmp-js/-/bmp-js-0.1.0.tgz",
      "integrity": "sha512-vHdS19CnY3hwiNdkaqk93DvjVLfbEcI8mys4UjuWrlX1haDmroo8o4xCzh4wD6DGV6HxRCyauwhHRqMTfERtjw=="
     },
+    "node_modules/boolbase": {
+      "version": "1.0.0",
+      "resolved": "https://registry.npmjs.org/boolbase/-/boolbase-1.0.0.tgz",
+      "integrity": "sha512-JZOSA7Mo9sNGB8+UjSgzdLtokWAky1zbztM3WRLCbZ70/3cTANmQmOdR7y2g+J0e2WXywy1yS468tY+IruqEww=="
+    },
     "node_modules/brace-expansion": {
       "version": "1.1.11",
       "resolved": "https://registry.npmjs.org/brace-expansion/-/brace-expansion-1.1.11.tgz",
@@ -2828,6 +2847,42 @@
         "url": "https://github.com/sponsors/wooorm"
       }
     },
+    "node_modules/cheerio": {
+      "version": "1.0.0-rc.12",
+      "resolved": "https://registry.npmjs.org/cheerio/-/cheerio-1.0.0-rc.12.tgz",
+      "integrity": "sha512-VqR8m68vM46BNnuZ5NtnGBKIE/DfN0cRIzg9n40EIq9NOv90ayxLBXA8fXC5gquFRGJSTRqBq25Jt2ECLR431Q==",
+      "dependencies": {
+        "cheerio-select": "^2.1.0",
+        "dom-serializer": "^2.0.0",
+        "domhandler": "^5.0.3",
+        "domutils": "^3.0.1",
+        "htmlparser2": "^8.0.1",
+        "parse5": "^7.0.0",
+        "parse5-htmlparser2-tree-adapter": "^7.0.0"
+      },
+      "engines": {
+        "node": ">= 6"
+      },
+      "funding": {
+        "url": "https://github.com/cheeriojs/cheerio?sponsor=1"
+      }
+    },
+    "node_modules/cheerio-select": {
+      "version": "2.1.0",
+      "resolved": "https://registry.npmjs.org/cheerio-select/-/cheerio-select-2.1.0.tgz",
+      "integrity": "sha512-9v9kG0LvzrlcungtnJtpGNxY+fzECQKhK4EGJX2vByejiMX84MFNQw4UxPJl3bFbTMw+Dfs37XaIkCwTZfLh4g==",
+      "dependencies": {
+        "boolbase": "^1.0.0",
+        "css-select": "^5.1.0",
+        "css-what": "^6.1.0",
+        "domelementtype": "^2.3.0",
+        "domhandler": "^5.0.3",
+        "domutils": "^3.0.1"
+      },
+      "funding": {
+        "url": "https://github.com/sponsors/fb55"
+      }
+    },
     "node_modules/chownr": {
       "version": "2.0.0",
       "resolved": "https://registry.npmjs.org/chownr/-/chownr-2.0.0.tgz",
@@ -2986,6 +3041,32 @@
         "tiny-invariant": "^1.0.6"
       }
     },
+    "node_modules/css-select": {
+      "version": "5.1.0",
+      "resolved": "https://registry.npmjs.org/css-select/-/css-select-5.1.0.tgz",
+      "integrity": "sha512-nwoRF1rvRRnnCqqY7updORDsuqKzqYJ28+oSMaJMMgOauh3fvwHqMS7EZpIPqK8GL+g9mKxF1vP/ZjSeNjEVHg==",
+      "dependencies": {
+        "boolbase": "^1.0.0",
+        "css-what": "^6.1.0",
+        "domhandler": "^5.0.2",
+        "domutils": "^3.0.1",
+        "nth-check": "^2.0.1"
+      },
+      "funding": {
+        "url": "https://github.com/sponsors/fb55"
+      }
+    },
+    "node_modules/css-what": {
+      "version": "6.1.0",
+      "resolved": "https://registry.npmjs.org/css-what/-/css-what-6.1.0.tgz",
+      "integrity": "sha512-HTUrgRJ7r4dsZKU6GjmpfRK1O76h97Z8MfS1G0FozR+oF2kG6Vfe8JE6zwrkbxigziPHinCJ+gCPjA9EaBDtRw==",
+      "engines": {
+        "node": ">= 6"
+      },
+      "funding": {
+        "url": "https://github.com/sponsors/fb55"
+      }
+    },
     "node_modules/csstype": {
       "version": "3.1.3",
       "resolved": "https://registry.npmjs.org/csstype/-/csstype-3.1.3.tgz",
@@ -3214,6 +3295,57 @@
         "csstype": "^3.0.2"
       }
     },
+    "node_modules/dom-serializer": {
+      "version": "2.0.0",
+      "resolved": "https://registry.npmjs.org/dom-serializer/-/dom-serializer-2.0.0.tgz",
+      "integrity": "sha512-wIkAryiqt/nV5EQKqQpo3SToSOV9J0DnbJqwK7Wv/Trc92zIAYZ4FlMu+JPFW1DfGFt81ZTCGgDEabffXeLyJg==",
+      "dependencies": {
+        "domelementtype": "^2.3.0",
+        "domhandler": "^5.0.2",
+        "entities": "^4.2.0"
+      },
+      "funding": {
+        "url": "https://github.com/cheeriojs/dom-serializer?sponsor=1"
+      }
+    },
+    "node_modules/domelementtype": {
+      "version": "2.3.0",
+      "resolved": "https://registry.npmjs.org/domelementtype/-/domelementtype-2.3.0.tgz",
+      "integrity": "sha512-OLETBj6w0OsagBwdXnPdN0cnMfF9opN69co+7ZrbfPGrdpPVNBUj02spi6B1N7wChLQiPn4CSH/zJvXw56gmHw==",
+      "funding": [
+        {
+          "type": "github",
+          "url": "https://github.com/sponsors/fb55"
+        }
+      ]
+    },
+    "node_modules/domhandler": {
+      "version": "5.0.3",
+      "resolved": "https://registry.npmjs.org/domhandler/-/domhandler-5.0.3.tgz",
+      "integrity": "sha512-cgwlv/1iFQiFnU96XXgROh8xTeetsnJiDsTc7TYCLFd9+/WNkIqPTxiM/8pSd8VIrhXGTf1Ny1q1hquVqDJB5w==",
+      "dependencies": {
+        "domelementtype": "^2.3.0"
+      },
+      "engines": {
+        "node": ">= 4"
+      },
+      "funding": {
+        "url": "https://github.com/fb55/domhandler?sponsor=1"
+      }
+    },
+    "node_modules/domutils": {
+      "version": "3.1.0",
+      "resolved": "https://registry.npmjs.org/domutils/-/domutils-3.1.0.tgz",
+      "integrity": "sha512-H78uMmQtI2AhgDJjWeQmHwJJ2bLPD3GMmO7Zja/ZZh84wkm+4ut+IUnUdRa8uCGX88DiVx1j6FRe1XfxEgjEZA==",
+      "dependencies": {
+        "dom-serializer": "^2.0.0",
+        "domelementtype": "^2.3.0",
+        "domhandler": "^5.0.3"
+      },
+      "funding": {
+        "url": "https://github.com/fb55/domutils?sponsor=1"
+      }
+    },
     "node_modules/duplexer": {
       "version": "0.1.2",
       "resolved": "https://registry.npmjs.org/duplexer/-/duplexer-0.1.2.tgz",
@@ -4660,6 +4792,24 @@
         "url": "https://opencollective.com/unified"
       }
     },
+    "node_modules/htmlparser2": {
+      "version": "8.0.2",
+      "resolved": "https://registry.npmjs.org/htmlparser2/-/htmlparser2-8.0.2.tgz",
+      "integrity": "sha512-GYdjWKDkbRLkZ5geuHs5NY1puJ+PXwP7+fHPRz06Eirsb9ugf6d8kkXav6ADhcODhFFPMIXyxkxSuMf3D6NCFA==",
+      "funding": [
+        "https://github.com/fb55/htmlparser2?sponsor=1",
+        {
+          "type": "github",
+          "url": "https://github.com/sponsors/fb55"
+        }
+      ],
+      "dependencies": {
+        "domelementtype": "^2.3.0",
+        "domhandler": "^5.0.3",
+        "domutils": "^3.0.1",
+        "entities": "^4.4.0"
+      }
+    },
     "node_modules/https-proxy-agent": {
       "version": "5.0.1",
       "resolved": "https://registry.npmjs.org/https-proxy-agent/-/https-proxy-agent-5.0.1.tgz",
@@ -6546,6 +6696,17 @@
      "resolved": "https://registry.npmjs.org/nprogress/-/nprogress-0.2.0.tgz",
      "integrity": "sha512-I19aIingLgR1fmhftnbWWO3dXc0hSxqHQHQb3H8m+K3TnEn/iSeTZZOyvKXWqQESMwuUVnatlCnZdLBZZt2VSA=="
     },
+    "node_modules/nth-check": {
+      "version": "2.1.1",
+      "resolved": "https://registry.npmjs.org/nth-check/-/nth-check-2.1.1.tgz",
+      "integrity": "sha512-lqjrjmaOoAnWfMmBPL+XNnynZh2+swxiX3WUE0s4yEHI6m+AwrK2UZOimIRl3X/4QctVqS8AiZjFqyOGrMXb/w==",
+      "dependencies": {
+        "boolbase": "^1.0.0"
+      },
+      "funding": {
+        "url": "https://github.com/fb55/nth-check?sponsor=1"
+      }
+    },
     "node_modules/object-assign": {
       "version": "4.1.1",
       "resolved": "https://registry.npmjs.org/object-assign/-/object-assign-4.1.1.tgz",
@@ -6805,6 +6966,18 @@
         "url": "https://github.com/inikulin/parse5?sponsor=1"
       }
     },
+    "node_modules/parse5-htmlparser2-tree-adapter": {
+      "version": "7.0.0",
+      "resolved": "https://registry.npmjs.org/parse5-htmlparser2-tree-adapter/-/parse5-htmlparser2-tree-adapter-7.0.0.tgz",
+      "integrity": "sha512-B77tOZrqqfUfnVcOrUvfdLbz4pu4RopLD/4vmu3HUPswwTA8OH0EMW9BlWR2B0RCoiZRAHEUu7IxeP1Pd1UU+g==",
+      "dependencies": {
+        "domhandler": "^5.0.2",
+        "parse5": "^7.0.0"
+      },
+      "funding": {
+        "url": "https://github.com/inikulin/parse5?sponsor=1"
+      }
+    },
     "node_modules/path-exists": {
       "version": "4.0.0",
       "resolved": "https://registry.npmjs.org/path-exists/-/path-exists-4.0.0.tgz",
@@ -8180,9 +8353,9 @@
       }
     },
     "node_modules/tiktoken": {
-      "version": "1.0.14",
-      "resolved": "https://registry.npmjs.org/tiktoken/-/tiktoken-1.0.14.tgz",
-      "integrity": "sha512-g5zd5r/DoH8Kw0fiYbYpVhb6WO8BHO1unXqmBBWKwoT17HwSounnDtMDFUKm2Pko8U47sjQarOe+9aUrnqmmTg=="
+      "version": "1.0.15",
+      "resolved": "https://registry.npmjs.org/tiktoken/-/tiktoken-1.0.15.tgz",
+      "integrity": "sha512-sCsrq/vMWUSEW29CJLNmPvWxlVp7yh2tlkAjpJltIKqp5CKf98ZNpdeHRmAlPVFlGEbswDc6SmI8vz64W/qErw=="
     },
     "node_modules/tiny-invariant": {
       "version": "1.3.3",
@@ -8269,6 +8442,14 @@
      "resolved": "https://registry.npmjs.org/tslib/-/tslib-2.6.2.tgz",
      "integrity": "sha512-AEYxH93jGFPn/a2iVAwW87VuUIkR1FVUKB77NwMF7nBTDkDrrT/Hpt/IrCJ0QXhW27jTBDcf5ZY7w6RiqTMw2Q=="
     },
+    "node_modules/turndown": {
+      "version": "7.2.0",
+      "resolved": "https://registry.npmjs.org/turndown/-/turndown-7.2.0.tgz",
+      "integrity": "sha512-eCZGBN4nNNqM9Owkv9HAtWRYfLA4h909E/WGAWWBpmB275ehNhZyk87/Tpvjbp0jjNl9XwCsbe6bm6CqFsgD+A==",
+      "dependencies": {
+        "@mixmark-io/domino": "^2.2.0"
+      }
+    },
     "node_modules/type-check": {
       "version": "0.4.0",
       "resolved": "https://registry.npmjs.org/type-check/-/type-check-0.4.0.tgz",
+5 -2
@@ -38,6 +38,7 @@
     "@vercel/analytics": "^1.2.2",
     "@vercel/speed-insights": "^1.0.10",
     "browser-fs-access": "^0.35.0",
+    "cheerio": "^1.0.0-rc.12",
     "eventsource-parser": "^1.1.2",
     "idb-keyval": "^6.2.1",
     "next": "~14.1.4",
@@ -60,7 +61,8 @@
     "sharp": "^0.33.3",
     "superjson": "^2.2.1",
     "tesseract.js": "^5.1.0",
-    "tiktoken": "^1.0.14",
+    "tiktoken": "^1.0.15",
+    "turndown": "^7.2.0",
     "uuid": "^9.0.1",
     "zod": "^3.23.8",
     "zustand": "^4.5.2"
@@ -77,6 +79,7 @@
     "@types/react-dom": "^18.3.0",
     "@types/react-katex": "^3.0.4",
     "@types/react-timeago": "^4.1.7",
+    "@types/turndown": "^5.0.4",
     "@types/uuid": "^9.0.8",
     "eslint": "^8.57.0",
     "eslint-config-next": "^14.2.3",
@@ -85,6 +88,6 @@
     "typescript": "^5.4.5"
   },
   "engines": {
-    "node": "^20.0.0 || ^18.0.0"
+    "node": "^22.0.0 || ^20.0.0 || ^18.0.0"
   }
 }
@@ -77,9 +77,12 @@ function AppShareTarget() {
     setIsDownloading(true);
     callBrowseFetchPage(intentURL)
       .then(page => {
-        if (page.stopReason !== 'error')
-          queueComposerTextAndLaunchApp('\n\n```' + intentURL + '\n' + page.content + '\n```\n');
-        else
+        if (page.stopReason !== 'error') {
+          let pageContent = page.content.markdown || page.content.text || page.content.html || '';
+          if (pageContent)
+            pageContent = '\n\n```' + intentURL + '\n' + pageContent + '\n```\n';
+          queueComposerTextAndLaunchApp(pageContent);
+        } else
          setErrorMessage('Could not read any data' + (page.error ? ': ' + page.error : ''));
       })
       .catch(error => setErrorMessage(error?.message || error || 'Unknown error'))
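Across this compare, `callBrowseFetchPage` stops returning `page.content` as a plain string and starts returning a structured object with optional `markdown`, `text`, and `html` fields (presumably tied to the newly added cheerio/turndown dependencies); every call site now picks the richest available format. A minimal sketch of that shape and the selection pattern follows; the `BrowsePageContent` name and `pickPageText` helper are illustrative, not taken from the repository:

```typescript
// Hypothetical shape implied by the call sites in this compare (names are assumptions).
interface BrowsePageContent {
  markdown?: string; // preferred, when HTML-to-Markdown conversion succeeded
  text?: string;     // plain-text fallback
  html?: string;     // raw HTML, last resort
}

// Illustrative helper mirroring the `markdown || text || html || ''` chain
// repeated at each call site below.
function pickPageText(content: BrowsePageContent): string {
  return content.markdown || content.text || content.html || '';
}
```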
@@ -1,5 +1,5 @@
 import * as React from 'react';
-import { shallow } from 'zustand/shallow';
+import { useShallow } from 'zustand/react/shallow';

 import { Box, Card, ListDivider, ListItemDecorator, MenuItem, Switch, Typography } from '@mui/joy';
 import ArrowBackIcon from '@mui/icons-material/ArrowBack';
@@ -99,7 +99,7 @@ export function Telephone(props: {

   // external state
   const { chatLLMId, chatLLMDropdown } = useChatLLMDropdown();
-  const { chatTitle, reMessages } = useChatStore(state => {
+  const { chatTitle, reMessages } = useChatStore(useShallow(state => {
     const conversation = props.callIntent.conversationId
       ? state.conversations.find(conversation => conversation.id === props.callIntent.conversationId) ?? null
       : null;
@@ -107,7 +107,7 @@ export function Telephone(props: {
       chatTitle: conversation ? conversationTitle(conversation) : null,
       reMessages: conversation ? conversation.messages : null,
     };
-  }, shallow);
+  }));
   const persona = SystemPurposes[props.callIntent.personaId as SystemPurposeId] ?? undefined;
   const personaCallStarters = persona?.call?.starters ?? undefined;
   const personaVoiceId = overridePersonaVoice ? undefined : (persona?.voices?.elevenLabs?.voiceId ?? undefined);
@@ -225,7 +225,7 @@ export function Telephone(props: {
     let finalText = '';
     let error: any | null = null;
     setPersonaTextInterim('💭...');
-    llmStreamingChatGenerate(chatLLMId, callPrompt, null, null, responseAbortController.current.signal, ({ textSoFar }) => {
+    llmStreamingChatGenerate(chatLLMId, callPrompt, 'call', callMessages[0].id, null, null, responseAbortController.current.signal, ({ textSoFar }) => {
      const text = textSoFar?.trim();
      if (text) {
        finalText = text;
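For reference, the `shallow` to `useShallow` swap above follows zustand v4's migration away from the deprecated equality-function overload: instead of passing `shallow` as a second argument, the selector is wrapped. A minimal sketch with a toy store (names are illustrative, not from the repository):

```typescript
import { create } from 'zustand';
import { useShallow } from 'zustand/react/shallow';

interface BearState { bears: number; fish: number; honey: number; }
const useBearStore = create<BearState>(() => ({ bears: 2, fish: 5, honey: 1 }));

function useBearCounts() {
  // Old (deprecated overload): useBearStore(s => ({ ... }), shallow);
  // New: wrap the selector; the picked object is compared shallowly, so a
  // fresh-but-equal object does not trigger a re-render.
  return useBearStore(useShallow(s => ({ bears: s.bears, fish: s.fish })));
}
```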
@@ -277,7 +277,7 @@ export function AppChat() {
     const conversation = getConversation(conversationId);
     if (!conversation)
       return;
-    const imaginedPrompt = await imaginePromptFromText(messageText) || 'An error sign.';
+    const imaginedPrompt = await imaginePromptFromText(messageText, conversationId) || 'An error sign.';
     await handleExecuteAndOutcome('generate-image', conversationId, [
       ...conversation.messages,
       createDMessage('user', imaginedPrompt),

@@ -310,7 +310,7 @@ function ChatDrawer(props: {
       bottomBarBasis={filteredChatsBarBasis}
       onConversationActivate={handleConversationActivate}
       onConversationBranch={onConversationBranch}
-      onConversationDelete={handleConversationDeleteNoConfirmation}
+      onConversationDeleteNoConfirmation={handleConversationDeleteNoConfirmation}
       onConversationExport={onConversationsExportDialog}
       onConversationFolderChange={handleConversationFolderChange}
     />
@@ -42,7 +42,7 @@ export const ChatDrawerItemMemo = React.memo(ChatDrawerItem, (prev, next) =>
   prev.bottomBarBasis === next.bottomBarBasis &&
   prev.onConversationActivate === next.onConversationActivate &&
   prev.onConversationBranch === next.onConversationBranch &&
-  prev.onConversationDelete === next.onConversationDelete &&
+  prev.onConversationDeleteNoConfirmation === next.onConversationDeleteNoConfirmation &&
   prev.onConversationExport === next.onConversationExport &&
   prev.onConversationFolderChange === next.onConversationFolderChange,
 );
@@ -76,7 +76,7 @@ function ChatDrawerItem(props: {
   bottomBarBasis: number,
   onConversationActivate: (conversationId: DConversationId, closeMenu: boolean) => void,
   onConversationBranch: (conversationId: DConversationId, messageId: string | null) => void,
-  onConversationDelete: (conversationId: DConversationId) => void,
+  onConversationDeleteNoConfirmation: (conversationId: DConversationId) => void,
   onConversationExport: (conversationId: DConversationId, exportAll: boolean) => void,
   onConversationFolderChange: (folderChangeRequest: FolderChangeRequest) => void,
 }) {
@@ -155,7 +155,16 @@ function ChatDrawerItem(props: {

   // Delete

-  const handleDeleteButtonShow = React.useCallback(() => setDeleteArmed(true), []);
+  const { onConversationDeleteNoConfirmation } = props;
+  const handleDeleteButtonShow = React.useCallback((event: React.MouseEvent) => {
+    // special case: if 'Shift' is pressed, delete immediately
+    if (event.shiftKey) {
+      event.stopPropagation();
+      onConversationDeleteNoConfirmation(conversationId);
+      return;
+    }
+    setDeleteArmed(true);
+  }, [conversationId, onConversationDeleteNoConfirmation]);

   const handleDeleteButtonHide = React.useCallback(() => setDeleteArmed(false), []);

@@ -163,9 +172,9 @@ function ChatDrawerItem(props: {
     if (deleteArmed) {
       setDeleteArmed(false);
       event.stopPropagation();
-      props.onConversationDelete(conversationId);
+      onConversationDeleteNoConfirmation(conversationId);
     }
-  }, [conversationId, deleteArmed, props]);
+  }, [conversationId, deleteArmed, onConversationDeleteNoConfirmation]);


   const textSymbol = SystemPurposes[systemPurposeId]?.symbol || '❓';
@@ -58,16 +58,12 @@ export async function attachmentLoadInputAsync(source: Readonly<AttachmentSource
       edit({ label: source.refUrl, ref: source.refUrl });
       try {
         const page = await callBrowseFetchPage(source.url);
-        if (page.content) {
-          edit({
-            input: {
-              mimeType: 'text/plain',
-              data: page.content,
-              dataSize: page.content.length,
-            },
-          });
-        } else
-          edit({ inputError: 'No content found at this link' });
+        edit(
+          page.content.markdown ? { input: { mimeType: 'text/markdown', data: page.content.markdown, dataSize: page.content.markdown.length } }
+            : page.content.text ? { input: { mimeType: 'text/plain', data: page.content.text, dataSize: page.content.text.length } }
+              : page.content.html ? { input: { mimeType: 'text/html', data: page.content.html, dataSize: page.content.html.length } }
+                : { inputError: 'No content found at this link' },
+        );
       } catch (error: any) {
         edit({ inputError: `Issue downloading page: ${error?.message || (typeof error === 'string' ? error : JSON.stringify(error))}` });
       }
@@ -280,6 +280,7 @@ export function ChatMessage(props: {
   const wasEdited = !!messageUpdated;

   const textSel = selText ? selText : messageText;
+  // WARNING: if you get an issue here, you're downgrading from the new Big-AGI 2 data format to 1.x.
   const isSpecialT2I = textSel.startsWith('https://images.prodia.xyz/') || textSel.startsWith('/draw ') || textSel.startsWith('/imagine ') || textSel.startsWith('/img ');
   const couldDiagram = textSel.length >= 100 && !isSpecialT2I;
   const couldImagine = textSel.length >= 3 && !isSpecialT2I;
@@ -15,7 +15,8 @@ export const runBrowseGetPageUpdatingState = async (cHandler: ConversationHandle

   try {
     const page = await callBrowseFetchPage(url);
-    cHandler.messageEdit(assistantMessageId, { text: page.content || 'Issue: page load did not produce an answer: no text found', typing: false }, true);
+    const pageContent = page.content.markdown || page.content.text || page.content.html || 'Issue: page load did not produce an answer: no text found';
+    cHandler.messageEdit(assistantMessageId, { text: pageContent, typing: false }, true);
     return true;
   } catch (error: any) {
     console.error(error);
@@ -2,7 +2,7 @@ import type { DLLMId } from '~/modules/llms/store-llms';
 import type { StreamingClientUpdate } from '~/modules/llms/vendors/unifiedStreamingClient';
 import { autoSuggestions } from '~/modules/aifn/autosuggestions/autoSuggestions';
 import { conversationAutoTitle } from '~/modules/aifn/autotitle/autoTitle';
-import { llmStreamingChatGenerate, VChatMessageIn } from '~/modules/llms/llm.client';
+import { llmStreamingChatGenerate, VChatContextRef, VChatMessageIn, VChatStreamContextName } from '~/modules/llms/llm.client';
 import { speakText } from '~/modules/elevenlabs/elevenlabs.client';

 import type { DMessage } from '~/common/state/store-chats';
@@ -34,6 +34,8 @@ export async function runAssistantUpdatingState(conversationId: string, history:
   const messageStatus = await streamAssistantMessage(
     assistantLlmId,
     history.map((m): VChatMessageIn => ({ role: m.role, content: m.text })),
+    'conversation',
+    conversationId,
     parallelViewCount,
     autoSpeak,
     (update) => cHandler.messageEdit(assistantMessageId, update, false),
@@ -61,6 +63,8 @@ type StreamMessageStatus = { outcome: StreamMessageOutcome, errorMessage?: strin
 export async function streamAssistantMessage(
   llmId: DLLMId,
   messagesHistory: VChatMessageIn[],
+  contextName: VChatStreamContextName,
+  contextRef: VChatContextRef,
   throttleUnits: number, // 0: disable, 1: default throttle (12Hz), 2+ reduce the message frequency with the square root
   autoSpeak: ChatAutoSpeakType,
   editMessage: (update: Partial<DMessage>) => void,
@@ -92,7 +96,7 @@ export async function streamAssistantMessage(
   const incrementalAnswer: Partial<DMessage> = { text: '' };

   try {
-    await llmStreamingChatGenerate(llmId, messagesHistory, null, null, abortSignal, (update: StreamingClientUpdate) => {
+    await llmStreamingChatGenerate(llmId, messagesHistory, contextName, contextRef, null, null, abortSignal, (update: StreamingClientUpdate) => {
       const textSoFar = update.textSoFar;

       // grow the incremental message
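These call sites thread two new leading context arguments into `llmStreamingChatGenerate`. A sketch of the widened signature as implied by the calls and imports in this compare; the parameter names and the simplified stand-in types below are assumptions, with only the argument order and the `VChatStreamContextName`/`VChatContextRef` type names taken from the diff itself:

```typescript
type DLLMId = string; // simplified stand-in for the repo type
type VChatMessageIn = { role: 'system' | 'user' | 'assistant', content: string };
type VChatStreamContextName = // values observed at call sites in this compare
  | 'conversation' | 'call' | 'ai-diagram' | 'ai-flattener' | 'persona-extract';
type VChatContextRef = string | null; // conversation / message / chain IDs are passed here

declare function llmStreamingChatGenerate(
  llmId: DLLMId,
  messages: VChatMessageIn[],
  contextName: VChatStreamContextName, // new: which feature started the stream
  contextRef: VChatContextRef,         // new: which entity the stream belongs to
  functions: null,                     // unchanged trailing arguments
  forceFunctionName: null,
  abortSignal: AbortSignal,
  onUpdate: (update: { textSoFar?: string }) => void,
): Promise<void>;
```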
@@ -14,6 +14,7 @@ import { capitalizeFirstLetter } from '~/common/util/textUtils';

 import { NewsItems } from './news.data';
 import { beamNewsCallout } from './beam.data';
+import { bigAgi2NewsCallout } from './bigAgi2.data';


 // number of news items to show by default, before the expander
@@ -110,6 +111,13 @@ export function AppNews() {
     const addPadding = false; //!firstCard; // || showExpander;
     return <React.Fragment key={idx}>

+      {/* Inject the Big-AGI 2.0 item here */}
+      {idx === 0 && (
+        <Box sx={{ mb: 3 }}>
+          {bigAgi2NewsCallout}
+        </Box>
+      )}
+
       {/* Inject the Beam item here */}
       {idx === 2 && (
         <Box sx={{ mb: 3 }}>

@@ -2,7 +2,6 @@ import * as React from 'react';

 import { Button, Card, CardContent, Grid, Typography } from '@mui/joy';
 import LaunchIcon from '@mui/icons-material/Launch';
-import ThumbUpRoundedIcon from '@mui/icons-material/ThumbUpRounded';

 import { Link } from '~/common/components/Link';
@@ -0,0 +1,40 @@
+import * as React from 'react';
+
+import { Button, Card, CardContent, Grid, Typography } from '@mui/joy';
+import AccessTimeIcon from '@mui/icons-material/AccessTime';
+import LaunchIcon from '@mui/icons-material/Launch';
+
+import { Link } from '~/common/components/Link';
+
+
+const bigAgi2SurveyUrl = 'https://y2rjg0zillz.typeform.com/to/ZSADpr5u?utm_source=gh-stable&utm_medium=news&utm_campaign=ea2';
+
+export const bigAgi2NewsCallout =
+  <Card variant='solid' invertedColors>
+    <CardContent sx={{ gap: 2 }}>
+      <Typography level='title-lg'>
+        Big-AGI 2.0 - In Development
+      </Typography>
+      <Typography level='body-sm'>
+        We're building the next version of Big-AGI with your needs in mind. New features, better performance, enhanced AI interactions. Help us shape it.
+      </Typography>
+      <Grid container spacing={1}>
+        <Grid xs={12} sm={7}>
+          <Button
+            fullWidth variant='soft' color='primary' endDecorator={<LaunchIcon />}
+            component={Link} href={bigAgi2SurveyUrl} noLinkStyle target='_blank'
+          >
+            Apply for Early Access
+          </Button>
+        </Grid>
+        <Grid xs={12} sm={5} sx={{ display: 'flex', flexAlign: 'center', justifyContent: 'center' }}>
+          <Button
+            fullWidth variant='outlined' color='primary' startDecorator={<AccessTimeIcon />}
+            disabled
+          >
+            Coming Fall 2024
+          </Button>
+        </Grid>
+      </Grid>
+    </CardContent>
+  </Card>;
@@ -52,17 +52,19 @@ interface NewsItem {
 // news and feature surfaces
 export const NewsItems: NewsItem[] = [
   /*{
-    versionCode: '1.16.0',
+    versionCode: '1.17.0',
     items: [
-      Screen Capture (when removed from labs)
       Auto-Merge
       Draw
       ...
+      Screen Capture (when removed from labs)
     ]
   }*/
   {
-    versionCode: '1.16.1',
+    versionCode: '1.16.9',
     versionName: 'Crystal Clear',
-    versionDate: new Date('2024-05-13T19:00:00Z'),
+    versionDate: new Date('2024-06-07T05:00:00Z'),
+    // versionDate: new Date('2024-05-13T19:00:00Z'),
     // versionDate: new Date('2024-05-09T00:00:00Z'),
     versionCoverImage: coverV116,
     items: [
@@ -75,7 +77,16 @@ export const NewsItems: NewsItem[] = [
       { text: <>More: <B issue={517}>code soft-wrap</B>, selection toolbar, <B issue={507}>3x faster</B> on Apple silicon</>, issue: 507 },
       { text: <>Updated <B>Anthropic</B>*, <B>Groq</B>, <B>Ollama</B>, <B>OpenAI</B>*, <B>OpenRouter</B>*, and <B>Perplexity</B></> },
       { text: <>Developers: update LLMs data structures</>, dev: true },
-      { text: <>1.16.1: Support for <B>OpenAI</B> <B href='https://openai.com/index/hello-gpt-4o/'>GPT-4o</B> (refresh your OpenAI models)</> },
+      { text: <>1.16.1: Support for <B>OpenAI</B> <B href='https://openai.com/index/hello-gpt-4o/'>GPT-4o</B></> },
+      { text: <>1.16.2: Proper <B>Gemini</B> support, <B>HTML/Markdown</B> downloads, and latest <B>Mistral</B></> },
+      { text: <>1.16.3: Support for <B href='https://www.anthropic.com/news/claude-3-5-sonnet'>Claude 3.5 Sonnet</B> (refresh your <B>Anthropic</B> models)</> },
+      { text: <>1.16.4: <B>8192 tokens</B> support for Claude 3.5 Sonnet</> },
+      { text: <>1.16.5: OpenAI <B>GPT-4o Mini</B> support</> },
+      { text: <>1.16.6: Groq <B>Llama 3.1</B> support</> },
+      { text: <>1.16.7: Gpt-4o <B>2024-08-06</B></> },
+      { text: <>1.16.8: <B>ChatGPT-4o</B> latest</> },
+      { text: <>1.16.9: <B>Gemini</B> fixes</> },
+      { text: <>OpenAI <B>o1</B>, DeepSeek R1, and newer models require Big-AGI 2. <B href='https://y2rjg0zillz.typeform.com/to/ZSADpr5u?utm_source=gh-stable&utm_medium=news&utm_campaign=ea2'>Sign up here</B></> },
     ],
   },
   {
@@ -7,7 +7,7 @@ import { useAppStateStore } from '~/common/state/store-appstate';


 // update this variable every time you want to broadcast a new version to clients
-export const incrementalNewsVersion: number = 16.1;
+export const incrementalNewsVersion: number = 16.1; // not notifying for 1.16.9


 interface NewsState {
@@ -1,4 +1,5 @@
 import * as React from 'react';
+import { v4 as uuidv4 } from 'uuid';

 import { Alert, Box, Button, Card, CardContent, CircularProgress, Divider, FormLabel, Grid, IconButton, LinearProgress, Tab, tabClasses, TabList, TabPanel, Tabs, Typography } from '@mui/joy';
 import AddIcon from '@mui/icons-material/Add';
@@ -102,8 +103,11 @@ export function Creator(props: { display: boolean }) {
     strings: editedInstructions, stringEditors: instructionEditors,
   } = useFormEditTextArray(Prompts, PromptTitles);

-  const creationChainSteps = React.useMemo(() => {
-    return createChain(editedInstructions, PromptTitles);
+  const { steps: creationChainSteps, id: chainId } = React.useMemo(() => {
+    return {
+      steps: createChain(editedInstructions, PromptTitles),
+      id: uuidv4(),
+    };
   }, [editedInstructions]);

   const llmLabel = personaLlm?.label || undefined;
@@ -122,7 +126,7 @@ export function Creator(props: { display: boolean }) {
     chainError,
     userCancelChain,
     restartChain,
-  } = useLLMChain(creationChainSteps, personaLlm?.id, chainInputText ?? undefined, savePersona);
+  } = useLLMChain(creationChainSteps, personaLlm?.id, chainInputText ?? undefined, savePersona, 'persona-extract', chainId);


   // Reset the relevant state when the selected tab changes
@@ -200,7 +200,7 @@ export function SettingsModal(props: {

         <TabPanel value={PreferencesTab.Tools} variant='outlined' sx={{ p: 'var(--Tabs-gap)', borderRadius: 'md' }}>
           <Topics>
-            <Topic icon={<SearchIcon />} title='Browsing' startCollapsed>
+            <Topic icon={<SearchIcon />} title='Browsing'>
               <BrowseSettings />
             </Topic>
             <Topic icon={<SearchIcon />} title='Google Search API' startCollapsed>
@@ -0,0 +1,15 @@
+import * as React from 'react';
+
+import { Typography } from '@mui/joy';
+
+import CheckRoundedIcon from '@mui/icons-material/CheckRounded';
+
+
+export function AlreadySet(props: { required?: boolean }) {
+  return (
+    <Typography level='body-sm' startDecorator={props.required ? undefined : <CheckRoundedIcon color='success' />}>
+      {/*Installed Already*/}
+      {props.required ? 'required' : 'Already set on server'}
+    </Typography>
+  );
+}
@@ -5,7 +5,7 @@ import { v4 as uuidv4 } from 'uuid';

 import { DLLMId, getChatLLMId } from '~/modules/llms/store-llms';

-import { IDB_MIGRATION_INITIAL, idbStateStorage } from '../util/idbUtils';
+import { idbStateStorage } from '../util/idbUtils';
 import { countModelTokens } from '../util/token-counter';
 import { defaultSystemPurposeId, SystemPurposeId } from '../../data';

@@ -407,10 +407,7 @@ export const useChatStore = create<ConversationsStore>()(devtools(
       storage: createJSONStorage(() => idbStateStorage),

       // Migrations
-      migrate: (persistedState: unknown, fromVersion: number): ConversationsStore => {
-        // -1 -> 3: migration loading from localStorage to IndexedDB
-        if (fromVersion === IDB_MIGRATION_INITIAL)
-          return _migrateLocalStorageData() as any;
+      migrate: (persistedState: unknown, _fromVersion: number): ConversationsStore => {

         // other: just proceed
         return persistedState as any;
@@ -465,32 +462,6 @@ function getNextBranchTitle(currentTitle: string): string {
   return `(1) ${currentTitle}`;
 }

-/**
- * Returns the chats stored in the localStorage, and rename the key for
- * backup/data loss prevention purposes
- */
-function _migrateLocalStorageData(): ChatState | {} {
-  const key = 'app-chats';
-  const value = localStorage.getItem(key);
-  if (!value) return {};
-  try {
-    // parse the localStorage state
-    const localStorageState = JSON.parse(value)?.state;
-
-    // backup and delete the localStorage key
-    const backupKey = `${key}-v2`;
-    localStorage.setItem(backupKey, value);
-    localStorage.removeItem(key);
-
-    // match the state from localstorage
-    return {
-      conversations: localStorageState?.conversations ?? [],
-    };
-  } catch (error) {
-    console.error('LocalStorage migration error', error);
-    return {};
-  }
-}
-
 /**
  * Convenience function to count the tokens in a DMessage object
@@ -1,10 +1,6 @@
 import type { StateStorage } from 'zustand/middleware';
 import { del as idbDel, get as idbGet, set as idbSet } from 'idb-keyval';

-// used by the state storage middleware to detect data migration from the old state storage (localStorage)
-// NOTE: remove past 2024-03-19 (6 months past release of this utility conversion)
-export const IDB_MIGRATION_INITIAL = -1;
-

 // set to true to enable debugging
 const DEBUG_SCHEDULER = false;
@@ -130,17 +126,6 @@ export const idbStateStorage: StateStorage = {
     if (DEBUG_SCHEDULER)
       console.warn(' (read bytes:', value?.length?.toLocaleString(), ')');

-    /* IMPORTANT!
-     * We modify the default behavior of `getItem` to return a {version: -1} object if a key is not found.
-     * This is to trigger the migration across state storage implementations, as Zustand would not call the
-     * 'migrate' function otherwise.
-     * See 'https://github.com/enricoros/big-agi/pull/158' for more details
-     */
-    if (value === undefined) {
-      return JSON.stringify({
-        version: IDB_MIGRATION_INITIAL,
-      });
-    }
     return value || null;
   },
   setItem: (name: string, value: string): void => {
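For context on what these two hunks retire: per the deleted comment (and PR #158), zustand's persist middleware only invokes `migrate` when `getItem` returns a versioned payload, so the old code reported a fake version -1 state whenever the IndexedDB key was absent, forcing a one-time localStorage-to-IndexedDB import. A condensed sketch of that retired mechanism, paraphrasing the deleted code above rather than describing any new behavior:

```typescript
import { get as idbGet } from 'idb-keyval';

// Condensed from the code deleted above: when IndexedDB has no entry yet,
// report a fake persisted state at version -1 so zustand calls migrate(),
// which then imported the legacy localStorage chats.
const IDB_MIGRATION_INITIAL = -1;

const getItem = async (name: string): Promise<string | null> => {
  const value = await idbGet<string>(name);
  if (value === undefined)
    return JSON.stringify({ version: IDB_MIGRATION_INITIAL }); // force migrate()
  return value || null;
};
```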
@@ -8,8 +8,11 @@ export function prettyBaseModel(model: string | undefined): string {
   if (!model) return '';
   if (model.includes('gpt-4-vision-preview')) return 'GPT-4 Vision';
   if (model.includes('gpt-4-1106-preview')) return 'GPT-4 Turbo';
-  if (model.includes('gpt-4-32k')) return 'gpt-4-32k';
-  if (model.includes('gpt-4')) return 'gpt-4';
+  if (model.includes('gpt-4-32k')) return 'GPT-4-32k';
+  if (model.includes('gpt-4o-mini')) return 'GPT-4o Mini';
+  if (model.includes('gpt-4o')) return 'GPT-4o';
+  if (model.includes('gpt-4-turbo')) return 'GPT-4 Turbo';
+  if (model.includes('gpt-4')) return 'GPT-4';
   if (model.includes('gpt-3.5-turbo-instruct')) return '3.5 Turbo Instruct';
   if (model.includes('gpt-3.5-turbo-1106')) return '3.5 Turbo 16k';
   if (model.includes('gpt-3.5-turbo-16k')) return '3.5 Turbo 16k';
@@ -6,14 +6,20 @@ import { DLLMId, findLLMOrThrow } from '~/modules/llms/store-llms';
 // Do not set this to true in production, it's very verbose
 const DEBUG_TOKEN_COUNT = false;

+// Globals
+// const tokenEncodings: string[] = ['gpt2', 'r50k_base', 'p50k_base', 'p50k_edit', 'cl100k_base', 'o200k_base'] satisfies TiktokenEncoding[];

-// global symbols to dynamically load the Tiktoken library
+// Global symbols to dynamically load the Tiktoken library
 let get_encoding: ((encoding: TiktokenEncoding) => Tiktoken) | null = null;
 let encoding_for_model: ((model: TiktokenModel) => Tiktoken) | null = null;
 let preloadPromise: Promise<void> | null = null;
 let informTheUser = false;

-export function preloadTiktokenLibrary() {
+/**
+ * Preloads the Tiktoken library if not already loaded.
+ * @returns {Promise<void>} A promise that resolves when the library is loaded.
+ */
+export function preloadTiktokenLibrary(): Promise<void> {
   if (!preloadPromise) {
     preloadPromise = import('tiktoken')
       .then(tiktoken => {
@@ -33,16 +39,21 @@ export function preloadTiktokenLibrary() {


 /**
- * Wrapper around the Tiktoken library, to keep tokenizers for all models in a cache
- *
- * We also preload the tokenizer for the default model, so that the first time a user types
- * a message, it doesn't stall loading the tokenizer.
+ * Wrapper around the Tiktoken library to keep tokenizers for all models in a cache.
+ * Also, preloads the tokenizer for the default model to avoid initial stall.
 */
 export const countModelTokens: (text: string, llmId: DLLMId, debugFrom: string) => number | null = (() => {
   // return () => 0;
   const tokenEncoders: { [modelId: string]: Tiktoken } = {};
-  let encodingCL100K: Tiktoken | null = null;
+  let encodingDefault: Tiktoken | null = null;

+  /**
+   * Counts the tokens in the given text for the specified model.
+   * @param {string} text - The text to tokenize.
+   * @param {DLLMId} llmId - The ID of the LLM.
+   * @param {string} debugFrom - Debug information.
+   * @returns {number | null} The token count or null if not ready.
+   */
   function _tokenCount(text: string, llmId: DLLMId, debugFrom: string): number | null {

     // The library shall have been preloaded - if not, attempt to start its loading and return null to indicate we're not ready to count
@@ -55,21 +66,23 @@ export const countModelTokens: (text: string, llmId: DLLMId, debugFrom: string)
       return null;
     }

-    const { options: { llmRef: openaiModel } } = findLLMOrThrow(llmId);
+    const openaiModel = findLLMOrThrow(llmId)?.options?.llmRef;
+    if (!openaiModel) throw new Error(`LLM ${llmId} has no LLM reference id`);

     if (!(openaiModel in tokenEncoders)) {
       try {
         tokenEncoders[openaiModel] = encoding_for_model(openaiModel as TiktokenModel);
       } catch (e) {
-        // make sure we recycle the default encoding across all models
-        if (!encodingCL100K)
-          encodingCL100K = get_encoding('cl100k_base');
-        tokenEncoders[openaiModel] = encodingCL100K;
+        // fallback to the default encoding across all models (not just OpenAI - this will be used everywhere..)
+        if (!encodingDefault)
+          encodingDefault = get_encoding('cl100k_base');
+        tokenEncoders[openaiModel] = encodingDefault;
       }
     }
-    let count: number = 0;
-
     // Note: the try/catch shouldn't be necessary, but there could be corner cases where the tiktoken library throws
     // https://github.com/enricoros/big-agi/issues/182
+    let count = 0;
     try {
       count = tokenEncoders[openaiModel]?.encode(text, 'all', [])?.length || 0;
     } catch (e) {
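Usage is unchanged by this refactor: the counter still returns `null` until the dynamically imported tiktoken module is ready. A hypothetical call, with `'composer'` as an illustrative debug label only:

```typescript
// llmId must reference a registered model; null means tiktoken is still loading
// (per the code above, the call also kicks off the library preload in that case).
const tokens = countModelTokens('Hello, big-AGI!', llmId, 'composer');
if (tokens !== null)
  console.log(`prompt size: ${tokens} tokens`);
```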
@@ -1,4 +1,4 @@
-import { llmChatGenerateOrThrow, VChatFunctionIn } from '~/modules/llms/llm.client';
+import { llmChatGenerateOrThrow, VChatFunctionIn, VChatMessageIn } from '~/modules/llms/llm.client';
 import { useModelsStore } from '~/modules/llms/store-llms';

 import { useChatStore } from '~/common/state/store-chats';
@@ -83,13 +83,18 @@ export function autoSuggestions(conversationId: string, assistantMessageId: stri

   // Follow-up: Auto-Diagrams
   if (suggestDiagrams) {
-    void llmChatGenerateOrThrow(funcLLMId, [
-      { role: 'system', content: systemMessage.text },
-      { role: 'user', content: userMessage.text },
-      { role: 'assistant', content: assistantMessageText },
-    ], [suggestPlantUMLFn], 'draw_plantuml_diagram',
+    const instructions: VChatMessageIn[] = [
+      { role: 'system', content: systemMessage.text },
+      { role: 'user', content: userMessage.text },
+      { role: 'assistant', content: assistantMessageText },
+    ];
+    llmChatGenerateOrThrow(
+      funcLLMId,
+      instructions,
+      'chat-followup-diagram', conversationId,
+      [suggestPlantUMLFn], 'draw_plantuml_diagram',
     ).then(chatResponse => {

       // cheap way to check if the function was supported
       if (!('function_arguments' in chatResponse))
         return;

@@ -110,7 +115,8 @@ export function autoSuggestions(conversationId: string, assistantMessageId: stri
         }
       }
     }).catch(err => {
-      console.error('autoSuggestions::diagram:', err);
+      // Likely the model did not support function calling
+      // console.log('autoSuggestions: diagram error:', err);
     });
   }
@@ -1,5 +1,5 @@
 import { getFastLLMId } from '~/modules/llms/store-llms';
-import { llmChatGenerateOrThrow } from '~/modules/llms/llm.client';
+import { llmChatGenerateOrThrow, VChatMessageIn } from '~/modules/llms/llm.client';

 import { useChatStore } from '~/common/state/store-chats';

@@ -34,21 +34,23 @@ export async function conversationAutoTitle(conversationId: string, forceReplace

   try {
     // LLM chat-generate call
+    const instructions: VChatMessageIn[] = [
+      { role: 'system', content: `You are an AI conversation titles assistant who specializes in creating expressive yet few-words chat titles.` },
+      {
+        role: 'user', content:
+          'Analyze the given short conversation (every line is truncated) and extract a concise chat title that ' +
+          'summarizes the conversation in as little as a couple of words.\n' +
+          'Only respond with the lowercase short title and nothing else.\n' +
+          '\n' +
+          '```\n' +
+          historyLines.join('\n') +
+          '```\n',
+      },
+    ];
     const chatResponse = await llmChatGenerateOrThrow(
       fastLLMId,
-      [
-        { role: 'system', content: `You are an AI conversation titles assistant who specializes in creating expressive yet few-words chat titles.` },
-        {
-          role: 'user', content:
-            'Analyze the given short conversation (every line is truncated) and extract a concise chat title that ' +
-            'summarizes the conversation in as little as a couple of words.\n' +
-            'Only respond with the lowercase short title and nothing else.\n' +
-            '\n' +
-            '```\n' +
-            historyLines.join('\n') +
-            '```\n',
-        },
-      ],
+      instructions,
+      'chat-ai-title', conversationId,
       null, null,
     );
@@ -68,7 +68,7 @@ export function DiagramsModal(props: { config: DiagramConfig, onClose: () => voi
   const [diagramLlm, llmComponent] = useFormRadioLlmType('Generator', 'chat');

   // derived state
-  const { conversationId, text: subject } = props.config;
+  const { conversationId, messageId, text: subject } = props.config;
   const diagramLlmId = diagramLlm?.id;

@@ -98,7 +98,7 @@ export function DiagramsModal(props: { config: DiagramConfig, onClose: () => voi
     const diagramPrompt = bigDiagramPrompt(diagramType, diagramLanguage, systemMessage.text, subject, customInstruction);

     try {
-      await llmStreamingChatGenerate(diagramLlm.id, diagramPrompt, null, null, stepAbortController.signal,
+      await llmStreamingChatGenerate(diagramLlm.id, diagramPrompt, 'ai-diagram', messageId, null, null, stepAbortController.signal,
         ({ textSoFar }) => textSoFar && setDiagramCode(diagramCode = textSoFar),
       );
     } catch (error: any) {
@@ -109,7 +109,7 @@ export function DiagramsModal(props: { config: DiagramConfig, onClose: () => voi
       setAbortController(null);
     }

-  }, [abortController, conversationId, diagramLanguage, diagramLlm, diagramType, subject, customInstruction]);
+  }, [abortController, conversationId, customInstruction, diagramLanguage, diagramLlm, diagramType, messageId, subject]);


   // [Effect] Auto-abort on unmount
@@ -117,7 +117,7 @@ export function FlattenerModal(props: {
     await startStreaming(llm.id, [
       { role: 'system', content: flattenProfile.systemPrompt },
       { role: 'user', content: encodeConversationAsUserMessage(flattenProfile.userPrompt, messages) },
-    ]);
+    ], 'ai-flattener', messages[0].id);

   }, [llm, props.conversationId, startStreaming]);
@@ -1,5 +1,5 @@
 import { getFastLLMId } from '~/modules/llms/store-llms';
-import { llmChatGenerateOrThrow } from '~/modules/llms/llm.client';
+import { llmChatGenerateOrThrow, VChatMessageIn } from '~/modules/llms/llm.client';


 const simpleImagineSystemPrompt =
@@ -10,14 +10,15 @@ Provide output as a lowercase prompt and nothing else.`;
 /**
  * Creates a caption for a drawing or photo given some description - used to elevate the quality of the imaging
 */
-export async function imaginePromptFromText(messageText: string): Promise<string | null> {
+export async function imaginePromptFromText(messageText: string, contextRef: string): Promise<string | null> {
   const fastLLMId = getFastLLMId();
   if (!fastLLMId) return null;
   try {
-    const chatResponse = await llmChatGenerateOrThrow(fastLLMId, [
+    const instructions: VChatMessageIn[] = [
       { role: 'system', content: simpleImagineSystemPrompt },
       { role: 'user', content: 'Write a prompt, based on the following input.\n\n```\n' + messageText.slice(0, 1000) + '\n```\n' },
-    ], null, null);
+    ];
+    const chatResponse = await llmChatGenerateOrThrow(fastLLMId, instructions, 'draw-expand-prompt', contextRef, null, null);
     return chatResponse.content?.trim() ?? null;
   } catch (error: any) {
     console.error('imaginePromptFromText: fetch request error:', error);
@@ -132,7 +132,7 @@ export class Agent {
     S.messages.push({ role: 'user', content: prompt });
     let content: string;
     try {
-      content = (await llmChatGenerateOrThrow(llmId, S.messages, null, null, 500)).content;
+      content = (await llmChatGenerateOrThrow(llmId, S.messages, 'chat-react-turn', null, null, null, 500)).content;
     } catch (error: any) {
       content = `Error in llmChatGenerateOrThrow: ${error}`;
     }
@@ -194,7 +194,8 @@ async function search(query: string): Promise<string> {
 async function browse(url: string): Promise<string> {
   try {
     const page = await callBrowseFetchPage(url);
-    return JSON.stringify(page.content ? { text: page.content } : { error: 'Issue reading the page' });
+    const pageContent = page.content.markdown || page.content.text || page.content.html || '';
+    return JSON.stringify(pageContent ? { text: pageContent } : { error: 'Issue reading the page' });
   } catch (error) {
     console.error('Error browsing:', (error as Error).message);
     return 'An error occurred while browsing to the URL. Missing WSS Key?';
@@ -1,5 +1,5 @@
 import { DLLMId, findLLMOrThrow } from '~/modules/llms/store-llms';
-import { llmChatGenerateOrThrow } from '~/modules/llms/llm.client';
+import { llmChatGenerateOrThrow, VChatMessageIn } from '~/modules/llms/llm.client';


 // prompt to be tried when doing recursive summarization.
@@ -80,10 +80,11 @@ async function cleanUpContent(chunk: string, llmId: DLLMId, _ignored_was_targetW
   const autoResponseTokensSize = contextTokens ? Math.floor(contextTokens * outputTokenShare) : null;

   try {
-    const chatResponse = await llmChatGenerateOrThrow(llmId, [
+    const instructions: VChatMessageIn[] = [
       { role: 'system', content: cleanupPrompt },
       { role: 'user', content: chunk },
-    ], null, null, autoResponseTokensSize ?? undefined);
+    ];
+    const chatResponse = await llmChatGenerateOrThrow(llmId, instructions, 'chat-ai-summarize', null, null, null, autoResponseTokensSize ?? undefined);
     return chatResponse?.content ?? '';
   } catch (error: any) {
     return '';
@@ -1,7 +1,7 @@
 import * as React from 'react';

 import { DLLMId, findLLMOrThrow } from '~/modules/llms/store-llms';
-import { llmStreamingChatGenerate, VChatMessageIn } from '~/modules/llms/llm.client';
+import { llmStreamingChatGenerate, VChatContextRef, VChatMessageIn, VChatStreamContextName } from '~/modules/llms/llm.client';


 // set to true to log to the console
@@ -20,7 +20,7 @@ export interface LLMChainStep {
 /**
  * React hook to manage a chain of LLM transformations.
 */
-export function useLLMChain(steps: LLMChainStep[], llmId: DLLMId | undefined, chainInput: string | undefined, onSuccess?: (output: string, input: string) => void) {
+export function useLLMChain(steps: LLMChainStep[], llmId: DLLMId | undefined, chainInput: string | undefined, onSuccess: (output: string, input: string) => void, contextName: VChatStreamContextName, contextRef: VChatContextRef) {

   // state
   const [chain, setChain] = React.useState<ChainState | null>(null);
@@ -114,7 +114,7 @@ export function useLLMChain(steps: LLMChainStep[], llmId: DLLMId | undefined, ch
     setChainStepInterimText(null);

     // LLM call (streaming, cancelable)
-    llmStreamingChatGenerate(llmId, llmChatInput, null, null, stepAbortController.signal,
+    llmStreamingChatGenerate(llmId, llmChatInput, contextName, contextRef, null, null, stepAbortController.signal,
       ({ textSoFar }) => {
         textSoFar && setChainStepInterimText(interimText = textSoFar);
       })
@@ -141,7 +141,7 @@ export function useLLMChain(steps: LLMChainStep[], llmId: DLLMId | undefined, ch
       stepAbortController.abort('step aborted');
       _chainAbortController.signal.removeEventListener('abort', globalToStepListener);
     };
-  }, [chain, llmId, onSuccess]);
+  }, [chain, contextRef, contextName, llmId, onSuccess]);


   return {
@@ -1,7 +1,7 @@
|
||||
import * as React from 'react';
|
||||
|
||||
import type { DLLMId } from '~/modules/llms/store-llms';
|
||||
import { llmStreamingChatGenerate, VChatMessageIn } from '~/modules/llms/llm.client';
|
||||
import { llmStreamingChatGenerate, VChatContextRef, VChatMessageIn, VChatStreamContextName } from '~/modules/llms/llm.client';
|
||||
|
||||
|
||||
export function useStreamChatText() {
|
||||
@@ -13,7 +13,7 @@ export function useStreamChatText() {
|
||||
const abortControllerRef = React.useRef<AbortController | null>(null);
|
||||
|
||||
|
||||
const startStreaming = React.useCallback(async (llmId: DLLMId, prompt: VChatMessageIn[]) => {
|
||||
const startStreaming = React.useCallback(async (llmId: DLLMId, prompt: VChatMessageIn[], contextName: VChatStreamContextName, contextRef: VChatContextRef) => {
|
||||
setStreamError(null);
|
||||
setPartialText(null);
|
||||
setText(null);
|
||||
@@ -24,7 +24,7 @@ export function useStreamChatText() {
|
||||
|
||||
try {
|
||||
let lastText = '';
|
||||
await llmStreamingChatGenerate(llmId, prompt, null, null, abortControllerRef.current.signal, ({ textSoFar }) => {
|
||||
await llmStreamingChatGenerate(llmId, prompt, contextName, contextRef, null, null, abortControllerRef.current.signal, ({ textSoFar }) => {
|
||||
if (textSoFar) {
|
||||
lastText = textSoFar;
|
||||
setPartialText(lastText);
|
||||
|
||||
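The same context pair flows through the streaming entry point. A hedged sketch of a standalone call site (signature per the two hunks above; 'beam-scatter' is one of the context names used elsewhere in this patch, and `refId` stands for whatever object id the stream belongs to):

```ts
import type { DLLMId } from '~/modules/llms/store-llms';
import { llmStreamingChatGenerate, VChatMessageIn } from '~/modules/llms/llm.client';

async function streamOnce(llmId: DLLMId, prompt: VChatMessageIn[], refId: string): Promise<string> {
  const abortController = new AbortController();
  let lastText = '';
  await llmStreamingChatGenerate(llmId, prompt, 'beam-scatter', refId, null, null, abortController.signal,
    ({ textSoFar }) => {
      if (textSoFar)
        lastText = textSoFar; // accumulate the partial text as it streams in
    });
  return lastText;
}
```
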
@@ -30,32 +30,30 @@ export function BeamView(props: {

// external state
const { novel: explainerUnseen, touch: explainerCompleted, forget: explainerShow } = useUICounter('beam-wizard');
const gatherAutoStartAfterScatter = useModuleBeamStore(state => state.gatherAutoStartAfterScatter);
const {
/* root */ editInputHistoryMessage,
/* scatter */ setRayCount, startScatteringAll, stopScatteringAll,
} = props.beamStore.getState();
const {
/* root */ inputHistory, inputIssues, inputReady,
/* scatter */ isScattering, raysReady,
/* scatter */ hadImportedRays, isScattering, raysReady,
/* gather (composite) */ canGather,
/* IDs */ rayIds, fusionIds,
} = useBeamStore(props.beamStore, useShallow(state => ({
// input
inputHistory: state.inputHistory,
inputIssues: state.inputIssues,
inputReady: state.inputReady,
// scatter
hadImportedRays: state.hadImportedRays,
isScattering: state.isScattering,
raysReady: state.raysReady,
// gather (composite)
canGather: state.raysReady >= 2 && state.currentFactoryId !== null && state.currentGatherLlmId !== null,
// IDs
rayIds: state.rays.map(ray => ray.rayId),
fusionIds: state.fusions.map(fusion => fusion.fusionId),
})));
const { gatherAutoStartAfterScatter } = useModuleBeamStore(useShallow(state => ({
gatherAutoStartAfterScatter: state.gatherAutoStartAfterScatter,
})));
// the following are independent because of useShallow, which would break in the above call
const rayIds = useBeamStore(props.beamStore, useShallow(state => state.rays.map(ray => ray.rayId)));
const fusionIds = useBeamStore(props.beamStore, useShallow(state => state.fusions.map(fusion => fusion.fusionId)));

// derived state
const raysCount = rayIds.length;
@@ -173,6 +171,7 @@ export function BeamView(props: {
beamStore={props.beamStore}
isMobile={props.isMobile}
rayIds={rayIds}
hadImportedRays={hadImportedRays}
onIncreaseRayCount={handleRayIncreaseCount}
// linkedLlmId={currentGatherLlmId}
/>

@@ -13,6 +13,7 @@ import { BeamStoreApi, useBeamStore } from '../store-beam.hooks';
import { FFactoryId, FUSION_FACTORIES } from './instructions/beam.gather.factories';
import { GATHER_COLOR } from '../beam.config';
import { beamPaneSx } from '../BeamCard';
import { useModuleBeamStore } from '../store-module-beam';


const gatherPaneClasses = {
@@ -79,8 +80,9 @@ export function BeamGatherPane(props: {
setCurrentFactoryId: state.setCurrentFactoryId,
setCurrentGatherLlmId: state.setCurrentGatherLlmId,
})));
const gatherAutoStartAfterScatter = useModuleBeamStore(state => state.gatherAutoStartAfterScatter);
const [_, gatherLlmComponent/*, gatherLlmIcon*/] = useLLMSelect(
currentGatherLlmId, setCurrentGatherLlmId, props.isMobile ? '' : 'Merge Model', true, !props.canGather,
currentGatherLlmId, setCurrentGatherLlmId, props.isMobile ? '' : 'Merge Model', true, !props.canGather && !gatherAutoStartAfterScatter,
);

// derived state

@@ -170,23 +170,24 @@ export function Fusion(props: {
<GoodTooltip title='Use this message'>
<IconButton
size='sm'
// variant='solid'
// variant='plain'
color={GATHER_COLOR}
disabled={isFusing}
onClick={handleFusionUse}
// endDecorator={<TelegramIcon />}
sx={{
// ...BEAM_BTN_SX,
// fontSize: 'xs',
fontSize: 'xs',
// '--Icon-fontSize': 'var(--joy-fontSize-xl)',
// backgroundColor: 'background.popup',
// border: '1px solid',
// borderColor: `${GATHER_COLOR}.outlinedBorder`,
// boxShadow: `0 4px 16px -4px rgb(var(--joy-palette-${GATHER_COLOR}-mainChannel) / 20%)`,
animation: `${animationEnterBelow} 0.1s ease-out`,
// whiteSpace: 'nowrap',
whiteSpace: 'nowrap',
}}
>
{/*Ok*/}
{/*Use*/}
<TelegramIcon />
</IconButton>
</GoodTooltip>

@@ -96,7 +96,7 @@ export async function executeChatGenerate(_i: ChatGenerateInstruction, inputs: E
};

// LLM Streaming generation
return streamAssistantMessage(inputs.llmId, history, getUXLabsHighPerformance() ? 0 : 1, 'off', onMessageUpdate, inputs.chainAbortController.signal)
return streamAssistantMessage(inputs.llmId, history, 'beam-gather', inputs.contextRef, getUXLabsHighPerformance() ? 0 : 1, 'off', onMessageUpdate, inputs.chainAbortController.signal)
.then((status) => {
// re-throw errors, as streamAssistantMessage catches internally
if (status.outcome === 'aborted') {

@@ -23,6 +23,7 @@ export interface ExecutionInputState {
readonly chatMessages: DMessage[];
readonly rayMessages: DMessage[];
readonly llmId: DLLMId;
readonly contextRef: string; // not useful
// interaction
readonly chainAbortController: AbortController;
readonly updateProgressComponent: (component: React.ReactNode) => void;
@@ -67,6 +68,7 @@ export function gatherStartFusion(
chatMessages: chatMessages,
rayMessages: rayMessages,
llmId: initialFusion.llmId,
contextRef: initialFusion.fusionId,
// interaction
chainAbortController: new AbortController(),
updateProgressComponent: (component: React.ReactNode) => onUpdateBFusion({ fusingProgressComponent: component }),

@@ -16,6 +16,7 @@ import type { DLLMId } from '~/modules/llms/store-llms';

import { GoodTooltip } from '~/common/components/GoodTooltip';
import { InlineError } from '~/common/components/InlineError';
import { animationEnterBelow } from '~/common/util/animUtils';
import { copyToClipboard } from '~/common/util/clipboardUtils';
import { useLLMSelect } from '~/common/components/forms/useLLMSelect';

@@ -109,7 +110,8 @@ function RayControls(props: {

export function BeamRay(props: {
beamStore: BeamStoreApi,
isRemovable: boolean
hadImportedRays: boolean
isRemovable: boolean,
rayId: string,
rayIndexWeak: number,
// linkedLlmId: DLLMId | null,
@@ -240,16 +242,20 @@ export function BeamRay(props: {
<GoodTooltip title='Choose this message'>
<IconButton
size='sm'
// variant='plain'
color={GATHER_COLOR}
disabled={isImported || isScattering}
onClick={handleRayUse}
// endDecorator={!isImported ? <TelegramIcon /> : null}
sx={{
fontSize: 'xs',
// '--Icon-fontSize': 'var(--joy-fontSize-xl)',
px: isImported ? 1 : undefined,
animation: `${animationEnterBelow} 0.1s ease-out`,
whiteSpace: 'nowrap',
}}
>
{isImported ? 'From Chat' : /*'Use'*/ <TelegramIcon />}
{isImported ? 'From Chat' : /*props.hadImportedRays ? 'Replace' : 'Use'*/ <TelegramIcon />}
</IconButton>
</GoodTooltip>

@@ -27,6 +27,7 @@ const rayGridMobileSx: SxProps = {

export function BeamRayGrid(props: {
beamStore: BeamStoreApi,
hadImportedRays: boolean
isMobile: boolean,
onIncreaseRayCount: () => void,
rayIds: string[],
@@ -44,6 +45,7 @@ export function BeamRayGrid(props: {
key={'ray-' + rayId}
rayIndexWeak={index}
beamStore={props.beamStore}
hadImportedRays={props.hadImportedRays}
isRemovable={raysCount > SCATTER_RAY_MIN}
rayId={rayId}
// linkedLlmId={props.linkedLlmId}

@@ -67,7 +67,7 @@ function rayScatterStart(ray: BRay, llmId: DLLMId | null, inputHistory: DMessage

// stream the assistant's messages
const messagesHistory: VChatMessageIn[] = inputHistory.map(({ role, text }) => ({ role, content: text }));
streamAssistantMessage(llmId, messagesHistory, getUXLabsHighPerformance() ? 0 : rays.length, 'off', updateMessage, abortController.signal)
streamAssistantMessage(llmId, messagesHistory, 'beam-scatter', ray.rayId, getUXLabsHighPerformance() ? 0 : rays.length, 'off', updateMessage, abortController.signal)
.then((status) => {
_rayUpdate(ray.rayId, {
status: (status.outcome === 'success') ? 'success'
@@ -134,6 +134,7 @@ export function rayIsImported(ray: BRay | null): boolean {
interface ScatterStateSlice {

rays: BRay[];
hadImportedRays: boolean;

// derived state
isScattering: boolean; // true if any ray is scattering at the moment
@@ -148,6 +149,7 @@ export const reInitScatterStateSlice = (prevRays: BRay[]): ScatterStateSlice =>
return {
// (remember) keep the same quantity of rays and same llms
rays: prevRays.map(prevRay => createBRay(prevRay.rayLlmId)),
hadImportedRays: false,

isScattering: false,
raysReady: 0,
@@ -238,6 +240,7 @@ export const createScatterSlice: StateCreator<RootStoreSlice & ScatterStoreSlice
// append the other rays (excluding the ones to remove)
...rays.filter((ray) => !raysToRemove.includes(ray)),
],
hadImportedRays: messages.length > 0,
});
_storeLastScatterConfig();
_syncRaysStateToScatter();

@@ -70,7 +70,7 @@ const createRootSlice: StateCreator<BeamStore, [], [], RootStoreSlice> = (_set,

open: (chatHistory: Readonly<DMessage[]>, initialChatLlmId: DLLMId | null, callback: BeamSuccessCallback) => {
const { isOpen: wasAlreadyOpen, terminateKeepingSettings, loadBeamConfig, setRayLlmIds, setCurrentGatherLlmId } = _get();
const { isOpen: wasAlreadyOpen, terminateKeepingSettings, loadBeamConfig, hadImportedRays, setRayLlmIds, setCurrentGatherLlmId } = _get();

// reset pending operations
terminateKeepingSettings();
@@ -89,6 +89,7 @@ const createRootSlice: StateCreator<BeamStore, [], [], RootStoreSlice> = (_set,
onSuccessCallback: callback,

// rays already reset
hadImportedRays,

// update the model only if the dialog was not already open
...(!wasAlreadyOpen && initialChatLlmId && {

@@ -1,10 +1,12 @@
import * as React from 'react';
import { shallow } from 'zustand/shallow';
import { useShallow } from 'zustand/react/shallow';

import { Checkbox, FormControl, FormHelperText } from '@mui/joy';
import { Checkbox, FormControl, FormHelperText, Option, Select, Typography } from '@mui/joy';

import { AlreadySet } from '~/common/components/AlreadySet';
import { ExternalLink } from '~/common/components/ExternalLink';
import { FormInputKey } from '~/common/components/forms/FormInputKey';
import { Link } from '~/common/components/Link';
import { FormLabelStart } from '~/common/components/forms/FormLabelStart';
import { platformAwareKeystrokes } from '~/common/components/KeyStroke';

import { useBrowseCapability, useBrowseStore } from './store-module-browsing';
@@ -13,50 +15,82 @@ import { useBrowseCapability, useBrowseStore } from './store-module-browsing';
export function BrowseSettings() {

// external state
const { mayWork, isServerConfig, isClientValid, inCommand, inComposer, inReact } = useBrowseCapability();
const { wssEndpoint, setWssEndpoint, setEnableCommandBrowse, setEnableComposerAttach, setEnableReactTool } = useBrowseStore(state => ({
const { mayWork, isServerConfig, isClientValid, inCommand, inComposer, inReact, inPersonas } = useBrowseCapability();
const {
wssEndpoint, setWssEndpoint,
pageTransform, setPageTransform,
setEnableCommandBrowse, setEnableComposerAttach, setEnableReactTool, setEnablePersonaTool,
} = useBrowseStore(useShallow(state => ({
wssEndpoint: state.wssEndpoint,
pageTransform: state.pageTransform,
setPageTransform: state.setPageTransform,
setWssEndpoint: state.setWssEndpoint,
setEnableCommandBrowse: state.setEnableCommandBrowse,
setEnableComposerAttach: state.setEnableComposerAttach,
setEnableReactTool: state.setEnableReactTool,
}), shallow);
setEnablePersonaTool: state.setEnablePersonaTool,
})));

const handlePageTransformChange = (_event: any, value: typeof pageTransform | null) => value && setPageTransform(value);

return <>

<FormHelperText sx={{ display: 'block' }}>
Configure a browsing service to enable loading links and pages. See the <Link
href='https://github.com/enricoros/big-agi/blob/main/docs/config-feature-browse.md' target='_blank' noLinkStyle>
browse configuration guide</Link> for more information.
</FormHelperText>
<Typography level='body-sm'>
Configure Browsing to enable loading links and web pages. <ExternalLink
href='https://github.com/enricoros/big-agi/blob/main/docs/config-feature-browse.md'>
Learn more</ExternalLink>.
</Typography>

<FormInputKey
autoCompleteId='browse-wss' label='Puppeteer Endpoint' noKey
autoCompleteId='browse-wss' label='Puppeteer Wss' noKey
value={wssEndpoint} onChange={setWssEndpoint}
rightLabel={!isServerConfig ? 'required' : '✔️ already set in server'}
rightLabel={<AlreadySet required={!isServerConfig} />}
required={!isServerConfig} isError={!isClientValid && !isServerConfig}
placeholder='wss://...'
/>

<FormControl orientation='horizontal' sx={{ justifyContent: 'space-between', alignItems: 'center' }}>
<FormLabelStart title='Load pages as:' />
<Select
variant='outlined'
value={pageTransform} onChange={handlePageTransformChange}
slotProps={{
root: { sx: { minWidth: '140px' } },
indicator: { sx: { opacity: 0.5 } },
button: { sx: { whiteSpace: 'inherit' } },
}}
>
<Option value='text'>Text (default)</Option>
<Option value='markdown'>Markdown</Option>
<Option value='html'>HTML</Option>
</Select>
</FormControl>

<Typography level='body-sm' sx={{ mt: 2 }}>Browsing enablement:</Typography>

<FormControl disabled={!mayWork}>
<Checkbox variant='outlined' label='Attach URLs' checked={inComposer} onChange={(event) => setEnableComposerAttach(event.target.checked)} />
<FormHelperText>{platformAwareKeystrokes('Load and attach a page when pasting a URL')}</FormHelperText>
<Checkbox size='sm' label='Paste URLs' checked={inComposer} onChange={(event) => setEnableComposerAttach(event.target.checked)} />
<FormHelperText>{platformAwareKeystrokes('Load and attach when pasting a URL')}</FormHelperText>
</FormControl>

<FormControl disabled={!mayWork}>
<Checkbox variant='outlined' label='/browse' checked={inCommand} onChange={(event) => setEnableCommandBrowse(event.target.checked)} />
<Checkbox size='sm' label='/browse' checked={inCommand} onChange={(event) => setEnableCommandBrowse(event.target.checked)} />
<FormHelperText>{platformAwareKeystrokes('Use /browse to load a web page')}</FormHelperText>
</FormControl>

<FormControl disabled={!mayWork}>
<Checkbox variant='outlined' label='ReAct' checked={inReact} onChange={(event) => setEnableReactTool(event.target.checked)} />
<Checkbox size='sm' label='ReAct' checked={inReact} onChange={(event) => setEnableReactTool(event.target.checked)} />
<FormHelperText>Enables loadURL() in ReAct</FormHelperText>
</FormControl>

{/*<FormControl disabled>*/}
{/* <Checkbox variant='outlined' label='Personas' checked={inPersonas} onChange={(event) => setEnablePersonaTool(event.target.checked)} />*/}
{/* <FormHelperText>Enable loading URLs by Personas</FormHelperText>*/}
{/*</FormControl>*/}
<FormControl disabled>
<Checkbox size='sm' label='Chat with Personas' checked={false} onChange={(event) => setEnablePersonaTool(event.target.checked)} />
<FormHelperText>Not yet available</FormHelperText>
{/*<FormHelperText>Enable loading URLs by Personas</FormHelperText>*/}
</FormControl>

</>;
}

@@ -7,31 +7,39 @@ import { apiAsyncNode } from '~/common/util/trpc.client';
const DEBUG_SHOW_SCREENSHOT = false;


export async function callBrowseFetchPage(url: string) {
// export function

// throw if no URL is provided
export async function callBrowseFetchPage(
url: string,
// transforms?: BrowsePageTransform[],
// screenshotOptions?: { width: number, height: number, quality?: number },
) {

// validate url
url = url?.trim() || '';
if (!url)
throw new Error('Browsing error: Invalid URL');

// assume https if no protocol is provided
// noinspection HttpUrlsUsage
// noinspection HttpUrlsUsage: assume https if no protocol is provided
if (!url.startsWith('http://') && !url.startsWith('https://'))
url = 'https://' + url;

const clientWssEndpoint = useBrowseStore.getState().wssEndpoint;
const { wssEndpoint, pageTransform } = useBrowseStore.getState();

const { pages } = await apiAsyncNode.browse.fetchPages.mutate({
access: {
dialect: 'browse-wss',
...(!!clientWssEndpoint && { wssEndpoint: clientWssEndpoint }),
...(!!wssEndpoint && { wssEndpoint }),
},
subjects: [{ url }],
screenshot: DEBUG_SHOW_SCREENSHOT ? {
width: 512,
height: 512,
// quality: 100,
} : undefined,
requests: [{
url,
transforms: /*transforms ? transforms :*/ [pageTransform],
screenshot: /*screenshotOptions ? screenshotOptions :*/ !DEBUG_SHOW_SCREENSHOT ? undefined : {
width: 512,
height: 512,
// quality: 100,
},
}],
});

if (pages.length !== 1)
@@ -42,7 +50,7 @@ export async function callBrowseFetchPage(url: string) {
// DEBUG: if there's a screenshot, append it to the dom
if (DEBUG_SHOW_SCREENSHOT && page.screenshot) {
const img = document.createElement('img');
img.src = page.screenshot.imageDataUrl;
img.src = page.screenshot.webpDataUrl;
img.style.width = `${page.screenshot.width}px`;
img.style.height = `${page.screenshot.height}px`;
document.body.appendChild(img);
@@ -51,7 +59,7 @@ export async function callBrowseFetchPage(url: string) {
// throw if there's an error
if (page.error) {
console.warn('Browsing service error:', page.error);
if (!page.content)
if (!Object.keys(page.content).length)
throw new Error(page.error);
}

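To make the payload reshaping concrete, here is a sketch of the new `fetchPages` input next to the old shape (the endpoint and URLs are illustrative placeholders):

```ts
// Old shape: one screenshot setting shared by all subjects
//   { access, subjects: [{ url }], screenshot? }
// New shape: each request carries its own transforms and optional screenshot
const input = {
  access: {
    dialect: 'browse-wss' as const,
    wssEndpoint: 'wss://chrome.example.com', // hypothetical client-side endpoint
  },
  requests: [
    { url: 'https://example.com', transforms: ['markdown' as const] },
    { url: 'https://example.org', transforms: ['text' as const, 'html' as const], screenshot: { width: 512, height: 512 } },
  ],
};
```
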
@@ -1,6 +1,9 @@
import { z } from 'zod';
import { TRPCError } from '@trpc/server';
import { BrowserContext, connect, ScreenshotOptions, TimeoutError } from '@cloudflare/puppeteer';

import { BrowserContext, connect, ScreenshotOptions } from '@cloudflare/puppeteer';
import { default as TurndownService } from 'turndown';
import { load as cheerioLoad } from 'cheerio';

import { createTRPCRouter, publicProcedure } from '~/server/api/trpc.server';
import { env } from '~/server/env.mjs';
@@ -16,17 +19,22 @@ const browseAccessSchema = z.object({
dialect: z.enum(['browse-wss']),
wssEndpoint: z.string().trim().optional(),
});
type BrowseAccessSchema = z.infer<typeof browseAccessSchema>;

const pageTransformSchema = z.enum(['html', 'text', 'markdown']);
type PageTransformSchema = z.infer<typeof pageTransformSchema>;

const fetchPageInputSchema = z.object({
access: browseAccessSchema,
subjects: z.array(z.object({
requests: z.array(z.object({
url: z.string().url(),
transforms: z.array(pageTransformSchema),
screenshot: z.object({
width: z.number(),
height: z.number(),
quality: z.number().optional(),
}).optional(),
})),
screenshot: z.object({
width: z.number(),
height: z.number(),
quality: z.number().optional(),
}).optional(),
});

@@ -34,16 +42,18 @@ const fetchPageInputSchema = z.object({

const fetchPageWorkerOutputSchema = z.object({
url: z.string(),
content: z.string(),
content: z.record(pageTransformSchema, z.string()),
error: z.string().optional(),
stopReason: z.enum(['end', 'timeout', 'error']),
screenshot: z.object({
imageDataUrl: z.string().startsWith('data:image/'),
webpDataUrl: z.string().startsWith('data:image/webp'),
mimeType: z.string().startsWith('image/'),
width: z.number(),
height: z.number(),
}).optional(),
});
type FetchPageWorkerOutputSchema = z.infer<typeof fetchPageWorkerOutputSchema>;

const fetchPagesOutputSchema = z.object({
pages: z.array(fetchPageWorkerOutputSchema),
@@ -55,21 +65,23 @@ export const browseRouter = createTRPCRouter({
fetchPages: publicProcedure
.input(fetchPageInputSchema)
.output(fetchPagesOutputSchema)
.mutation(async ({ input: { access, subjects, screenshot } }) => {
const pages: FetchPageWorkerOutputSchema[] = [];
.mutation(async ({ input: { access, requests } }) => {

for (const subject of subjects) {
try {
pages.push(await workerPuppeteer(access, subject.url, screenshot?.width, screenshot?.height, screenshot?.quality));
} catch (error: any) {
pages.push({
url: subject.url,
content: '',
error: error?.message || JSON.stringify(error) || 'Unknown fetch error',
const pagePromises = requests.map(request =>
workerPuppeteer(access, request.url, request.transforms, request.screenshot));

const results = await Promise.allSettled(pagePromises);

const pages: FetchPageWorkerOutputSchema[] = results.map((result, index) =>
result.status === 'fulfilled'
? result.value
: {
url: requests[index].url,
content: {},
error: result.reason?.message || 'Unknown fetch error',
stopReason: 'error',
});
}
}
},
);

return { pages };
}),

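The rewrite above replaces the serial try/catch loop with a parallel fan-out. The pattern in isolation, as a self-contained sketch:

```ts
// Run all workers in parallel; map each rejection into an error record so one
// failed page no longer aborts (or serializes) the whole batch.
async function fetchAllSettled<T>(urls: string[], worker: (url: string) => Promise<T>) {
  const results = await Promise.allSettled(urls.map(worker));
  return results.map((result, index) =>
    result.status === 'fulfilled'
      ? { url: urls[index], value: result.value }
      : { url: urls[index], error: (result.reason as Error)?.message || 'Unknown fetch error' });
}
```
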
@@ -77,12 +89,13 @@ export const browseRouter = createTRPCRouter({
});

type BrowseAccessSchema = z.infer<typeof browseAccessSchema>;
type FetchPageWorkerOutputSchema = z.infer<typeof fetchPageWorkerOutputSchema>;
async function workerPuppeteer(
access: BrowseAccessSchema,
targetUrl: string,
transforms: PageTransformSchema[],
screenshotOptions?: { width: number, height: number, quality?: number },
): Promise<FetchPageWorkerOutputSchema> {

async function workerPuppeteer(access: BrowseAccessSchema, targetUrl: string, ssWidth: number | undefined, ssHeight: number | undefined, ssQuality: number | undefined): Promise<FetchPageWorkerOutputSchema> {

// access
const browserWSEndpoint = (access.wssEndpoint || env.PUPPETEER_WSS_ENDPOINT || '').trim();
const isLocalBrowser = browserWSEndpoint.startsWith('ws://');
if (!browserWSEndpoint || (!browserWSEndpoint.startsWith('wss://') && !isLocalBrowser))
@@ -93,7 +106,7 @@ async function workerPuppeteer(access: BrowseAccessSchema, targetUrl: string, ss

const result: FetchPageWorkerOutputSchema = {
url: targetUrl,
content: '',
content: {},
error: undefined,
stopReason: 'error',
screenshot: undefined,
@@ -117,35 +130,49 @@ async function workerPuppeteer(access: BrowseAccessSchema, targetUrl: string, ss
if (!isWebPage) {
// noinspection ExceptionCaughtLocallyJS
throw new Error(`Invalid content-type: ${contentType}`);
} else
} else {
result.stopReason = 'end';
}
} catch (error: any) {
const isTimeout: boolean = error instanceof TimeoutError;
// This was "error instanceof TimeoutError;" but threw some type error - trying the below instead
const isTimeout = error?.message?.includes('Navigation timeout') || false;
result.stopReason = isTimeout ? 'timeout' : 'error';
if (!isTimeout)
result.error = '[Puppeteer] ' + error?.message || error?.toString() || 'Unknown goto error';
if (!isTimeout) {
result.error = '[Puppeteer] ' + (error?.message || error?.toString() || 'Unknown goto error');
}
}

// transform the content of the page as text
try {
if (result.stopReason !== 'error') {
result.content = await page.evaluate(() => {
const content = document.body.innerText || document.textContent;
if (!content)
throw new Error('No content');
return content;
});
for (const transform of transforms) {
switch (transform) {
case 'html':
result.content.html = cleanHtml(await page.content());
break;
case 'text':
result.content.text = await page.evaluate(() => document.body.innerText || document.textContent || '');
break;
case 'markdown':
const html = await page.content();
const cleanedHtml = cleanHtml(html);
const turndownService = new TurndownService({ headingStyle: 'atx' });
result.content.markdown = turndownService.turndown(cleanedHtml);
break;
}
}
if (!Object.keys(result.content).length)
result.error = '[Puppeteer] Empty content';
}
} catch (error: any) {
result.error = '[Puppeteer] ' + error?.message || error?.toString() || 'Unknown evaluate error';
result.error = '[Puppeteer] ' + (error?.message || error?.toString() || 'Unknown content error');
}

// get a screenshot of the page
try {
if (ssWidth && ssHeight) {
const width = ssWidth;
const height = ssHeight;
const scale = Math.round(100 * ssWidth / 1024) / 100;
if (screenshotOptions?.width && screenshotOptions?.height) {
const { width, height, quality } = screenshotOptions;
const scale = Math.round(100 * width / 1024) / 100;

await page.setViewport({ width: width / scale, height: height / scale, deviceScaleFactor: scale });

@@ -156,10 +183,10 @@ async function workerPuppeteer(access: BrowseAccessSchema, targetUrl: string, ss
type: imageType,
encoding: 'base64',
clip: { x: 0, y: 0, width: width / scale, height: height / scale },
...(ssQuality && { quality: ssQuality }),
...(quality && { quality }),
}) as string;

result.screenshot = { imageDataUrl: `data:${mimeType};base64,${dataString}`, mimeType, width, height };
result.screenshot = { webpDataUrl: `data:${mimeType};base64,${dataString}`, mimeType, width, height };
}
} catch (error: any) {
console.error('workerPuppeteer: page.screenshot', error);
@@ -192,3 +219,35 @@ async function workerPuppeteer(access: BrowseAccessSchema, targetUrl: string, ss

return result;
}


function cleanHtml(html: string) {
const $ = cheerioLoad(html);

// Remove standard unwanted elements
$('script, style, nav, aside, noscript, iframe, svg, canvas, .ads, .comments, link[rel="stylesheet"]').remove();

// Remove elements that might be specific to proxy services or injected by them
$('[id^="brightdata-"], [class^="brightdata-"]').remove();

// Remove comments
$('*').contents().filter(function() {
return this.type === 'comment';
}).remove();

// Remove empty elements
$('p, div, span').each(function() {
if ($(this).text().trim() === '' && $(this).children().length === 0) {
$(this).remove();
}
});

// Merge consecutive paragraphs
$('p + p').each(function() {
$(this).prev().append(' ' + $(this).text());
$(this).remove();
});

// Return the cleaned HTML
return $.html();
}

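The markdown transform composes the two libraries imported at the top of this file. A condensed, standalone sketch of the same pipeline (the selector list is abbreviated):

```ts
import { load as cheerioLoad } from 'cheerio';
import { default as TurndownService } from 'turndown';

function htmlToMarkdown(html: string): string {
  // prune non-content elements first, so turndown only sees the page body
  const $ = cheerioLoad(html);
  $('script, style, nav, aside, noscript, iframe, svg, canvas').remove();
  // atx headings ('# Title') match what the server-side transform produces
  const turndown = new TurndownService({ headingStyle: 'atx' });
  return turndown.turndown($.html());
}
```
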
@@ -5,11 +5,16 @@ import { CapabilityBrowsing } from '~/common/components/useCapabilities';
import { getBackendCapabilities } from '~/modules/backend/store-backend-capabilities';


export type BrowsePageTransform = 'html' | 'text' | 'markdown';

interface BrowseState {

wssEndpoint: string;
setWssEndpoint: (url: string) => void;

pageTransform: BrowsePageTransform;
setPageTransform: (transform: BrowsePageTransform) => void;

enableCommandBrowse: boolean;
setEnableCommandBrowse: (value: boolean) => void;

@@ -31,6 +36,9 @@ export const useBrowseStore = create<BrowseState>()(
wssEndpoint: '', // default WSS endpoint
setWssEndpoint: (wssEndpoint: string) => set(() => ({ wssEndpoint })),

pageTransform: 'text',
setPageTransform: (pageTransform: BrowsePageTransform) => set(() => ({ pageTransform })),

enableCommandBrowse: true,
setEnableCommandBrowse: (enableCommandBrowse: boolean) => set(() => ({ enableCommandBrowse })),

@@ -2,6 +2,7 @@ import * as React from 'react';

import { FormControl } from '@mui/joy';

import { AlreadySet } from '~/common/components/AlreadySet';
import { FormInputKey } from '~/common/components/forms/FormInputKey';
import { FormLabelStart } from '~/common/components/forms/FormLabelStart';
import { useCapabilityElevenLabs } from '~/common/components/useCapabilities';
@@ -31,7 +32,7 @@ export function ElevenlabsSettings() {

{!isConfiguredServerSide && <FormInputKey
autoCompleteId='elevenlabs-key' label='ElevenLabs API Key'
rightLabel={isConfiguredServerSide ? '✔️ already set in server' : 'required'}
rightLabel={<AlreadySet required={!isConfiguredServerSide} />}
value={apiKey} onChange={setApiKey}
required={!isConfiguredServerSide} isError={!isValidKey}
/>}

@@ -2,10 +2,10 @@ import { sendGAEvent } from '@next/third-parties/google';

import { hasGoogleAnalytics } from '~/common/components/GoogleAnalytics';

import type { ModelDescriptionSchema } from './server/llm.server.types';
import type { GenerateContextNameSchema, ModelDescriptionSchema, StreamingContextNameSchema } from './server/llm.server.types';
import type { OpenAIWire } from './server/openai/openai.wiretypes';
import type { StreamingClientUpdate } from './vendors/unifiedStreamingClient';
import { DLLM, DLLMId, DModelSource, DModelSourceId, LLM_IF_OAI_Chat, useModelsStore } from './store-llms';
import { DLLM, DLLMId, DModelSource, DModelSourceId, LLM_IF_OAI_Chat, LLM_IF_OAI_Fn, useModelsStore } from './store-llms';
import { FALLBACK_LLM_TEMPERATURE } from './vendors/openai/openai.vendor';
import { findAccessForSourceOrThrow, findVendorForLlmOrThrow } from './vendors/vendors.registry';

@@ -21,6 +21,10 @@ export interface VChatMessageIn {

export type VChatFunctionIn = OpenAIWire.ChatCompletion.RequestFunctionDef;

export type VChatStreamContextName = StreamingContextNameSchema;
export type VChatGenerateContextName = GenerateContextNameSchema;
export type VChatContextRef = string;

export interface VChatMessageOut {
role: 'assistant' | 'system' | 'user';
content: string;
@@ -69,7 +73,7 @@ function modelDescriptionToDLLMOpenAIOptions<TSourceSetup, TLLMOptions>(model: M
// null means unknown context/output tokens
const contextTokens = model.contextWindow || null;
const maxOutputTokens = model.maxCompletionTokens || (contextTokens ? Math.round(contextTokens / 2) : null);
const llmResponseTokensRatio = model.maxCompletionTokens ? 1 / 2 : 1 / 4;
const llmResponseTokensRatio = model.maxCompletionTokens ? 1 : 1 / 4;
const llmResponseTokens = maxOutputTokens ? Math.round(maxOutputTokens * llmResponseTokensRatio) : null;

return {
@@ -112,13 +116,20 @@ function modelDescriptionToDLLMOpenAIOptions<TSourceSetup, TLLMOptions>(model: M
export async function llmChatGenerateOrThrow<TSourceSetup = unknown, TAccess = unknown, TLLMOptions = unknown>(
llmId: DLLMId,
messages: VChatMessageIn[],
functions: VChatFunctionIn[] | null, forceFunctionName: string | null,
contextName: VChatGenerateContextName,
contextRef: VChatContextRef | null,
functions: VChatFunctionIn[] | null,
forceFunctionName: string | null,
maxTokens?: number,
): Promise<VChatMessageOut | VChatMessageOrFunctionCallOut> {

// id to DLLM and vendor
const { llm, vendor } = findVendorForLlmOrThrow<TSourceSetup, TAccess, TLLMOptions>(llmId);

// if the model does not support function calling and we're trying to force a function, throw
if (forceFunctionName && !llm.interfaces.includes(LLM_IF_OAI_Fn))
throw new Error(`Model ${llmId} does not support function calling`);

// FIXME: relax the forced cast
const options = llm.options as TLLMOptions;

@@ -132,13 +143,15 @@ export async function llmChatGenerateOrThrow<TSourceSetup = unknown, TAccess = u
await new Promise(resolve => setTimeout(resolve, delay));

// execute via the vendor
return await vendor.rpcChatGenerateOrThrow(access, options, messages, functions, forceFunctionName, maxTokens);
return await vendor.rpcChatGenerateOrThrow(access, options, messages, contextName, contextRef, functions, forceFunctionName, maxTokens);
}


export async function llmStreamingChatGenerate<TSourceSetup = unknown, TAccess = unknown, TLLMOptions = unknown>(
llmId: DLLMId,
messages: VChatMessageIn[],
contextName: VChatStreamContextName,
contextRef: VChatContextRef,
functions: VChatFunctionIn[] | null,
forceFunctionName: string | null,
abortSignal: AbortSignal,
@@ -161,5 +174,5 @@ export async function llmStreamingChatGenerate<TSourceSetup = unknown, TAccess =
await new Promise(resolve => setTimeout(resolve, delay));

// execute via the vendor
return await vendor.streamingChatGenerateOrThrow(access, llmId, llmOptions, messages, functions, forceFunctionName, abortSignal, onUpdate);
return await vendor.streamingChatGenerateOrThrow(access, llmId, llmOptions, messages, contextName, contextRef, functions, forceFunctionName, abortSignal, onUpdate);
}

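One consequential one-character change above: when a model declares `maxCompletionTokens`, the default response budget is now the full declared limit rather than half of it. Worked through with the logic from this hunk:

```ts
// For maxCompletionTokens = 8192: the old default was 8192 * 1/2 = 4096 tokens,
// the new default is 8192 * 1 = 8192. Models without a declared completion limit
// keep the conservative round(contextWindow / 2) * 1/4 fallback.
function defaultResponseTokens(contextWindow: number | null, maxCompletionTokens: number | null): number | null {
  const maxOutputTokens = maxCompletionTokens || (contextWindow ? Math.round(contextWindow / 2) : null);
  const ratio = maxCompletionTokens ? 1 : 1 / 4;
  return maxOutputTokens ? Math.round(maxOutputTokens * ratio) : null;
}
```
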
@@ -1,4 +1,5 @@
import * as React from 'react';
import TimeAgo from 'react-timeago';
import { shallow } from 'zustand/shallow';

import { Box, Button, ButtonGroup, Divider, FormControl, Input, Switch, Tooltip, Typography } from '@mui/joy';
@@ -132,10 +133,10 @@ export function LLMOptionsModal(props: { id: DLLMId, onClose: () => void }) {
llm id: {llm.id}<br />
context tokens: <b>{llm.contextTokens ? llm.contextTokens.toLocaleString() : 'not provided'}</b>{` · `}
max output tokens: <b>{llm.maxOutputTokens ? llm.maxOutputTokens.toLocaleString() : 'not provided'}</b><br />
{!!llm.created && <>created: {(new Date(llm.created * 1000)).toLocaleString()}<br /></>}
{!!llm.created && <>created: <TimeAgo date={new Date(llm.created * 1000)} /><br /></>}
{/*· tags: {llm.tags.join(', ')}*/}
{!!llm.pricing && <>pricing: $<b>{llm.pricing.chatIn || '(unk) '}</b>/M in, $<b>{llm.pricing.chatOut || '(unk) '}</b>/M out<br /></>}
{!!llm.benchmark && <>benchmark: <b>{llm.benchmark.cbaElo?.toLocaleString() || '(unk) '}</b> CBA Elo<br /></>}
{/*{!!llm.benchmark && <>benchmark: <b>{llm.benchmark.cbaElo?.toLocaleString() || '(unk) '}</b> CBA Elo<br /></>}*/}
config: {JSON.stringify(llm.options)}
</Typography>
</Box>}

@@ -4,14 +4,71 @@ import { LLM_IF_OAI_Chat, LLM_IF_OAI_Vision } from '../../store-llms';

const roundTime = (date: string) => Math.round(new Date(date).getTime() / 1000);

export const hardcodedAnthropicModels: ModelDescriptionSchema[] = [
export const hardcodedAnthropicModels: (ModelDescriptionSchema & { isLegacy?: boolean })[] = [
// Claude 3.5 models - https://docs.anthropic.com/en/docs/about-claude/models
// {
// id: 'claude-3.5-opus', // ...
// label: 'Claude 3.5 Opus',
// created: roundTime(?),
// description: ?,
// contextWindow: 200000 ?, // Characters
// maxCompletionTokens: 4096 ?,
// trainingDataCutoff: ?,
// interfaces: [LLM_IF_OAI_Chat, LLM_IF_OAI_Vision],
// pricing: { chatIn: 15, chatOut: 75 },
// benchmark: {
// cbaElo: 1256, // Placeholder
// cbaMmlu: 86.8, // Placeholder
// },
// },
{
id: 'claude-3-5-sonnet-20241022',
label: 'Claude 3.5 Sonnet',
created: roundTime('2024-10-22 06:00'),
description: 'Most intelligent Claude model to date',
contextWindow: 200000, // Characters
maxCompletionTokens: 8192,
trainingDataCutoff: 'Apr 2024',
interfaces: [LLM_IF_OAI_Chat, LLM_IF_OAI_Vision],
pricing: { chatIn: 3, chatOut: 15 },
benchmark: { cbaElo: 1269, cbaMmlu: 88.7 }, // moved from 3.5 Sonnet (Previous Version), TO UPDATE!!
},
{
id: 'claude-3-5-sonnet-20240620',
label: 'Claude 3.5 Sonnet (Previous)',
created: roundTime('2024-06-20 06:00'),
description: 'The most intelligent Claude model',
contextWindow: 200000, // Characters
maxCompletionTokens: 8192,
trainingDataCutoff: 'Apr 2024',
interfaces: [LLM_IF_OAI_Chat, LLM_IF_OAI_Vision],
pricing: { chatIn: 3, chatOut: 15 },
benchmark: { cbaElo: 1269 - 0.1, cbaMmlu: 88.7 - 0.1 },
hidden: true,
},
// {
// id: 'claude-3.5-haiku', // ...
// label: 'Claude 3.5 Haiku',
// created: roundTime(?),
// description: ?,
// contextWindow: 200000 ?, // Characters
// maxCompletionTokens: 4096 ?,
// trainingDataCutoff: ?,
// interfaces: [LLM_IF_OAI_Chat, LLM_IF_OAI_Vision],
// pricing: { chatIn: 0.25, chatOut: 1.25 },
// benchmark: {
// cbaElo: 1181, // Placeholder
// cbaMmlu: 75.2, // Placeholder
// },
// },

// Claude-3 models - https://docs.anthropic.com/claude/docs/models-overview#model-comparison

// Claude 3 models
{
id: 'claude-3-opus-20240229',
label: 'Claude 3 Opus',
created: roundTime('2024-02-29'),
description: 'Most powerful model for highly complex tasks',
description: 'Powerful model for complex tasks',
contextWindow: 200000,
maxCompletionTokens: 4096,
trainingDataCutoff: 'Aug 2023',
@@ -23,19 +80,21 @@ export const hardcodedAnthropicModels: ModelDescriptionSchema[] = [
id: 'claude-3-sonnet-20240229',
label: 'Claude 3 Sonnet',
created: roundTime('2024-02-29'),
description: 'Ideal balance of intelligence and speed for enterprise workloads',
description: 'Balance of speed, cost, and performance',
contextWindow: 200000,
maxCompletionTokens: 4096,
trainingDataCutoff: 'Aug 2023',
interfaces: [LLM_IF_OAI_Chat, LLM_IF_OAI_Vision],
pricing: { chatIn: 3, chatOut: 15 },
benchmark: { cbaElo: 1203, cbaMmlu: 79 },
hidden: true,
isLegacy: true,
},
{
id: 'claude-3-haiku-20240307',
label: 'Claude 3 Haiku',
created: roundTime('2024-03-07'),
description: 'Fastest and most compact model for near-instant responsiveness',
description: 'Fastest, most cost-effective model',
contextWindow: 200000,
maxCompletionTokens: 4096,
trainingDataCutoff: 'Aug 2023',
@@ -55,6 +114,7 @@ export const hardcodedAnthropicModels: ModelDescriptionSchema[] = [
interfaces: [LLM_IF_OAI_Chat],
pricing: { chatIn: 8, chatOut: 24 },
benchmark: { cbaElo: 1119 },
hidden: true,
},
{
id: 'claude-2.0',
@@ -77,25 +137,6 @@ export const hardcodedAnthropicModels: ModelDescriptionSchema[] = [
maxCompletionTokens: 4096,
interfaces: [LLM_IF_OAI_Chat],
pricing: { chatIn: 0.8, chatOut: 2.4 },
},
{
id: 'claude-instant-1.1',
label: 'Claude Instant 1.1',
created: roundTime('2023-03-14'),
description: 'Precise and fast',
contextWindow: 100000,
maxCompletionTokens: 2048,
interfaces: [LLM_IF_OAI_Chat],
hidden: true,
},
{
id: 'claude-1.3',
label: 'Claude 1.3',
created: roundTime('2023-03-14'),
description: 'Claude 1.3 is the latest version of Claude v1',
contextWindow: 100000,
maxCompletionTokens: 4096,
interfaces: [LLM_IF_OAI_Chat],
hidden: true,
},
];

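The `isLegacy` marker is additive on the schema type, so existing consumers compile unchanged. A hypothetical consumer (not part of this patch) could partition the list like this:

```ts
import { hardcodedAnthropicModels } from './anthropic.models';

// isLegacy is optional, so models predating the flag are treated as current
const currentModels = hardcodedAnthropicModels.filter(model => !model.isLegacy);
const legacyModels = hardcodedAnthropicModels.filter(model => model.isLegacy === true);
```
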
@@ -8,7 +8,7 @@ import { fetchJsonOrTRPCError } from '~/server/api/trpc.router.fetchers';
import { fixupHost } from '~/common/util/urlUtils';

import { OpenAIHistorySchema, openAIHistorySchema, OpenAIModelSchema, openAIModelSchema } from '../openai/openai.router';
import { llmsChatGenerateOutputSchema, llmsListModelsOutputSchema } from '../llm.server.types';
import { llmsChatGenerateOutputSchema, llmsGenerateContextSchema, llmsListModelsOutputSchema } from '../llm.server.types';

import { AnthropicWireMessagesRequest, anthropicWireMessagesRequestSchema, AnthropicWireMessagesResponse, anthropicWireMessagesResponseSchema } from './anthropic.wiretypes';
import { hardcodedAnthropicModels } from './anthropic.models';
@@ -17,7 +9,
// Default hosts
const DEFAULT_API_VERSION_HEADERS = {
'anthropic-version': '2023-06-01',
'anthropic-beta': 'messages-2023-12-15',
// Former Betas:
// - messages-2023-12-15: to use the Messages API
'anthropic-beta': 'max-tokens-3-5-sonnet-2024-07-15',
};
const DEFAULT_MAX_TOKENS = 2048;
const DEFAULT_ANTHROPIC_HOST = 'api.anthropic.com';
@@ -158,7 +160,11 @@ const listModelsInputSchema = z.object({

const chatGenerateInputSchema = z.object({
access: anthropicAccessSchema,
model: openAIModelSchema, history: openAIHistorySchema,
model: openAIModelSchema,
history: openAIHistorySchema,
// functions: openAIFunctionsSchema.optional(),
// forceFunctionName: z.string().optional(),
context: llmsGenerateContextSchema.optional(),
});

@@ -1,16 +1,166 @@
import type { GeminiModelSchema } from './gemini.wiretypes';
import type { ModelDescriptionSchema } from '../llm.server.types';
import { LLM_IF_OAI_Chat, LLM_IF_OAI_Vision } from '../../store-llms';
import { LLM_IF_OAI_Chat, LLM_IF_OAI_Json, LLM_IF_OAI_Vision } from '../../store-llms';


// dev options
const DEV_DEBUG_GEMINI_MODELS = false;


// supported interfaces
const geminiChatInterfaces: GeminiModelSchema['supportedGenerationMethods'] = ['generateContent'];

// unsupported interfaces
const filterUnallowedNames = ['Legacy'];
const filterUnallowedInterfaces: GeminiModelSchema['supportedGenerationMethods'] = ['generateAnswer', 'embedContent', 'embedText'];

const geminiLinkModels = ['models/gemini-pro', 'models/gemini-pro-vision'];

// interfaces mapping
const geminiChatInterfaces: GeminiModelSchema['supportedGenerationMethods'] = ['generateContent'];
const geminiVisionNames = ['-vision'];
/* Manual models details
Gemini Name Mapping example:
- Latest version gemini-1.0-pro-latest <model>-<generation>-<variation>-latest
- Latest stable version gemini-1.0-pro <model>-<generation>-<variation>
- Stable versions gemini-1.0-pro-001 <model>-<generation>-<variation>-<version>
*/
const _knownGeminiModels: ({
id: string,
isNewest?: boolean,
isPreview?: boolean
symLink?: string
} & Pick<ModelDescriptionSchema, 'interfaces' | 'pricing' | 'trainingDataCutoff' | 'hidden'>)[] = [

// Generation 1.5
{
id: 'models/gemini-1.5-flash-latest', // updated regularly and might be a preview version
isNewest: true,
isPreview: true,
pricing: {
chatIn: 0.70, // 0.35 up to 128k tokens, 0.70 prompts > 128k tokens
chatOut: 2.10, // 1.05 up to 128k tokens, 2.10 prompts > 128k tokens
},
trainingDataCutoff: 'May 2024',
interfaces: [LLM_IF_OAI_Chat, LLM_IF_OAI_Vision, LLM_IF_OAI_Json], // input: audio, images and text
},
{
id: 'models/gemini-1.5-flash',
// copied from above
pricing: {
chatIn: 0.70, // 0.35 up to 128k tokens, 0.70 prompts > 128k tokens
chatOut: 2.10, // 1.05 up to 128k tokens, 2.10 prompts > 128k tokens
},
trainingDataCutoff: 'Apr 2024',
interfaces: [LLM_IF_OAI_Chat, LLM_IF_OAI_Vision, LLM_IF_OAI_Json], // input: audio, images and text
hidden: true,
},
{
id: 'models/gemini-1.5-flash-001',
// copied from above
pricing: {
chatIn: 0.70, // 0.35 up to 128k tokens, 0.70 prompts > 128k tokens
chatOut: 2.10, // 1.05 up to 128k tokens, 2.10 prompts > 128k tokens
},
trainingDataCutoff: 'Apr 2024',
interfaces: [LLM_IF_OAI_Chat, LLM_IF_OAI_Vision, LLM_IF_OAI_Json], // input: audio, images and text
hidden: true,
},

{
id: 'models/gemini-1.5-pro-latest', // updated regularly and might be a preview version
isNewest: true,
isPreview: true,
pricing: {
chatIn: 7.00, // $3.50 / 1 million tokens (for prompts up to 128K tokens), $7.00 / 1 million tokens (for prompts longer than 128K)
chatOut: 21.00, // $10.50 / 1 million tokens (128K or less), $21.00 / 1 million tokens (128K+)
},
trainingDataCutoff: 'May 2024',
interfaces: [LLM_IF_OAI_Chat, LLM_IF_OAI_Vision, LLM_IF_OAI_Json], // input: audio, images and text
},
{
id: 'models/gemini-1.5-pro', // latest stable -> 001
// copied from above
pricing: {
chatIn: 7.00, // $3.50 / 1 million tokens (for prompts up to 128K tokens), $7.00 / 1 million tokens (for prompts longer than 128K)
chatOut: 21.00, // $10.50 / 1 million tokens (128K or less), $21.00 / 1 million tokens (128K+)
},
trainingDataCutoff: 'Apr 2024',
interfaces: [LLM_IF_OAI_Chat, LLM_IF_OAI_Vision, LLM_IF_OAI_Json],
hidden: true,
},
{
id: 'models/gemini-1.5-pro-001', // stable snapshot
// copied from above
pricing: {
chatIn: 7.00, // $3.50 / 1 million tokens (for prompts up to 128K tokens), $7.00 / 1 million tokens (for prompts longer than 128K)
chatOut: 21.00, // $10.50 / 1 million tokens (128K or less), $21.00 / 1 million tokens (128K+)
},
trainingDataCutoff: 'Apr 2024',
interfaces: [LLM_IF_OAI_Chat, LLM_IF_OAI_Vision, LLM_IF_OAI_Json],
hidden: true,
},


// Generation 1.0
{
id: 'models/gemini-1.0-pro-latest',
pricing: {
chatIn: 0.50,
chatOut: 1.50,
},
interfaces: [LLM_IF_OAI_Chat],
},
{
id: 'models/gemini-1.0-pro',
pricing: {
chatIn: 0.50,
chatOut: 1.50,
},
interfaces: [LLM_IF_OAI_Chat],
hidden: true,
},
{
id: 'models/gemini-1.0-pro-001',
pricing: {
chatIn: 0.50,
chatOut: 1.50,
},
interfaces: [LLM_IF_OAI_Chat],
hidden: true,
},

// Generation 1.0 + Vision
{
id: 'models/gemini-1.0-pro-vision-latest',
pricing: {
chatIn: 0.50,
chatOut: 1.50,
},
interfaces: [LLM_IF_OAI_Chat, LLM_IF_OAI_Vision], // Text and Images
hidden: true,
},

// Older symlinks
{
id: 'models/gemini-pro',
symLink: 'models/gemini-1.0-pro',
// copied from symlinked
pricing: {
chatIn: 0.50,
chatOut: 1.50,
},
interfaces: [LLM_IF_OAI_Chat],
hidden: true,
},
{
id: 'models/gemini-pro-vision',
// copied from symlinked
symLink: 'models/gemini-1.0-pro-vision',
pricing: {
chatIn: 0.50,
chatOut: 1.50,
},
interfaces: [LLM_IF_OAI_Chat, LLM_IF_OAI_Vision], // Text and Images
hidden: true,
},
];

export function geminiFilterModels(geminiModel: GeminiModelSchema): boolean {
@@ -26,17 +176,23 @@ export function geminiSortModels(a: ModelDescriptionSchema, b: ModelDescriptionS
return b.label.localeCompare(a.label);
}

export function geminiModelToModelDescription(geminiModel: GeminiModelSchema, allModels: GeminiModelSchema[]): ModelDescriptionSchema {
export function geminiModelToModelDescription(geminiModel: GeminiModelSchema): ModelDescriptionSchema {
const { description, displayName, name: modelId, supportedGenerationMethods } = geminiModel;

if (DEV_DEBUG_GEMINI_MODELS)
console.log('geminiModelToModelDescription', geminiModel);

// find known manual mapping
const knownModel = _knownGeminiModels.find(m => m.id === modelId);

// handle symlinks
const isSymlink = geminiLinkModels.includes(modelId);
const symlinked = isSymlink ? allModels.find(m => m.displayName === displayName && m.name !== modelId) : null;
const label = isSymlink ? `🔗 ${displayName.replace('1.0', '')} → ${symlinked ? symlinked.name : '?'}` : displayName;
const label = knownModel?.symLink
? `🔗 ${displayName.replace('1.0', '')} → ${knownModel.symLink}`
: displayName;

// handle hidden models
const hasChatInterfaces = supportedGenerationMethods.some(iface => geminiChatInterfaces.includes(iface));
const hidden = isSymlink || !hasChatInterfaces;
const hidden = knownModel?.hidden || !!knownModel?.symLink || !hasChatInterfaces;

// context window
const { inputTokenLimit, outputTokenLimit } = geminiModel;
@@ -46,26 +202,27 @@ export function geminiModelToModelDescription(geminiModel: GeminiModelSchema, al
const { version, topK, topP, temperature } = geminiModel;
const descriptionLong = description + ` (Version: ${version}, Defaults: temperature=${temperature}, topP=${topP}, topK=${topK}, interfaces=[${supportedGenerationMethods.join(',')}])`;

const interfaces: ModelDescriptionSchema['interfaces'] = [];
if (hasChatInterfaces) {
// use known interfaces, or add chat if this is a generateContent model
const interfaces: ModelDescriptionSchema['interfaces'] = knownModel?.interfaces || [];
if (!interfaces.length && hasChatInterfaces) {
interfaces.push(LLM_IF_OAI_Chat);
if (geminiVisionNames.some(name => modelId.includes(name)))
interfaces.push(LLM_IF_OAI_Vision);
// if (geminiVisionNames.some(name => modelId.includes(name)))
// interfaces.push(LLM_IF_OAI_Vision);
}

return {
id: modelId,
label,
label: label, // + (knownModel?.isNewest ? ' 🌟' : ''),
// created: ...
// updated: ...
description: descriptionLong,
contextWindow: contextWindow,
maxCompletionTokens: outputTokenLimit,
// trainingDataCutoff: '...',
trainingDataCutoff: knownModel?.trainingDataCutoff,
interfaces,
// rateLimits: isGeminiPro ? { reqPerMinute: 60 } : undefined,
// benchmarks: ...
// pricing: isGeminiPro ? { needs per-character and per-image pricing } : undefined,
pricing: knownModel?.pricing, // TODO: needs <>128k, and per-character and per-image pricing
hidden,
};
}

@@ -8,7 +8,7 @@ import { createTRPCRouter, publicProcedure } from '~/server/api/trpc.server';
import { fetchJsonOrTRPCError } from '~/server/api/trpc.router.fetchers';

import { fixupHost } from '~/common/util/urlUtils';
import { llmsChatGenerateOutputSchema, llmsListModelsOutputSchema } from '../llm.server.types';
import { llmsChatGenerateOutputSchema, llmsGenerateContextSchema, llmsListModelsOutputSchema } from '../llm.server.types';

import { OpenAIHistorySchema, openAIHistorySchema, OpenAIModelSchema, openAIModelSchema } from '../openai/openai.router';

@@ -120,8 +120,11 @@ const accessOnlySchema = z.object({

const chatGenerateInputSchema = z.object({
access: geminiAccessSchema,
model: openAIModelSchema, history: openAIHistorySchema,
// functions: openAIFunctionsSchema.optional(), forceFunctionName: z.string().optional(),
model: openAIModelSchema,
history: openAIHistorySchema,
// functions: openAIFunctionsSchema.optional(),
// forceFunctionName: z.string().optional(),
context: llmsGenerateContextSchema.optional(),
});

@@ -147,7 +150,7 @@ export const llmGeminiRouter = createTRPCRouter({
// map to our output schema
const models = detailedModels
.filter(geminiFilterModels)
.map(geminiModel => geminiModelToModelDescription(geminiModel, detailedModels))
.map(geminiModel => geminiModelToModelDescription(geminiModel))
.sort(geminiSortModels);

return {

@@ -9,6 +9,21 @@ export const geminiModelsStreamGenerateContentPath = '/v1beta/{model=models/*}:s

// models.list = /v1beta/models
const Methods_enum = z.enum([
'bidiGenerateContent', // appeared on 2024-12, see https://github.com/enricoros/big-AGI/issues/700
'createCachedContent', // appeared on 2024-06-10, see https://github.com/enricoros/big-AGI/issues/565
'countMessageTokens',
'countTextTokens',
'countTokens',
'createTunedModel',
'createTunedTextModel',
'embedContent',
'embedText',
'generateAnswer',
'generateContent',
'generateMessage',
'generateText',
]);

const geminiModelSchema = z.object({
name: z.string(),
@@ -17,19 +32,7 @@ const geminiModelSchema = z.object({
description: z.string(),
inputTokenLimit: z.number().int().min(1),
outputTokenLimit: z.number().int().min(1),
supportedGenerationMethods: z.array(z.enum([
'countMessageTokens',
'countTextTokens',
'countTokens',
'createTunedModel',
'createTunedTextModel',
'embedContent',
'embedText',
'generateAnswer',
'generateContent',
'generateMessage',
'generateText',
])),
supportedGenerationMethods: z.array(z.union([Methods_enum, z.string()])), // relaxed with z.union to not break on expansion
temperature: z.number().optional(),
topP: z.number().optional(),
topK: z.number().optional(),

@@ -171,7 +174,7 @@ export const geminiGeneratedContentResponseSchema = z.object({
|
||||
// either all requested candidates are returned or no candidates at all
|
||||
// no candidates are returned only if there was something wrong with the prompt (see promptFeedback)
|
||||
candidates: z.array(z.object({
|
||||
index: z.number(),
|
||||
index: z.number().optional(),
|
||||
content: geminiContentSchema.optional(), // this can be missing if the finishReason is not 'MAX_TOKENS'
|
||||
finishReason: geminiFinishReasonSchema.optional(),
|
||||
safetyRatings: z.array(geminiSafetyRatingSchema).optional(), // undefined when finishReason is 'RECITATION'
|
||||
|
||||
@@ -19,7 +19,10 @@ import { OLLAMA_PATH_CHAT, ollamaAccess, ollamaAccessSchema, ollamaChatCompletio

// OpenAI server imports
import type { OpenAIWire } from './openai/openai.wiretypes';
import { openAIAccess, openAIAccessSchema, openAIChatCompletionPayload, OpenAIHistorySchema, openAIHistorySchema, OpenAIModelSchema, openAIModelSchema } from './openai/openai.router';
import { openAIAccess, openAIAccessSchema, openAIChatCompletionPayload, openAIHistorySchema, openAIModelSchema } from './openai/openai.router';


import { llmsStreamingContextSchema } from './llm.server.types';


// configuration
@@ -51,6 +54,9 @@ const chatStreamingInputSchema = z.object({
  access: z.union([anthropicAccessSchema, geminiAccessSchema, ollamaAccessSchema, openAIAccessSchema]),
  model: openAIModelSchema,
  history: openAIHistorySchema,
  // NOTE: made it optional for now as we have some old requests without it
  // 2024-07-07: remove .optional()
  context: llmsStreamingContextSchema.optional(),
});
export type ChatStreamingInputSchema = z.infer<typeof chatStreamingInputSchema>;

@@ -72,14 +78,15 @@ export async function llmStreamingRelayHandler(req: NextRequest): Promise<Respon

  // Parse the request
  const body = await req.json();
  const { access, model, history } = chatStreamingInputSchema.parse(body);
  const prettyDialect = serverCapitalizeFirstLetter(access.dialect);
  const _chatStreamingInput: ChatStreamingInputSchema = chatStreamingInputSchema.parse(body);
  const { dialect: accessDialect } = _chatStreamingInput.access;
  const prettyDialect = serverCapitalizeFirstLetter(accessDialect);


  // Prepare the upstream API request and demuxer/parser
  let requestData: ReturnType<typeof _prepareRequestData>;
  try {
    requestData = _prepareRequestData(access, model, history);
    requestData = _prepareRequestData(_chatStreamingInput);
  } catch (error: any) {
    console.error(`[POST] /api/llms/stream: ${prettyDialect}: prepareRequestData issue:`, safeErrorString(error));
    return new NextResponse(`**[Service Issue] ${prettyDialect}**: ${safeErrorString(error) || 'Unknown streaming error'}`, {
@@ -103,7 +110,7 @@ export async function llmStreamingRelayHandler(req: NextRequest): Promise<Respon
  } catch (error: any) {

    // server-side admins message
    const capDialect = serverCapitalizeFirstLetter(access.dialect);
    const capDialect = serverCapitalizeFirstLetter(accessDialect);
    const fetchOrVendorError = safeErrorString(error) + (error?.cause ? ' · ' + JSON.stringify(error.cause) : '');
    console.error(`[POST] /api/llms/stream: ${capDialect}: fetch issue:`, fetchOrVendorError, requestData?.url);

@@ -125,7 +132,7 @@ export async function llmStreamingRelayHandler(req: NextRequest): Promise<Respon
   * a 'healthy' level of inventory (i.e., pre-buffering) on the pipe to the client.
   */
  const transformUpstreamToBigAgiClient = createUpstreamTransformer(
    requestData.vendorMuxingFormat, requestData.vendorStreamParser, access.dialect,
    requestData.vendorMuxingFormat, requestData.vendorStreamParser, accessDialect,
  );

  const chatResponseStream =
@@ -486,7 +493,7 @@ function createStreamParserOpenAI(): AIStreamParser {
}


function _prepareRequestData(access: ChatStreamingInputSchema['access'], model: OpenAIModelSchema, history: OpenAIHistorySchema): {
function _prepareRequestData({ access, model, history, context: _context }: ChatStreamingInputSchema): {
  headers: HeadersInit;
  url: string;
  body: object;

@@ -12,6 +12,8 @@ const pricingSchema = z.object({
const benchmarkSchema = z.object({
  cbaElo: z.number().optional(),
  cbaMmlu: z.number().optional(),
  heCode: z.number().optional(), // HumanEval, code, 0-shot
  vqaMmmu: z.number().optional(), // Visual Question Answering, MMMU, 0-shot
});

// const rateLimitsSchema = z.object({
@@ -46,6 +48,25 @@ export const llmsListModelsOutputSchema = z.object({
});


// Chat Generation Input (some parts of)

const generateContextNameSchema = z.enum(['chat-ai-title', 'chat-ai-summarize', 'chat-followup-diagram', 'chat-react-turn', 'draw-expand-prompt']);
export type GenerateContextNameSchema = z.infer<typeof generateContextNameSchema>;
export const llmsGenerateContextSchema = z.object({
  method: z.literal('chat-generate'),
  name: generateContextNameSchema,
  ref: z.string(),
});

const streamingContextNameSchema = z.enum(['conversation', 'ai-diagram', 'ai-flattener', 'call', 'beam-scatter', 'beam-gather', 'persona-extract']);
export type StreamingContextNameSchema = z.infer<typeof streamingContextNameSchema>;
export const llmsStreamingContextSchema = z.object({
  method: z.literal('chat-stream'),
  name: streamingContextNameSchema,
  ref: z.string(),
});


// (non-streaming) Chat Generation Output

export const llmsChatGenerateOutputSchema = z.object({

@@ -11,7 +11,7 @@ import { capitalizeFirstLetter } from '~/common/util/textUtils';
import { fixupHost } from '~/common/util/urlUtils';

import { OpenAIHistorySchema, openAIHistorySchema, OpenAIModelSchema, openAIModelSchema } from '../openai/openai.router';
import { llmsChatGenerateOutputSchema, llmsListModelsOutputSchema, ModelDescriptionSchema } from '../llm.server.types';
import { llmsChatGenerateOutputSchema, llmsGenerateContextSchema, llmsListModelsOutputSchema, ModelDescriptionSchema } from '../llm.server.types';

import { OLLAMA_BASE_MODELS, OLLAMA_PREV_UPDATE } from './ollama.models';
import { WireOllamaChatCompletionInput, wireOllamaChunkedOutputSchema, wireOllamaListModelsSchema, wireOllamaModelInfoSchema } from './ollama.wiretypes';
@@ -117,8 +117,11 @@ const adminPullModelSchema = z.object({

const chatGenerateInputSchema = z.object({
  access: ollamaAccessSchema,
  model: openAIModelSchema, history: openAIHistorySchema,
  // functions: openAIFunctionsSchema.optional(), forceFunctionName: z.string().optional(),
  model: openAIModelSchema,
  history: openAIHistorySchema,
  // functions: openAIFunctionsSchema.optional(),
  // forceFunctionName: z.string().optional(),
  context: llmsGenerateContextSchema.optional(),
});

const listPullableOutputSchema = z.object({

@@ -0,0 +1,84 @@
// here for reference only - for future mapping of CBA scores to the model IDs
// const modelIdToPrefixMap: { [key: string]: string } = {
//   // Anthropic models
//   'Claude 3.5 Sonnet': 'claude-3-5-sonnet-20240620',
//   'Claude 3 Opus': 'claude-3-opus-20240229',
//   'Claude 3 Sonnet': 'claude-3-sonnet-20240229',
//   'Claude 3 Haiku': 'claude-3-haiku-20240307',
//   'Claude-2.1': 'claude-2.1',
//   'Claude-2.0': 'claude-2.0',
//   'Claude-1': '', // No exact match
//   'Claude-Instant-1': 'claude-instant-1.2', // Closest match
//
//   // Gemini models
//   'Gemini-1.5-Pro-Exp-0801': 'models/gemini-1.5-pro-latest', // Closest match
//   'Gemini Advanced App (2024-05-14)': '', // No exact match
//   'Gemini-1.5-Pro-001': 'models/gemini-1.5-pro-001',
//   'Gemini-1.5-Pro-Preview-0409': 'models/gemini-1.5-pro-latest', // Closest match
//   'Gemini-1.5-Flash-001': 'models/gemini-1.5-flash-001',
//   'Gemini App (2024-01-24)': '', // No exact match
//   'Gemini-1.0-Pro-001': 'models/gemini-1.0-pro-001',
//   'Gemini Pro': 'models/gemini-pro',
//
//   // OpenAI models (from the previous file)
//   'GPT-4o-2024-05-13': 'gpt-4o-2024-05-13',
//   'GPT-4o-mini-2024-07-18': 'gpt-4o-mini-2024-07-18',
//   'GPT-4-Turbo-2024-04-09': 'gpt-4-turbo-2024-04-09',
//   'GPT-4-1106-preview': 'gpt-4-1106-preview',
//   'GPT-4-0125-preview': 'gpt-4-0125-preview',
//   'GPT-4-0314': 'gpt-4-0314',
//   'GPT-4-0613': 'gpt-4-0613',
//   'GPT-3.5-Turbo-0613': 'gpt-3.5-turbo-0613',
//   'GPT-3.5-Turbo-0314': 'gpt-3.5-turbo-0314',
//   'GPT-3.5-Turbo-0125': 'gpt-3.5-turbo-0125',
//
//   // Mistral models (from the previous file)
//   'Mistral-Large-2402': 'mistral-large-2402',
//   'Mixtral-8x7b-Instruct-v0.1': 'mistralai/Mixtral-8x7B-Instruct-v0.1',
//
//   // Other models without matches
//   'Gemini-1.5-Pro-Exp-0801': '',
//   'Meta-Llama-3.1-405b-Instruct': '',
//   'Gemini-1.5-Pro-001': '',
//   'Meta-Llama-3.1-70b-Instruct': '',
//   'Yi-Large-preview': '',
//   'Deepseek-v2-API-0628': '',
//   'Gemma-2-27b-it': '',
//   'Yi-Large': '',
//   'Nemotron-4-340B-Instruct': '',
//   'GLM-4-0520': '',
//   'Llama-3-70b-Instruct': '',
//   'Reka-Core-20240501': '',
//   'Command R+': '',
//   'Gemma-2-9b-it': '',
//   'Qwen2-72B-Instruct': '',
//   'GLM-4-0116': '',
//   'Qwen-Max-0428': '',
//   'DeepSeek-Coder-V2-Instruct': '',
//   'Reka-Flash-Preview-20240611': '',
//   'Meta-Llama-3.1-8b-Instruct': '',
//   'Qwen1.5-110B-Chat': '',
//   'Yi-1.5-34B-Chat': '',
//   'Reka-Flash-21B-online': '',
//   'Llama-3-8b-Instruct': '',
//   'Command R': '',
//   'Reka-Flash-21B': '',
//   'Qwen1.5-72B-Chat': '',
//   'Mixtral-8x22b-Instruct-v0.1': '',
//   'Zephyr-ORPO-141b-A35b-v0.1': '',
//   'Qwen1.5-32B-Chat': '',
//   'Mistral-Next': '',
//   'Phi-3-Medium-4k-Instruct': '',
//   'Starling-LM-7B-beta': '',
//   'Yi-34B-Chat': '',
//   'Qwen1.5-14B-Chat': '',
//   'WizardLM-70B-v1.0': '',
//   'Tulu-2-DPO-70B': '',
//   'DBRX-Instruct-Preview': '',
//   'Phi-3-Small-8k-Instruct': '',
//   'Llama-2-70b-chat': '',
//   'OpenChat-3.5-0106': '',
//   'Vicuna-33B': '',
//   'Snowflake Arctic Instruct': '',
//   'Starling-LM-7B-alpha': '',
// };
@@ -9,34 +9,139 @@ import { wireTogetherAIListOutputSchema } from './togetherai.wiretypes';


// [Azure] / [OpenAI]
// https://platform.openai.com/docs/models
const _knownOpenAIChatModels: ManualMappings = [

  // GPT-4o -> 2024-05-13
  // GPT-4o -> 2024-05-13 (Starting October 2nd, 2024, gpt-4o will point to the gpt-4o-2024-08-06 snapshot)
  {
    idPrefix: 'gpt-4o',
    label: 'GPT-4o',
    description: 'Currently points to gpt-4o-2024-05-13.',
    symLink: 'gpt-4o-2024-05-13',
    description: 'Points to gpt-4o-2024-08-06 starting on Oct 2, 2024.',
    symLink: 'gpt-4o-2024-08-06',
    hidden: true,
    // copied from symlinked
    contextWindow: 128000,
    maxCompletionTokens: 4096,
    maxCompletionTokens: 16384,
    trainingDataCutoff: 'Oct 2023',
    interfaces: [LLM_IF_OAI_Chat, LLM_IF_OAI_Vision, LLM_IF_OAI_Fn, LLM_IF_OAI_Json],
    pricing: { chatIn: 5, chatOut: 15 },
    benchmark: { cbaElo: 1310 },
    pricing: { chatIn: 2.5, chatOut: 10 },
    benchmark: { cbaElo: 1286 + 1 },
  },
  {
    isLatest: true,
    idPrefix: 'gpt-4o-2024-08-06',
    label: 'GPT-4o (2024-08-06)',
    description: 'Latest snapshot that supports Structured Outputs',
    contextWindow: 128000,
    maxCompletionTokens: 16384,
    trainingDataCutoff: 'Oct 2023',
    interfaces: [LLM_IF_OAI_Chat, LLM_IF_OAI_Vision, LLM_IF_OAI_Fn, LLM_IF_OAI_Json], // + Structured Outputs?
    pricing: { chatIn: 2.5, chatOut: 10 },
    benchmark: { cbaElo: 1286 + 1 },
  },
  {
    idPrefix: 'gpt-4o-2024-05-13',
    label: 'GPT-4o (2024-05-13)',
    description: 'Advanced, multimodal flagship model that’s cheaper and faster than GPT-4 Turbo.',
    description: 'Advanced, multimodal flagship model that\'s cheaper and faster than GPT-4 Turbo.',
    contextWindow: 128000,
    maxCompletionTokens: 4096,
    trainingDataCutoff: 'Oct 2023',
    interfaces: [LLM_IF_OAI_Chat, LLM_IF_OAI_Vision, LLM_IF_OAI_Fn, LLM_IF_OAI_Json],
    pricing: { chatIn: 5, chatOut: 15 },
    benchmark: { cbaElo: 1310 },
    benchmark: { cbaElo: 1286 },
    hidden: true,
  },
  {
    idPrefix: 'chatgpt-4o-latest',
    label: 'ChatGPT-4o Latest',
    description: 'Intended for research and evaluation. Dynamic model continuously updated to the current version of GPT-4o in ChatGPT.',
    contextWindow: 128000,
    maxCompletionTokens: 16384,
    trainingDataCutoff: 'Oct 2023',
    interfaces: [LLM_IF_OAI_Chat, LLM_IF_OAI_Vision, LLM_IF_OAI_Fn, LLM_IF_OAI_Json],
    pricing: { chatIn: 5, chatOut: 15 },
  },

  // GPT-4o mini
  {
    idPrefix: 'gpt-4o-mini',
    label: 'GPT-4o mini',
    description: 'Currently points to gpt-4o-mini-2024-07-18.',
    symLink: 'gpt-4o-mini-2024-07-18',
    hidden: true,
    // copied from symlinked
    contextWindow: 128000,
    maxCompletionTokens: 16384,
    trainingDataCutoff: 'Oct 2023',
    interfaces: [LLM_IF_OAI_Chat, LLM_IF_OAI_Vision, LLM_IF_OAI_Fn, LLM_IF_OAI_Json],
    pricing: { chatIn: 0.15, chatOut: 0.60 },
    benchmark: { cbaElo: 1277, cbaMmlu: 82.0 },
  },
  {
    idPrefix: 'gpt-4o-mini-2024-07-18',
    label: 'GPT-4o Mini (2024-07-18)',
    description: 'Affordable model for fast, lightweight tasks. GPT-4o mini is cheaper and more capable than GPT-3.5 Turbo.',
    contextWindow: 128000,
    maxCompletionTokens: 16384,
    trainingDataCutoff: 'Oct 2023',
    interfaces: [LLM_IF_OAI_Chat, LLM_IF_OAI_Vision, LLM_IF_OAI_Fn, LLM_IF_OAI_Json],
    pricing: { chatIn: 0.15, chatOut: 0.60 },
  },

  // o1-preview
  {
    idPrefix: 'o1-preview',
    label: 'o1 Preview',
    description: 'Supported in Big-AGI 2. Points to the most recent snapshot of the o1 model: o1-preview-2024-09-12',
    symLink: 'o1-preview-2024-09-12',
    hidden: true,
    // copied from symlinked
    contextWindow: 128000,
    maxCompletionTokens: 32768,
    trainingDataCutoff: 'Oct 2023',
    interfaces: [LLM_IF_OAI_Chat, LLM_IF_OAI_Vision],
    pricing: { chatIn: 15, chatOut: 60 },
    isPreview: true,
  },
  {
    hidden: true, // we can't support it in Big-AGI 1
    idPrefix: 'o1-preview-2024-09-12',
    label: 'o1 Preview (2024-09-12)',
    description: 'Supported in Big-AGI 2. New reasoning model for complex tasks that require broad general knowledge.',
    contextWindow: 128000,
    maxCompletionTokens: 32768,
    trainingDataCutoff: 'Oct 2023',
    interfaces: [LLM_IF_OAI_Chat, LLM_IF_OAI_Vision],
    pricing: { chatIn: 15, chatOut: 60 },
    isPreview: true,
  },

  // o1-mini
  {
    idPrefix: 'o1-mini',
    label: 'o1 Mini',
    description: 'Supported in Big-AGI 2. Points to the most recent o1-mini snapshot: o1-mini-2024-09-12',
    symLink: 'o1-mini-2024-09-12',
    hidden: true,
    // copied from symlinked
    contextWindow: 128000,
    maxCompletionTokens: 65536,
    trainingDataCutoff: 'Oct 2023',
    interfaces: [LLM_IF_OAI_Chat, LLM_IF_OAI_Vision],
    pricing: { chatIn: 3, chatOut: 12 },
    isPreview: true,
  },
  {
    hidden: true, // we can't support it in Big-AGI 1
    idPrefix: 'o1-mini-2024-09-12',
    label: 'o1 Mini (2024-09-12)',
    description: 'Supported in Big-AGI 2. Fast, cost-efficient reasoning model tailored to coding, math, and science use cases.',
    contextWindow: 128000,
    maxCompletionTokens: 65536,
    trainingDataCutoff: 'Oct 2023',
    interfaces: [LLM_IF_OAI_Chat, LLM_IF_OAI_Vision],
    pricing: { chatIn: 3, chatOut: 12 },
    isPreview: true,
  },

  // GPT4 Turbo with Vision -> 2024-04-09
@@ -52,7 +157,7 @@ const _knownOpenAIChatModels: ManualMappings = [
    trainingDataCutoff: 'Dec 2023',
    interfaces: [LLM_IF_OAI_Chat, LLM_IF_OAI_Vision, LLM_IF_OAI_Fn, LLM_IF_OAI_Json],
    pricing: { chatIn: 10, chatOut: 30 },
    benchmark: { cbaElo: 1261 },
    benchmark: { cbaElo: 1257 },
  },
  {
    idPrefix: 'gpt-4-turbo-2024-04-09',
@@ -63,12 +168,12 @@ const _knownOpenAIChatModels: ManualMappings = [
    trainingDataCutoff: 'Dec 2023',
    interfaces: [LLM_IF_OAI_Chat, LLM_IF_OAI_Vision, LLM_IF_OAI_Fn, LLM_IF_OAI_Json],
    pricing: { chatIn: 10, chatOut: 30 },
    benchmark: { cbaElo: 1261 },
    benchmark: { cbaElo: 1257 },
  },

  // GPT4 Turbo Previews
  {
    idPrefix: 'gpt-4-turbo-preview', // GPT-4 Turbo preview model -> 0125
    idPrefix: 'gpt-4-turbo-preview',
    label: 'GPT-4 Preview Turbo',
    description: 'GPT-4 Turbo preview model. Currently points to gpt-4-0125-preview.',
    symLink: 'gpt-4-0125-preview',
@@ -80,63 +185,33 @@ const _knownOpenAIChatModels: ManualMappings = [
    trainingDataCutoff: 'Dec 2023',
    interfaces: [LLM_IF_OAI_Chat, LLM_IF_OAI_Fn, LLM_IF_OAI_Json],
    pricing: { chatIn: 10, chatOut: 30 },
    benchmark: { cbaElo: 1251 },
    benchmark: { cbaElo: 1245 },
  },
  {
    idPrefix: 'gpt-4-0125-preview', // GPT-4 Turbo preview model
    idPrefix: 'gpt-4-0125-preview',
    label: 'GPT-4 Turbo (0125)',
    description: 'GPT-4 Turbo preview model intended to reduce cases of "laziness" where the model doesn\'t complete a task. Returns a maximum of 4,096 output tokens.',
    isPreview: true,
    description: 'GPT-4 Turbo preview model intended to reduce cases of "laziness" where the model doesn\'t complete a task.',
    contextWindow: 128000,
    maxCompletionTokens: 4096,
    trainingDataCutoff: 'Dec 2023',
    interfaces: [LLM_IF_OAI_Chat, LLM_IF_OAI_Fn, LLM_IF_OAI_Json],
    pricing: { chatIn: 10, chatOut: 30 },
    benchmark: { cbaElo: 1251 },
    benchmark: { cbaElo: 1245 },
    hidden: true,
  },
  {
    idPrefix: 'gpt-4-1106-preview', // GPT-4 Turbo preview model
    label: 'GPT-4 Turbo (1106)',
    description: 'GPT-4 Turbo preview model featuring improved instruction following, JSON mode, reproducible outputs, parallel function calling, and more. Returns a maximum of 4,096 output tokens.',
    isPreview: true,
    description: 'GPT-4 Turbo preview model featuring improved instruction following, JSON mode, reproducible outputs, parallel function calling, and more.',
    contextWindow: 128000,
    maxCompletionTokens: 4096,
    trainingDataCutoff: 'Apr 2023',
    interfaces: [LLM_IF_OAI_Chat, LLM_IF_OAI_Fn],
    interfaces: [LLM_IF_OAI_Chat, LLM_IF_OAI_Fn, LLM_IF_OAI_Json],
    pricing: { chatIn: 10, chatOut: 30 },
    benchmark: { cbaElo: 1255 },
    benchmark: { cbaElo: 1251 },
    hidden: true,
  },

  // GPT4 Vision Previews
  {
    idPrefix: 'gpt-4-vision-preview', // GPT-4 Turbo vision preview
    label: 'GPT-4 Preview Vision',
    description: 'GPT-4 model with the ability to understand images, in addition to all other GPT-4 Turbo capabilities. This is a preview model, we recommend developers to now use gpt-4-turbo which includes vision capabilities. Currently points to gpt-4-1106-vision-preview.',
    symLink: 'gpt-4-1106-vision-preview',
    // copied from symlinked
    isPreview: true,
    contextWindow: 128000,
    maxCompletionTokens: 4096,
    trainingDataCutoff: 'Apr 2023',
    interfaces: [LLM_IF_OAI_Chat, LLM_IF_OAI_Vision, LLM_IF_OAI_Fn],
    pricing: { chatIn: 10, chatOut: 30 },
    hidden: true, // Deprecated in favor of gpt-4-turbo
  },
  {
    idPrefix: 'gpt-4-1106-vision-preview',
    label: 'GPT-4 Preview Vision (1106)',
    description: 'GPT-4 model with the ability to understand images, in addition to all other GPT-4 Turbo capabilities. This is a preview model, we recommend developers to now use gpt-4-turbo which includes vision capabilities. Returns a maximum of 4,096 output tokens.',
    isPreview: true,
    contextWindow: 128000,
    maxCompletionTokens: 4096,
    trainingDataCutoff: 'Apr 2023',
    interfaces: [LLM_IF_OAI_Chat, LLM_IF_OAI_Vision, LLM_IF_OAI_Fn],
    pricing: { chatIn: 10, chatOut: 30 },
    hidden: true, // Deprecated in favor of gpt-4-turbo
  },


  // GPT4-32k's
  {
@@ -182,7 +257,7 @@ const _knownOpenAIChatModels: ManualMappings = [
    trainingDataCutoff: 'Sep 2021',
    interfaces: [LLM_IF_OAI_Chat, LLM_IF_OAI_Fn],
    pricing: { chatIn: 30, chatOut: 60 },
    benchmark: { cbaElo: 1164 },
    benchmark: { cbaElo: 1161 },
  },
  {
    idPrefix: 'gpt-4-0314',
@@ -192,7 +267,7 @@ const _knownOpenAIChatModels: ManualMappings = [
    trainingDataCutoff: 'Sep 2021',
    interfaces: [LLM_IF_OAI_Chat, LLM_IF_OAI_Fn],
    pricing: { chatIn: 30, chatOut: 60 },
    benchmark: { cbaElo: 1189 },
    benchmark: { cbaElo: 1186 },
    hidden: true,
  },
  {
@@ -206,39 +281,27 @@ const _knownOpenAIChatModels: ManualMappings = [
    trainingDataCutoff: 'Sep 2021',
    interfaces: [LLM_IF_OAI_Chat, LLM_IF_OAI_Fn],
    pricing: { chatIn: 30, chatOut: 60 },
    benchmark: { cbaElo: 1164 },
    benchmark: { cbaElo: 1161 },
    isLegacy: true,
  },


  // 3.5-Turbo-Instruct (Not for Chat)
  {
    idPrefix: 'gpt-3.5-turbo-instruct',
    label: '3.5-Turbo Instruct',
    description: 'Similar capabilities as GPT-3 era models. Compatible with legacy Completions endpoint and not Chat Completions.',
    contextWindow: 4097,
    trainingDataCutoff: 'Sep 2021',
    interfaces: [/* NO: LLM_IF_OAI_Chat,*/ LLM_IF_OAI_Complete],
    pricing: { chatIn: 1.5, chatOut: 2 },
    hidden: true,
  },


  // 3.5-Turbo-16k's
  // 3.5-Turbo
  // As of July 2024, gpt-4o-mini should be used in place of gpt-3.5-turbo, as it is cheaper, more capable, multimodal, and just as fast.
  {
    idPrefix: 'gpt-3.5-turbo-0125',
    label: '3.5-Turbo (0125)',
    description: 'The latest GPT-3.5 Turbo model with higher accuracy at responding in requested formats and a fix for a bug which caused a text encoding issue for non-English language function calls. Returns a maximum of 4,096 output tokens.',
    description: 'The latest GPT-3.5 Turbo model with higher accuracy at responding in requested formats and a fix for a bug which caused a text encoding issue for non-English language function calls.',
    contextWindow: 16385,
    maxCompletionTokens: 4096,
    trainingDataCutoff: 'Sep 2021',
    interfaces: [LLM_IF_OAI_Chat, LLM_IF_OAI_Fn],
    pricing: { chatIn: 0.5, chatOut: 1.5 },
    benchmark: { cbaElo: 1104 },
    benchmark: { cbaElo: 1105 },
  },
  {
    idPrefix: 'gpt-3.5-turbo-1106',
    label: '3.5-Turbo (1106)',
    description: 'The latest GPT-3.5 Turbo model with improved instruction following, JSON mode, reproducible outputs, parallel function calling, and more.',
    description: 'GPT-3.5 Turbo model with improved instruction following, JSON mode, reproducible outputs, parallel function calling, and more.',
    contextWindow: 16385,
    maxCompletionTokens: 4096,
    trainingDataCutoff: 'Sep 2021',
@@ -250,7 +313,7 @@ const _knownOpenAIChatModels: ManualMappings = [
  {
    idPrefix: 'gpt-3.5-turbo',
    label: '3.5-Turbo',
    description: 'Currently points to gpt-3.5-turbo-0125.',
    description: 'Currently points to gpt-3.5-turbo-0125. As of July 2024, gpt-4o-mini should be used in place of gpt-3.5-turbo, as it is cheaper, more capable, multimodal, and just as fast.',
    symLink: 'gpt-3.5-turbo-0125',
    hidden: true,
    // copied
@@ -259,7 +322,19 @@ const _knownOpenAIChatModels: ManualMappings = [
    trainingDataCutoff: 'Sep 2021',
    interfaces: [LLM_IF_OAI_Chat, LLM_IF_OAI_Fn],
    pricing: { chatIn: 0.5, chatOut: 1.5 },
    benchmark: { cbaElo: 1104 },
    benchmark: { cbaElo: 1105 },
  },

  // 3.5-Turbo-Instruct (Not for Chat)
  {
    idPrefix: 'gpt-3.5-turbo-instruct',
    label: '3.5-Turbo Instruct',
    description: 'Similar capabilities as GPT-3 era models. Compatible with legacy Completions endpoint and not Chat Completions.',
    contextWindow: 4097,
    trainingDataCutoff: 'Sep 2021',
    interfaces: [/* NO: LLM_IF_OAI_Chat,*/ LLM_IF_OAI_Complete],
    pricing: { chatIn: 1.5, chatOut: 2 },
    hidden: true,
  },


@@ -376,8 +451,31 @@ export function localAIModelToModelDescription(modelId: string): ModelDescriptio


// [Mistral]
// updated from the models on: https://docs.mistral.ai/getting-started/models/
// and the pricing available on: https://mistral.ai/technology/#pricing

const _knownMistralChatModels: ManualMappings = [
  // Codestral
  {
    idPrefix: 'codestral-2405',
    label: 'Codestral (2405)',
    description: 'Designed and optimized for code generation tasks.',
    contextWindow: 32768,
    interfaces: [LLM_IF_OAI_Chat],
    pricing: { chatIn: 1, chatOut: 3 },
  },
  {
    idPrefix: 'codestral-latest',
    label: 'Mistral Large (latest)',
    symLink: 'mistral-codestral-2405',
    hidden: true,
    // copied
    description: 'Designed and optimized for code generation tasks.',
    contextWindow: 32768,
    interfaces: [LLM_IF_OAI_Chat],
    pricing: { chatIn: 1, chatOut: 3 },
  },

  // Large
  {
    idPrefix: 'mistral-large-2402',
@@ -385,7 +483,7 @@ const _knownMistralChatModels: ManualMappings = [
    description: 'Top-tier reasoning for high-complexity tasks.',
    contextWindow: 32768,
    interfaces: [LLM_IF_OAI_Chat, LLM_IF_OAI_Fn],
    pricing: { chatIn: 8, chatOut: 24 },
    pricing: { chatIn: 4, chatOut: 12 },
    benchmark: { cbaElo: 1159 },
  },
  {
@@ -397,103 +495,135 @@ const _knownMistralChatModels: ManualMappings = [
    description: 'Top-tier reasoning for high-complexity tasks.',
    contextWindow: 32768,
    interfaces: [LLM_IF_OAI_Chat, LLM_IF_OAI_Fn],
    pricing: { chatIn: 8, chatOut: 24 },
    pricing: { chatIn: 4, chatOut: 12 },
    benchmark: { cbaElo: 1159 },
  },
  {
    idPrefix: 'mistral-large',
    label: 'Mistral Large (?)',
    description: 'Flagship model, with top-tier reasoning capabilities and language support (English, French, German, Italian, Spanish, and Code)',
    contextWindow: 32768,
    interfaces: [LLM_IF_OAI_Chat],
    hidden: true,
  },

  // Medium - not updated on 2024-02-26
  // Open Mixtral (8x22B)
  {
    idPrefix: 'open-mixtral-8x22b-2404',
    label: 'Open Mixtral 8x22B (2404)',
    description: 'Mixtral 8x22B model',
    contextWindow: 65536,
    interfaces: [LLM_IF_OAI_Chat, LLM_IF_OAI_Fn],
    pricing: { chatIn: 2, chatOut: 6 },
  },
  {
    idPrefix: 'open-mixtral-8x22b',
    label: 'Open Mixtral 8x22B',
    symLink: 'open-mixtral-8x22b-2404',
    hidden: true,
    // copied
    description: 'Mixtral 8x22B model',
    contextWindow: 65536,
    interfaces: [LLM_IF_OAI_Chat, LLM_IF_OAI_Fn],
    pricing: { chatIn: 2, chatOut: 6 },
  },
  // Medium (Deprecated)
  {
    idPrefix: 'mistral-medium-2312',
    label: 'Mistral Medium (2312)',
    description: 'Mistral internal prototype model.',
    description: 'Ideal for intermediate tasks that require moderate reasoning (Data extraction, Summarizing a Document, Writing emails, Writing a Job Description, or Writing Product Descriptions)',
    contextWindow: 32768,
    interfaces: [LLM_IF_OAI_Chat],
    pricing: { chatIn: 2.7, chatOut: 8.1 },
    benchmark: { cbaElo: 1148 },
    isLegacy: true,
    hidden: true,
  },
  {
    idPrefix: 'mistral-medium-latest',
    label: 'Mistral Medium (latest)',
    symLink: 'mistral-medium-2312',
    hidden: true,
    // copied
    description: 'Mistral internal prototype model.',
    description: 'Ideal for intermediate tasks that require moderate reasoning (Data extraction, Summarizing a Document, Writing emails, Writing a Job Description, or Writing Product Descriptions)',
    contextWindow: 32768,
    interfaces: [LLM_IF_OAI_Chat],
    pricing: { chatIn: 2.7, chatOut: 8.1 },
    benchmark: { cbaElo: 1148 },
    isLegacy: true,
    hidden: true,
  },
  {
    idPrefix: 'mistral-medium',
    label: 'Mistral Medium',
    description: 'Mistral internal prototype model.',
    symLink: 'mistral-medium-2312',
    // copied
    description: 'Ideal for intermediate tasks that require moderate reasoning (Data extraction, Summarizing a Document, Writing emails, Writing a Job Description, or Writing Product Descriptions)',
    contextWindow: 32768,
    interfaces: [LLM_IF_OAI_Chat],
    pricing: { chatIn: 2.7, chatOut: 8.1 },
    benchmark: { cbaElo: 1148 },
    isLegacy: true,
    hidden: true,
  },

  // Small (8x7B)
  // Open Mixtral (8x7B) -> currently points to `mistral-small-2312` (as per the docs)
  {
    idPrefix: 'open-mixtral-8x7b',
    label: 'Open Mixtral (8x7B)',
    description: 'A sparse mixture of experts model. As such, it leverages up to 45B parameters but only uses about 12B during inference, leading to better inference throughput at the cost of more vRAM.',
    contextWindow: 32768,
    interfaces: [LLM_IF_OAI_Chat],
    pricing: { chatIn: 0.7, chatOut: 0.7 },
  },
  // Small (deprecated)
  {
    idPrefix: 'mistral-small-2402',
    label: 'Mistral Small (2402)',
    description: 'Optimized endpoint. Cost-efficient reasoning for low-latency workloads. Mistral Small outperforms Mixtral 8x7B and has lower latency',
    description: 'Suitable for simple tasks that one can do in bulk (Classification, Customer Support, or Text Generation)',
    contextWindow: 32768,
    interfaces: [LLM_IF_OAI_Chat, LLM_IF_OAI_Fn],
    pricing: { chatIn: 2, chatOut: 6 },
  },
  {
    idPrefix: 'mistral-small-2312',
    label: 'Mistral Small (2312)',
    description: 'Aka open-mixtral-8x7b. Cost-efficient reasoning for low-latency workloads. Mistral Small outperforms Mixtral 8x7B and has lower latency',
    contextWindow: 32768,
    interfaces: [LLM_IF_OAI_Chat],
    pricing: { chatIn: 2, chatOut: 6 },
    pricing: { chatIn: 1, chatOut: 3 },
    hidden: true,
    isLegacy: true,
  },
  {
    idPrefix: 'mistral-small-latest',
    label: 'Mistral Small (latest)',
    symLink: 'mistral-small-2402',
    hidden: true,
    // copied
    description: 'Cost-efficient reasoning for low-latency workloads. Mistral Small outperforms Mixtral 8x7B and has lower latency',
    description: 'Suitable for simple tasks that one can do in bulk (Classification, Customer Support, or Text Generation)',
    contextWindow: 32768,
    interfaces: [LLM_IF_OAI_Chat, LLM_IF_OAI_Fn],
    pricing: { chatIn: 2, chatOut: 6 },
    pricing: { chatIn: 1, chatOut: 3 },
    hidden: true,
    isLegacy: true,
  },
  {
    idPrefix: 'mistral-small-2312',
    label: 'Mistral Small (2312)',
    description: 'Aka open-mixtral-8x7b. Suitable for simple tasks that one can do in bulk (Classification, Customer Support, or Text Generation)',
    contextWindow: 32768,
    interfaces: [LLM_IF_OAI_Chat],
    pricing: { chatIn: 1, chatOut: 3 },
    hidden: true,
    isLegacy: true,
  },
  {
    idPrefix: 'mistral-small',
    label: 'Mistral Small',
    description: 'Cost-efficient reasoning for low-latency workloads.',
    contextWindow: 32768,
    interfaces: [LLM_IF_OAI_Chat],
    pricing: { chatIn: 2, chatOut: 6 },
    hidden: true,
  },

  // Open Mixtral (8x7B)
  {
    idPrefix: 'open-mixtral-8x7b',
    label: 'Open Mixtral (8x7B)',
    description: 'Mixtral 8x7B model, aka mistral-small-2312',
    // symLink: 'mistral-small-2312',
    symLink: 'mistral-small-2312',
    // copied
    description: 'Aka open-mixtral-8x7b. Suitable for simple tasks that one can do in bulk (Classification, Customer Support, or Text Generation)',
    contextWindow: 32768,
    interfaces: [LLM_IF_OAI_Chat],
    pricing: { chatIn: 0.7, chatOut: 0.7 },
    pricing: { chatIn: 1, chatOut: 3 },
    hidden: true,
    isLegacy: true,
  },

  // Tiny (7B)

  // Open Mistral (7B) -> currently points to mistral-tiny-2312 (as per the docs)
  {
    idPrefix: 'open-mistral-7b',
    label: 'Open Mistral (7B)',
    description: 'The first dense model released by Mistral AI, perfect for experimentation, customization, and quick iteration. At the time of the release, it matched the capabilities of models up to 30B parameters.',
    contextWindow: 32768,
    interfaces: [LLM_IF_OAI_Chat],
    pricing: { chatIn: 0.25, chatOut: 0.25 },
  },
  // Tiny (deprecated)
  {
    idPrefix: 'mistral-tiny-2312',
    label: 'Mistral Tiny (2312)',
@@ -501,43 +631,34 @@ const _knownMistralChatModels: ManualMappings = [
    contextWindow: 32768,
    interfaces: [LLM_IF_OAI_Chat],
    hidden: true,
    isLegacy: true,
  },
  {
    idPrefix: 'mistral-tiny',
    label: 'Mistral Tiny',
    description: 'Used for large batch processing tasks where cost is a significant factor but reasoning capabilities are not crucial',
    symLink: 'mistral-tiny-2312',
    // copied
    description: 'Aka open-mistral-7b. Used for large batch processing tasks where cost is a significant factor but reasoning capabilities are not crucial',
    contextWindow: 32768,
    interfaces: [LLM_IF_OAI_Chat],
    hidden: true,
    isLegacy: true,
  },
  // Open Mistral (7B)
  {
    idPrefix: 'open-mistral-7b',
    label: 'Open Mistral (7B)',
    description: 'Mistral 7B model, aka mistral-tiny-2312',
    // symLink: 'mistral-tiny-2312',
    // copied
    contextWindow: 32768,
    interfaces: [LLM_IF_OAI_Chat],
    pricing: { chatIn: 0.25, chatOut: 0.25 },
  },


  {
    idPrefix: 'mistral-embed',
    label: 'Mistral Embed',
    description: 'State-of-the-art semantic for extracting representation of text extracts.',
    // output: 1024 dimensions
    description: 'A model that converts text into numerical vectors of embeddings in 1024 dimensions. Embedding models enable retrieval and retrieval-augmented generation applications.',
    maxCompletionTokens: 1024, // HACK - it's 1024 dimensions, but those are not 'completion tokens'
    contextWindow: 32768, // actually unknown, assumed from the other models
    contextWindow: 8192, // Updated context window
    interfaces: [],
    pricing: { chatIn: 0.1, chatOut: 0.1 },
    hidden: true,
  },
];


const mistralModelFamilyOrder = [
  'mistral-large', 'mistral-medium', 'mistral-small', 'open-mixtral-8x7b', 'mistral-tiny', 'open-mistral-7b', 'mistral-embed', '🔗',
  'codestral', 'mistral-large', 'open-mixtral-8x22b', 'mistral-medium', 'open-mixtral-8x7b', 'mistral-small', 'open-mistral-7b', 'mistral-tiny', 'mistral-embed', '🔗',
];

export function mistralModelToModelDescription(_model: unknown): ModelDescriptionSchema {
@@ -553,13 +674,13 @@ export function mistralModelToModelDescriptio
}

export function mistralModelsSort(a: ModelDescriptionSchema, b: ModelDescriptionSchema): number {
  if (a.label.startsWith('🔗') && !b.label.startsWith('🔗')) return 1;
  if (!a.label.startsWith('🔗') && b.label.startsWith('🔗')) return -1;
  const aPrefixIndex = mistralModelFamilyOrder.findIndex(prefix => a.id.startsWith(prefix));
  const bPrefixIndex = mistralModelFamilyOrder.findIndex(prefix => b.id.startsWith(prefix));
  if (aPrefixIndex !== -1 && bPrefixIndex !== -1) {
    if (aPrefixIndex !== bPrefixIndex)
      return aPrefixIndex - bPrefixIndex;
    if (a.label.startsWith('🔗') && !b.label.startsWith('🔗')) return 1;
    if (!a.label.startsWith('🔗') && b.label.startsWith('🔗')) return -1;
    return b.label.localeCompare(a.label);
  }
  return aPrefixIndex !== -1 ? 1 : -1;
@@ -813,41 +934,84 @@ export function perplexityAIModelSort(a: ModelDescriptionSchema, b: ModelDescrip
const _knownGroqModels: ManualMappings = [
  {
    isLatest: true,
    idPrefix: 'llama-3.1-405b-reasoning',
    label: 'Llama 3.1 · 405B',
    description: 'LLaMA 3.1 405B developed by Meta with a context window of 131,072 tokens. Supports tool use.',
    contextWindow: 131072,
    maxCompletionTokens: 8000,
    interfaces: [LLM_IF_OAI_Chat, LLM_IF_OAI_Fn],
  },
  {
    isLatest: true,
    idPrefix: 'llama-3.1-70b-versatile',
    label: 'Llama 3.1 · 70B',
    description: 'LLaMA 3.1 70B developed by Meta with a context window of 131,072 tokens. Supports tool use.',
    contextWindow: 131072,
    maxCompletionTokens: 8000,
    interfaces: [LLM_IF_OAI_Chat, LLM_IF_OAI_Fn],
  },
  {
    isLatest: true,
    idPrefix: 'llama-3.1-8b-instant',
    label: 'Llama 3.1 · 8B',
    description: 'LLaMA 3.1 8B developed by Meta with a context window of 131,072 tokens. Supports tool use.',
    contextWindow: 131072,
    maxCompletionTokens: 8000,
    interfaces: [LLM_IF_OAI_Chat, LLM_IF_OAI_Fn],
  },
  {
    idPrefix: 'llama3-groq-70b-8192-tool-use-preview',
    label: 'Llama 3 Groq · 70B Tool Use',
    description: 'LLaMA 3 70B Tool Use developed by Groq with a context window of 8,192 tokens. Optimized for tool use.',
    contextWindow: 8192,
    interfaces: [LLM_IF_OAI_Chat, LLM_IF_OAI_Fn],
  },
  {
    idPrefix: 'llama3-groq-8b-8192-tool-use-preview',
    label: 'Llama 3 Groq · 8B Tool Use',
    description: 'LLaMA 3 8B Tool Use developed by Groq with a context window of 8,192 tokens. Optimized for tool use.',
    contextWindow: 8192,
    interfaces: [LLM_IF_OAI_Chat, LLM_IF_OAI_Fn],
  },
  {
    idPrefix: 'llama3-70b-8192',
    label: 'Llama 3 · 70B',
    description: 'LLaMA3 70b developed by Meta with a context window of 8,192 tokens.',
    description: 'LLaMA3 70B developed by Meta with a context window of 8,192 tokens. Supports tool use.',
    contextWindow: 8192,
    interfaces: [LLM_IF_OAI_Chat],
    interfaces: [LLM_IF_OAI_Chat, LLM_IF_OAI_Fn],
    // isLegacy: true,
    hidden: true,
  },
  {
    // isLatest: true,
    idPrefix: 'llama3-8b-8192',
    label: 'Llama 3 · 8B',
    description: 'LLaMA3 8b developed by Meta with a context window of 8,192 tokens.',
    description: 'LLaMA3 8B developed by Meta with a context window of 8,192 tokens. Supports tool use.',
    contextWindow: 8192,
    interfaces: [LLM_IF_OAI_Chat],
  },
  {
    idPrefix: 'llama2-70b-4096',
    label: 'Llama 2 · 70B',
    description: 'LLaMA2 70b developed by Meta with a context window of 4,096 tokens.',
    contextWindow: 4096,
    interfaces: [LLM_IF_OAI_Chat],
    interfaces: [LLM_IF_OAI_Chat, LLM_IF_OAI_Fn],
    // isLegacy: true,
    hidden: true,
  },
  {
    idPrefix: 'mixtral-8x7b-32768',
    label: 'Mixtral 8x7B',
    description: 'Mixtral 8x7b developed by Mistral with a context window of 32,768 tokens.',
    description: 'Mixtral 8x7B developed by Mistral with a context window of 32,768 tokens. Supports tool use.',
    contextWindow: 32768,
    interfaces: [LLM_IF_OAI_Chat],
    interfaces: [LLM_IF_OAI_Chat, LLM_IF_OAI_Fn],
  },
  {
    idPrefix: 'gemma2-9b-it',
    label: 'Gemma 2 · 9B Instruct',
    description: 'Gemma 2 9B developed by Google with a context window of 8,192 tokens. Supports tool use.',
    contextWindow: 8192,
    interfaces: [LLM_IF_OAI_Chat, LLM_IF_OAI_Fn],
  },
  {
    idPrefix: 'gemma-7b-it',
    label: 'Gemma 1.1 · 7B Instruct',
    description: 'Gemma 7b developed by Google with a context window of 8,192 tokens.',
    description: 'Gemma 7B developed by Google with a context window of 8,192 tokens. Supports tool use.',
    contextWindow: 8192,
    interfaces: [LLM_IF_OAI_Chat],
    hidden: true,
  },
];

@@ -864,6 +1028,11 @@ export function groqModelToModelDescription(_model: unknown): ModelDescriptionSc
}

export function groqModelSortFn(a: ModelDescriptionSchema, b: ModelDescriptionSchema): number {
  // sort hidden at the end
  if (a.hidden && !b.hidden)
    return 1;
  if (!a.hidden && b.hidden)
    return -1;
  // sort as per their order in the known models
  const aIndex = _knownGroqModels.findIndex(base => a.id.startsWith(base.idPrefix));
  const bIndex = _knownGroqModels.findIndex(base => b.id.startsWith(base.idPrefix));
@@ -875,7 +1044,13 @@ export function groqModelSortFn(a: ModelDescriptionSchema, b: ModelDescriptionSc

// Helpers

type ManualMapping = ({ idPrefix: string, isLatest?: boolean, isPreview?: boolean, isLegacy?: boolean, symLink?: string } & Omit<ModelDescriptionSchema, 'id' | 'created' | 'updated'>);
type ManualMapping = ({
  idPrefix: string,
  isLatest?: boolean,
  isPreview?: boolean,
  isLegacy?: boolean,
  symLink?: string
} & Omit<ModelDescriptionSchema, 'id' | 'created' | 'updated'>);
type ManualMappings = ManualMapping[];

function fromManualMapping(mappings: ManualMappings, id: string, created?: number, updated?: number, fallback?: ManualMapping): ModelDescriptionSchema {

@@ -12,7 +12,7 @@ import { fixupHost } from '~/common/util/urlUtils';

import { OpenAIWire, WireOpenAICreateImageOutput, wireOpenAICreateImageOutputSchema, WireOpenAICreateImageRequest } from './openai.wiretypes';
import { azureModelToModelDescription, groqModelSortFn, groqModelToModelDescription, lmStudioModelToModelDescription, localAIModelToModelDescription, mistralModelsSort, mistralModelToModelDescription, oobaboogaModelToModelDescription, openAIModelFilter, openAIModelToModelDescription, openRouterModelFamilySortFn, openRouterModelToModelDescription, perplexityAIModelDescriptions, perplexityAIModelSort, togetherAIModelsToModelDescriptions } from './models.data';
import { llmsChatGenerateWithFunctionsOutputSchema, llmsListModelsOutputSchema, ModelDescriptionSchema } from '../llm.server.types';
import { llmsChatGenerateWithFunctionsOutputSchema, llmsGenerateContextSchema, llmsListModelsOutputSchema, ModelDescriptionSchema } from '../llm.server.types';
import { wilreLocalAIModelsApplyOutputSchema, wireLocalAIModelsAvailableOutputSchema, wireLocalAIModelsListOutputSchema } from './localai.wiretypes';


@@ -72,8 +72,11 @@ const listModelsInputSchema = z.object({

const chatGenerateWithFunctionsInputSchema = z.object({
  access: openAIAccessSchema,
  model: openAIModelSchema, history: openAIHistorySchema,
  functions: openAIFunctionsSchema.optional(), forceFunctionName: z.string().optional(),
  model: openAIModelSchema,
  history: openAIHistorySchema,
  functions: openAIFunctionsSchema.optional(),
  forceFunctionName: z.string().optional(),
  context: llmsGenerateContextSchema.optional(),
});

const createImagesInputSchema = z.object({
@@ -108,7 +111,7 @@ export const llmOpenAIRouter = createTRPCRouter({

      // [Azure]: use an older 'deployments' API to enumerate the models, and a modified OpenAI id to description mapping
      if (access.dialect === 'azure') {
        const azureModels = await openaiGET(access, `/openai/deployments?api-version=2023-03-15-preview`);
        const azureModels = await openaiGETOrThrow(access, `/openai/deployments?api-version=2023-03-15-preview`);

        const wireAzureListDeploymentsSchema = z.object({
          data: z.array(z.object({
@@ -146,7 +149,7 @@ export const llmOpenAIRouter = createTRPCRouter({


      // [non-Azure]: fetch openAI-style for all but Azure (will be then used in each dialect)
      const openAIWireModelsResponse = await openaiGET<OpenAIWire.Models.Response>(access, '/v1/models');
      const openAIWireModelsResponse = await openaiGETOrThrow<OpenAIWire.Models.Response>(access, '/v1/models');

      // [Together] missing the .data property
      if (access.dialect === 'togetherai')
@@ -267,17 +270,22 @@ export const llmOpenAIRouter = createTRPCRouter({
    .output(llmsChatGenerateWithFunctionsOutputSchema)
    .mutation(async ({ input }) => {

      const { access, model, history, functions, forceFunctionName } = input;
      const { access, model, history, functions, forceFunctionName, context } = input;
      const isFunctionsCall = !!functions && functions.length > 0;

      const completionsBody = openAIChatCompletionPayload(access.dialect, model, history, isFunctionsCall ? functions : null, forceFunctionName ?? null, 1, false);
      const wireCompletions = await openaiPOST<OpenAIWire.ChatCompletion.Response, OpenAIWire.ChatCompletion.Request>(
      const wireCompletions = await openaiPOSTOrThrow<OpenAIWire.ChatCompletion.Response, OpenAIWire.ChatCompletion.Request>(
        access, model.id, completionsBody, '/v1/chat/completions',
      );

      // expect a single output
      if (wireCompletions?.choices?.length !== 1)
        throw new TRPCError({ code: 'INTERNAL_SERVER_ERROR', message: `[OpenAI Issue] Expected 1 completion, got ${wireCompletions?.choices?.length}` });
      if (wireCompletions?.choices?.length !== 1) {
        console.error(`[POST] llmOpenAI.chatGenerateWithFunctions: ${access.dialect}: ${context?.name || 'no context'}: unexpected output${forceFunctionName ? ` (fn: ${forceFunctionName})` : ''}:`, model.id, wireCompletions?.choices);
        throw new TRPCError({
          code: 'UNPROCESSABLE_CONTENT',
          message: `[OpenAI Issue] Expected 1 completion, got ${wireCompletions?.choices?.length}`,
        });
      }
      let { message, finish_reason } = wireCompletions.choices[0];

      // LocalAI hack/workaround, until https://github.com/go-skynet/LocalAI/issues/788 is fixed
@@ -318,7 +326,7 @@ export const llmOpenAIRouter = createTRPCRouter({
        delete requestBody.response_format;

      // create 1 image (dall-e-3 won't support more than 1, so better transfer the burden to the client)
      const wireOpenAICreateImageOutput = await openaiPOST<WireOpenAICreateImageOutput, WireOpenAICreateImageRequest>(
      const wireOpenAICreateImageOutput = await openaiPOSTOrThrow<WireOpenAICreateImageOutput, WireOpenAICreateImageRequest>(
        access, null, requestBody, '/v1/images/generations',
      );

@@ -340,7 +348,7 @@ export const llmOpenAIRouter = createTRPCRouter({
    .mutation(async ({ input: { access, text } }): Promise<OpenAIWire.Moderation.Response> => {
      try {

        return await openaiPOST<OpenAIWire.Moderation.Response, OpenAIWire.Moderation.Request>(access, null, {
        return await openaiPOSTOrThrow<OpenAIWire.Moderation.Response, OpenAIWire.Moderation.Request>(access, null, {
          input: text,
          model: 'text-moderation-latest',
        }, '/v1/moderations');
@@ -361,7 +369,7 @@ export const llmOpenAIRouter = createTRPCRouter({
  dialectLocalAI_galleryModelsAvailable: publicProcedure
    .input(listModelsInputSchema)
    .query(async ({ input: { access } }) => {
      const wireLocalAIModelsAvailable = await openaiGET(access, '/models/available');
      const wireLocalAIModelsAvailable = await openaiGETOrThrow(access, '/models/available');
      return wireLocalAIModelsAvailableOutputSchema.parse(wireLocalAIModelsAvailable);
    }),

@@ -374,7 +382,7 @@ export const llmOpenAIRouter = createTRPCRouter({
    }))
    .mutation(async ({ input: { access, galleryName, modelName } }) => {
      const galleryModelId = `${galleryName}@${modelName}`;
      const wireLocalAIModelApply = await openaiPOST(access, null, { id: galleryModelId }, '/models/apply');
      const wireLocalAIModelApply = await openaiPOSTOrThrow(access, null, { id: galleryModelId }, '/models/apply');
      return wilreLocalAIModelsApplyOutputSchema.parse(wireLocalAIModelApply);
    }),

@@ -385,7 +393,7 @@ export const llmOpenAIRouter = createTRPCRouter({
      jobId: z.string(),
    }))
    .query(async ({ input: { access, jobId } }) => {
      const wireLocalAIModelsJobs = await openaiGET(access, `/models/jobs/${jobId}`);
      const wireLocalAIModelsJobs = await openaiGETOrThrow(access, `/models/jobs/${jobId}`);
      return wireLocalAIModelsListOutputSchema.parse(wireLocalAIModelsJobs);
    }),

@@ -623,12 +631,12 @@ export function openAIChatCompletionPayload(dialect: OpenAIDialects, model: Open
  };
}

async function openaiGET<TOut extends object>(access: OpenAIAccessSchema, apiPath: string /*, signal?: AbortSignal*/): Promise<TOut> {
async function openaiGETOrThrow<TOut extends object>(access: OpenAIAccessSchema, apiPath: string /*, signal?: AbortSignal*/): Promise<TOut> {
  const { headers, url } = openAIAccess(access, null, apiPath);
  return await fetchJsonOrTRPCError<TOut>(url, 'GET', headers, undefined, `OpenAI/${access.dialect}`);
}

async function openaiPOST<TOut extends object, TPostBody extends object>(access: OpenAIAccessSchema, modelRefId: string | null, body: TPostBody, apiPath: string /*, signal?: AbortSignal*/): Promise<TOut> {
async function openaiPOSTOrThrow<TOut extends object, TPostBody extends object>(access: OpenAIAccessSchema, modelRefId: string | null, body: TPostBody, apiPath: string /*, signal?: AbortSignal*/): Promise<TOut> {
  const { headers, url } = openAIAccess(access, modelRefId, apiPath);
  return await fetchJsonOrTRPCError<TOut, TPostBody>(url, 'POST', headers, body, `OpenAI/${access.dialect}`);
}

+3
-1
@@ -8,7 +8,7 @@ import type { DLLM, DLLMId, DModelSourceId } from '../store-llms';
|
||||
import type { ModelDescriptionSchema } from '../server/llm.server.types';
|
||||
import type { ModelVendorId } from './vendors.registry';
|
||||
import type { StreamingClientUpdate } from './unifiedStreamingClient';
|
||||
import type { VChatFunctionIn, VChatMessageIn, VChatMessageOrFunctionCallOut, VChatMessageOut } from '../llm.client';
|
||||
import type { VChatContextRef, VChatFunctionIn, VChatGenerateContextName, VChatMessageIn, VChatMessageOrFunctionCallOut, VChatMessageOut, VChatStreamContextName } from '../llm.client';
|
||||
|
||||
|
||||
export interface IModelVendor<TSourceSetup = unknown, TAccess = unknown, TLLMOptions = unknown, TDLLM = DLLM<TSourceSetup, TLLMOptions>> {
|
||||
@@ -44,6 +44,7 @@ export interface IModelVendor<TSourceSetup = unknown, TAccess = unknown, TLLMOpt
|
||||
access: TAccess,
|
||||
llmOptions: TLLMOptions,
|
||||
messages: VChatMessageIn[],
|
||||
contextName: VChatGenerateContextName, contextRef: VChatContextRef | null,
|
||||
functions: VChatFunctionIn[] | null, forceFunctionName: string | null,
|
||||
maxTokens?: number,
|
||||
) => Promise<VChatMessageOut | VChatMessageOrFunctionCallOut>;
|
||||
@@ -53,6 +54,7 @@ export interface IModelVendor<TSourceSetup = unknown, TAccess = unknown, TLLMOpt
|
||||
llmId: DLLMId,
|
||||
llmOptions: TLLMOptions,
|
||||
messages: VChatMessageIn[],
|
||||
contextName: VChatStreamContextName, contextRef: VChatContextRef,
|
||||
functions: VChatFunctionIn[] | null, forceFunctionName: string | null,
|
||||
abortSignal: AbortSignal,
|
||||
onUpdate: (update: StreamingClientUpdate, done: boolean) => void,
|
||||
|
||||
@@ -2,6 +2,7 @@ import * as React from 'react';
|
||||
|
||||
import { Alert } from '@mui/joy';
|
||||
|
||||
import { AlreadySet } from '~/common/components/AlreadySet';
|
||||
import { ExternalLink } from '~/common/components/ExternalLink';
|
||||
import { FormInputKey } from '~/common/components/forms/FormInputKey';
|
||||
import { FormTextField } from '~/common/components/forms/FormTextField';
|
||||
@@ -49,7 +50,7 @@ export function AnthropicSourceSetup(props: { sourceId: DModelSourceId }) {
|
||||
autoCompleteId='anthropic-key' label={!!anthropicHost ? 'API Key' : 'Anthropic API Key'}
|
||||
rightLabel={<>{needsUserKey
|
||||
? !anthropicKey && <Link level='body-sm' href='https://www.anthropic.com/earlyaccess' target='_blank'>request Key</Link>
|
||||
: '✔️ already set in server'
|
||||
: <AlreadySet />
|
||||
} {anthropicKey && keyValid && <Link level='body-sm' href='https://console.anthropic.com/settings/usage' target='_blank'>show tokens usage</Link>}
|
||||
</>}
|
||||
value={anthropicKey} onChange={value => updateSetup({ anthropicKey: value })}
|
||||
|
||||
@@ -3,7 +3,7 @@ import { apiAsync } from '~/common/util/trpc.client';

import type { AnthropicAccessSchema } from '../../server/anthropic/anthropic.router';
import type { IModelVendor } from '../IModelVendor';
import type { VChatMessageOut } from '../../llm.client';
import type { VChatContextRef, VChatGenerateContextName, VChatMessageOut } from '../../llm.client';
import { unifiedStreamingClient } from '../unifiedStreamingClient';

import { FALLBACK_LLM_RESPONSE_TOKENS, FALLBACK_LLM_TEMPERATURE, LLMOptionsOpenAI } from '../openai/openai.vendor';
@@ -47,7 +47,7 @@ export const ModelVendorAnthropic: IModelVendor<SourceSetupAnthropic, AnthropicA
rpcUpdateModelsOrThrow: async (access) => await apiAsync.llmAnthropic.listModels.query({ access }),

// Chat Generate (non-streaming) with Functions
rpcChatGenerateOrThrow: async (access, llmOptions, messages, functions, forceFunctionName, maxTokens) => {
rpcChatGenerateOrThrow: async (access, llmOptions, messages, contextName: VChatGenerateContextName, contextRef: VChatContextRef | null, functions, forceFunctionName, maxTokens) => {
if (functions?.length || forceFunctionName)
throw new Error('Anthropic does not support functions');

@@ -61,6 +61,11 @@ export const ModelVendorAnthropic: IModelVendor<SourceSetupAnthropic, AnthropicA
maxTokens: maxTokens || llmResponseTokens || FALLBACK_LLM_RESPONSE_TOKENS,
},
history: messages,
context: contextRef ? {
method: 'chat-generate',
name: contextName,
ref: contextRef,
} : undefined,
}) as VChatMessageOut;
} catch (error: any) {
const errorMessage = error?.message || error?.toString() || 'Anthropic Chat Generate Error';
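For orientation, a hypothetical call site with the widened signature; the argument values are placeholders, only the parameter order is taken from the diff:

```typescript
// contextName/contextRef slot in between messages and functions.
const reply = await ModelVendorAnthropic.rpcChatGenerateOrThrow(
  access,          // AnthropicAccessSchema
  llmOptions,      // vendor-specific LLM options
  messages,        // VChatMessageIn[]
  'chat-ai-title', // contextName (assumed member of VChatGenerateContextName)
  conversationId,  // contextRef, or null when no context applies
  null, null,      // functions, forceFunctionName: Anthropic rejects these above
  1024,            // maxTokens (optional)
);
```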
@@ -1,5 +1,6 @@
import * as React from 'react';

import { AlreadySet } from '~/common/components/AlreadySet';
import { FormInputKey } from '~/common/components/forms/FormInputKey';
import { FormTextField } from '~/common/components/forms/FormTextField';
import { InlineError } from '~/common/components/InlineError';
@@ -49,7 +50,7 @@ export function AzureSourceSetup(props: { sourceId: DModelSourceId }) {
autoCompleteId='azure-key' label='Azure Key'
rightLabel={<>{needsUserKey
? !azureKey && <Link level='body-sm' href='https://azure.microsoft.com/en-us/products/ai-services/openai-service' target='_blank'>request Key</Link>
: '✔️ already set in server'}
: <AlreadySet />}
</>}
value={azureKey} onChange={value => updateSetup({ azureKey: value })}
required={needsUserKey} isError={keyError}
@@ -3,6 +3,7 @@ import * as React from 'react';
import { FormControl, FormHelperText, Option, Select } from '@mui/joy';
import HealthAndSafetyIcon from '@mui/icons-material/HealthAndSafety';

import { AlreadySet } from '~/common/components/AlreadySet';
import { FormInputKey } from '~/common/components/forms/FormInputKey';
import { FormLabelStart } from '~/common/components/forms/FormLabelStart';
import { InlineError } from '~/common/components/InlineError';
@@ -50,7 +51,7 @@ export function GeminiSourceSetup(props: { sourceId: DModelSourceId }) {
autoCompleteId='gemini-key' label='Gemini API Key'
rightLabel={<>{needsUserKey
? !geminiKey && <Link level='body-sm' href={GEMINI_API_KEY_LINK} target='_blank'>request Key</Link>
: '✔️ already set in server'}
: <AlreadySet />}
</>}
value={geminiKey} onChange={value => updateSetup({ geminiKey: value.trim() })}
required={needsUserKey} isError={showKeyError}
@@ -1,10 +1,10 @@
import { GeminiIcon } from '~/common/components/icons/vendors/GeminiIcon';
import { apiAsync } from '~/common/util/trpc.client';

import type { GeminiAccessSchema } from '../../server/gemini/gemini.router';
import type { GeminiBlockSafetyLevel } from '../../server/gemini/gemini.wiretypes';
import type { IModelVendor } from '../IModelVendor';
import type { VChatMessageOut } from '../../llm.client';
import type { VChatContextRef, VChatGenerateContextName, VChatMessageOut } from '../../llm.client';
import { unifiedStreamingClient } from '../unifiedStreamingClient';

import { FALLBACK_LLM_RESPONSE_TOKENS, FALLBACK_LLM_TEMPERATURE } from '../openai/openai.vendor';
@@ -60,7 +60,7 @@ export const ModelVendorGemini: IModelVendor<SourceSetupGemini, GeminiAccessSche
rpcUpdateModelsOrThrow: async (access) => await apiAsync.llmGemini.listModels.query({ access }),

// Chat Generate (non-streaming) with Functions
rpcChatGenerateOrThrow: async (access, llmOptions, messages, functions, forceFunctionName, maxTokens) => {
rpcChatGenerateOrThrow: async (access, llmOptions, messages, contextName: VChatGenerateContextName, contextRef: VChatContextRef | null, functions, forceFunctionName, maxTokens) => {
if (functions?.length || forceFunctionName)
throw new Error('Gemini does not support functions');

@@ -74,6 +74,11 @@ export const ModelVendorGemini: IModelVendor<SourceSetupGemini, GeminiAccessSche
maxTokens: maxTokens || maxOutputTokens || FALLBACK_LLM_RESPONSE_TOKENS,
},
history: messages,
context: contextRef ? {
method: 'chat-generate',
name: contextName,
ref: contextRef,
} : undefined,
}) as VChatMessageOut;
} catch (error: any) {
const errorMessage = error?.message || error?.toString() || 'Gemini Chat Generate Error';
@@ -2,6 +2,7 @@ import * as React from 'react';

import { Typography } from '@mui/joy';

import { AlreadySet } from '~/common/components/AlreadySet';
import { FormInputKey } from '~/common/components/forms/FormInputKey';
import { InlineError } from '~/common/components/InlineError';
import { Link } from '~/common/components/Link';
@@ -42,7 +43,7 @@ export function GroqSourceSetup(props: { sourceId: DModelSourceId }) {
autoCompleteId='groq-key' label='Groq API Key'
rightLabel={<>{needsUserKey
? !groqKey && <Link level='body-sm' href={GROQ_REG_LINK} target='_blank'>API keys</Link>
: '✔️ already set in server'}
: <AlreadySet />}
</>}
value={groqKey} onChange={value => updateSetup({ groqKey: value })}
required={needsUserKey} isError={showKeyError}
@@ -6,6 +6,7 @@ import CheckBoxOutlinedIcon from '@mui/icons-material/CheckBoxOutlined';

import { getBackendCapabilities } from '~/modules/backend/store-backend-capabilities';

import { AlreadySet } from '~/common/components/AlreadySet';
import { ExpanderAccordion } from '~/common/components/ExpanderAccordion';
import { FormInputKey } from '~/common/components/forms/FormInputKey';
import { InlineError } from '~/common/components/InlineError';
@@ -81,7 +82,7 @@ export function LocalAISourceSetup(props: { sourceId: DModelSourceId }) {
noKey
required={userHostRequired}
isError={userHostError}
rightLabel={backendHasHost ? '✔️ already set in server' : <Link level='body-sm' href='https://localai.io' target='_blank'>Learn more</Link>}
rightLabel={backendHasHost ? <AlreadySet /> : <Link level='body-sm' href='https://localai.io' target='_blank'>Learn more</Link>}
value={localAIHost} onChange={value => updateSetup({ localAIHost: value })}
/>
@@ -2,6 +2,7 @@ import * as React from 'react';

import { Typography } from '@mui/joy';

import { AlreadySet } from '~/common/components/AlreadySet';
import { FormInputKey } from '~/common/components/forms/FormInputKey';
import { InlineError } from '~/common/components/InlineError';
import { Link } from '~/common/components/Link';
@@ -39,7 +40,7 @@ export function MistralSourceSetup(props: { sourceId: DModelSourceId }) {
autoCompleteId='mistral-key' label='Mistral Key'
rightLabel={<>{needsUserKey
? !mistralKey && <Link level='body-sm' href={MISTRAL_REG_LINK} target='_blank'>request Key</Link>
: '✔️ already set in server'}
: <AlreadySet />}
</>}
value={mistralKey} onChange={value => updateSetup({ oaiKey: value })}
required={needsUserKey} isError={showKeyError}
@@ -3,7 +3,7 @@ import { apiAsync } from '~/common/util/trpc.client';

import type { IModelVendor } from '../IModelVendor';
import type { OllamaAccessSchema } from '../../server/ollama/ollama.router';
import type { VChatMessageOut } from '../../llm.client';
import type { VChatContextRef, VChatGenerateContextName, VChatMessageOut } from '../../llm.client';
import { unifiedStreamingClient } from '../unifiedStreamingClient';

import { FALLBACK_LLM_RESPONSE_TOKENS, FALLBACK_LLM_TEMPERATURE, LLMOptionsOpenAI } from '../openai/openai.vendor';
@@ -42,7 +42,7 @@ export const ModelVendorOllama: IModelVendor<SourceSetupOllama, OllamaAccessSche
rpcUpdateModelsOrThrow: async (access) => await apiAsync.llmOllama.listModels.query({ access }),

// Chat Generate (non-streaming) with Functions
rpcChatGenerateOrThrow: async (access, llmOptions, messages, functions, forceFunctionName, maxTokens) => {
rpcChatGenerateOrThrow: async (access, llmOptions, messages, contextName: VChatGenerateContextName, contextRef: VChatContextRef | null, functions, forceFunctionName, maxTokens) => {
if (functions?.length || forceFunctionName)
throw new Error('Ollama does not support functions');

@@ -56,6 +56,11 @@ export const ModelVendorOllama: IModelVendor<SourceSetupOllama, OllamaAccessSche
maxTokens: maxTokens || llmResponseTokens || FALLBACK_LLM_RESPONSE_TOKENS,
},
history: messages,
context: contextRef ? {
method: 'chat-generate',
name: contextName,
ref: contextRef,
} : undefined,
}) as VChatMessageOut;
} catch (error: any) {
const errorMessage = error?.message || error?.toString() || 'Ollama Chat Generate Error';
@@ -2,6 +2,7 @@ import * as React from 'react';

import { Alert } from '@mui/joy';

import { AlreadySet } from '~/common/components/AlreadySet';
import { Brand } from '~/common/app.config';
import { FormInputKey } from '~/common/components/forms/FormInputKey';
import { FormSwitchControl } from '~/common/components/forms/FormSwitchControl';
@@ -48,7 +49,7 @@ export function OpenAISourceSetup(props: { sourceId: DModelSourceId }) {
autoCompleteId='openai-key' label='API Key'
rightLabel={<>{needsUserKey
? !oaiKey && <><Link level='body-sm' href='https://platform.openai.com/account/api-keys' target='_blank'>create key</Link> and <Link level='body-sm' href='https://openai.com/waitlist/gpt-4-api' target='_blank'>request access to GPT-4</Link></>
: '✔️ already set in server'
: <AlreadySet />
} {oaiKey && keyValid && <Link level='body-sm' href='https://platform.openai.com/account/usage' target='_blank'>check usage</Link>}
</>}
value={oaiKey} onChange={value => updateSetup({ oaiKey: value })}
@@ -3,7 +3,7 @@ import { apiAsync } from '~/common/util/trpc.client';

import type { IModelVendor } from '../IModelVendor';
import type { OpenAIAccessSchema } from '../../server/openai/openai.router';
import type { VChatMessageOrFunctionCallOut } from '../../llm.client';
import type { VChatContextRef, VChatGenerateContextName, VChatMessageOrFunctionCallOut } from '../../llm.client';
import { unifiedStreamingClient } from '../unifiedStreamingClient';

import { OpenAILLMOptions } from './OpenAILLMOptions';
@@ -60,7 +60,7 @@ export const ModelVendorOpenAI: IModelVendor<SourceSetupOpenAI, OpenAIAccessSche
rpcUpdateModelsOrThrow: async (access) => await apiAsync.llmOpenAI.listModels.query({ access }),

// Chat Generate (non-streaming) with Functions
rpcChatGenerateOrThrow: async (access, llmOptions, messages, functions, forceFunctionName, maxTokens) => {
rpcChatGenerateOrThrow: async (access, llmOptions, messages, contextName: VChatGenerateContextName, contextRef: VChatContextRef | null, functions, forceFunctionName, maxTokens) => {
const { llmRef, llmTemperature, llmResponseTokens } = llmOptions;
try {
return await apiAsync.llmOpenAI.chatGenerateWithFunctions.mutate({
@@ -73,6 +73,11 @@ export const ModelVendorOpenAI: IModelVendor<SourceSetupOpenAI, OpenAIAccessSche
functions: functions ?? undefined,
forceFunctionName: forceFunctionName ?? undefined,
history: messages,
context: contextRef ? {
method: 'chat-generate',
name: contextName,
ref: contextRef,
} : undefined,
}) as VChatMessageOrFunctionCallOut;
} catch (error: any) {
const errorMessage = error?.message || error?.toString() || 'OpenAI Chat Generate Error';
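Putting the OpenAI hunks together, the payload sent through `chatGenerateWithFunctions.mutate` now looks roughly like this; field names are from the diff, the `model` sub-object is abbreviated:

```typescript
const payload = {
  access,                            // OpenAIAccessSchema
  model: { /* llmRef, llmTemperature, llmResponseTokens */ },
  functions: functions ?? undefined,
  forceFunctionName: forceFunctionName ?? undefined,
  history: messages,
  // context is omitted entirely when there is no ref to attach
  context: contextRef
    ? { method: 'chat-generate' as const, name: contextName, ref: contextRef }
    : undefined,
};
```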
@@ -2,6 +2,7 @@ import * as React from 'react';

import { Button, Typography } from '@mui/joy';

import { AlreadySet } from '~/common/components/AlreadySet';
import { FormInputKey } from '~/common/components/forms/FormInputKey';
import { InlineError } from '~/common/components/InlineError';
import { Link } from '~/common/components/Link';
@@ -56,7 +57,7 @@ export function OpenRouterSourceSetup(props: { sourceId: DModelSourceId }) {
autoCompleteId='openrouter-key' label='OpenRouter API Key'
rightLabel={<>{needsUserKey
? !oaiKey && <Link level='body-sm' href='https://openrouter.ai/keys' target='_blank'>your keys</Link>
: '✔️ already set in server'
: <AlreadySet />
} {oaiKey && keyValid && <Link level='body-sm' href='https://openrouter.ai/activity' target='_blank'>check usage</Link>}
</>}
value={oaiKey} onChange={value => updateSetup({ oaiKey: value })}
@@ -2,6 +2,7 @@ import * as React from 'react';

import { Typography } from '@mui/joy';

import { AlreadySet } from '~/common/components/AlreadySet';
import { FormInputKey } from '~/common/components/forms/FormInputKey';
import { InlineError } from '~/common/components/InlineError';
import { Link } from '~/common/components/Link';
@@ -42,7 +43,7 @@ export function PerplexitySourceSetup(props: { sourceId: DModelSourceId }) {
autoCompleteId='perplexity-key' label='Perplexity API Key'
rightLabel={<>{needsUserKey
? !perplexityKey && <Link level='body-sm' href={PERPLEXITY_REG_LINK} target='_blank'>API keys</Link>
: '✔️ already set in server'}
: <AlreadySet />}
</>}
value={perplexityKey} onChange={value => updateSetup({ perplexityKey: value })}
required={needsUserKey} isError={showKeyError}
@@ -2,6 +2,7 @@ import * as React from 'react';

import { Alert, Typography } from '@mui/joy';

import { AlreadySet } from '~/common/components/AlreadySet';
import { FormInputKey } from '~/common/components/forms/FormInputKey';
import { FormSwitchControl } from '~/common/components/forms/FormSwitchControl';
import { InlineError } from '~/common/components/InlineError';
@@ -48,7 +49,7 @@ export function TogetherAISourceSetup(props: { sourceId: DModelSourceId }) {
autoCompleteId='togetherai-key' label='Together AI Key'
rightLabel={<>{needsUserKey
? !togetherKey && <Link level='body-sm' href={TOGETHERAI_REG_LINK} target='_blank'>request Key</Link>
: '✔️ already set in server'}
: <AlreadySet />}
</>}
value={togetherKey} onChange={value => updateSetup({ togetherKey: value })}
required={needsUserKey} isError={showKeyError}
@@ -3,7 +3,7 @@ import { frontendSideFetch } from '~/common/util/clientFetchers';

import type { ChatStreamingInputSchema, ChatStreamingPreambleModelSchema, ChatStreamingPreambleStartSchema } from '../server/llm.server.streaming';
import type { DLLMId } from '../store-llms';
import type { VChatFunctionIn, VChatMessageIn } from '../llm.client';
import type { VChatContextRef, VChatFunctionIn, VChatMessageIn, VChatStreamContextName } from '../llm.client';

import type { OpenAIAccessSchema } from '../server/openai/openai.router';
import type { OpenAIWire } from '../server/openai/openai.wiretypes';
@@ -29,6 +29,7 @@ export async function unifiedStreamingClient<TSourceSetup = unknown, TLLMOptions
llmId: DLLMId,
llmOptions: TLLMOptions,
messages: VChatMessageIn[],
contextName: VChatStreamContextName, contextRef: VChatContextRef,
functions: VChatFunctionIn[] | null, forceFunctionName: string | null,
abortSignal: AbortSignal,
onUpdate: (update: StreamingClientUpdate, done: boolean) => void,
@@ -55,6 +56,11 @@ export async function unifiedStreamingClient<TSourceSetup = unknown, TLLMOptions
...(llmResponseTokens ? { maxTokens: llmResponseTokens } : {}),
},
history: messages,
context: {
method: 'chat-stream',
name: contextName, // this errors if the client VChatContextName mismatches the server z.enum
ref: contextRef,
},
};

// connect to the server-side streaming endpoint
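The inline comment above points at server-side validation: the streaming endpoint parses the request body, so a context name outside the server's enum fails before any model call. A hypothetical zod schema illustrating that failure mode (the enum members are assumptions):

```typescript
import { z } from 'zod';

// If the client sends a name outside this enum, parsing throws,
// which is the mismatch the comment warns about.
const contextSchema = z.object({
  method: z.enum(['chat-stream', 'chat-generate']),
  name: z.enum(['conversation', 'beam-scatter']), // assumed members
  ref: z.string(),
});
```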
@@ -10,6 +10,7 @@ import StayPrimaryPortraitIcon from '@mui/icons-material/StayPrimaryPortrait';

import { getBackendCapabilities } from '~/modules/backend/store-backend-capabilities';

import { AlreadySet } from '~/common/components/AlreadySet';
import { FormInputKey } from '~/common/components/forms/FormInputKey';
import { FormLabelStart } from '~/common/components/forms/FormLabelStart';
import { FormRadioControl } from '~/common/components/forms/FormRadioControl';
@@ -80,7 +81,7 @@ export function ProdiaSettings(props: { noSkipKey?: boolean }) {

{!backendHasProdia && !!props.noSkipKey && <FormInputKey
autoCompleteId='prodia-key' label='Prodia API Key'
rightLabel={backendHasProdia ? '✔️ already set in server' : 'required'}
rightLabel={<AlreadySet required={!backendHasProdia} />}
value={apiKey} onChange={setApiKey}
required={!backendHasProdia} isError={!isValidKey}
/>}
@@ -1,7 +1,7 @@
{
"functions": {
"app/api/trpc-node/**/*": {
"maxDuration": 25
"api/trpc-node/**/*": {
"maxDuration": 30
}
}
}
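Two changes in one hunk: the function glob drops the `app/` prefix and the per-invocation timeout rises from 25 to 30 seconds. Since the hunk covers the whole 7-line file, the resulting `vercel.json` can be reconstructed exactly:

```json
{
  "functions": {
    "api/trpc-node/**/*": {
      "maxDuration": 30
    }
  }
}
```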