Mirror of https://github.com/enricoros/big-AGI.git (synced 2026-05-10 21:50:14 -07:00)

Compare commits · 56 commits
| SHA1 |
|---|
| 0fc83cf6f5 |
| 2949feccd5 |
| d6f1c2da81 |
| fabb433fde |
| b57445eb14 |
| 5f8f4aba78 |
| d693cdaeba |
| 39fbcfd97b |
| 7694bc3d52 |
| 7f21b2ac3d |
| fdb66da1a7 |
| 6b62a6733b |
| 5d62056807 |
| efff7126af |
| 45046c70ed |
| 7b5b852793 |
| 9952b757b8 |
| b08ecc9012 |
| bc5a38fa89 |
| bee49a4b1c |
| 0ece1ce58c |
| fd897b55b2 |
| dd41a402d0 |
| 3f9defd18c |
| 49c77f5a10 |
| 6b2bfa6060 |
| 8e3f247bfb |
| 201e3a7252 |
| 044ed4df79 |
| 0df7297cca |
| 453a3e5751 |
| 34c1c425b9 |
| e0a010189f |
| 7a07f10ed1 |
| 33cb2b84b2 |
| 3adec85e1f |
| 18cfe5e296 |
| 566ba366b4 |
| 7ed653b315 |
| cb333c33d7 |
| 22ba37074b |
| 84d7b7644a |
| 71445dafc8 |
| 66a5ad7f00 |
| 09f80adfaa |
| 9febd97065 |
| 5219f9928d |
| aec9f4665f |
| db48465204 |
| c2c858730a |
| 402bde9a81 |
| ba1c0ba0d9 |
| 084d77cd78 |
| 30c17a9b73 |
| 2442463da3 |
| 84a3e8cfdb |
@@ -65,7 +65,11 @@ I need the following from you:

### GitHub release

-Now paste the former release (or 1.5.0 which was accurate and great), including the new contributors and
+```markdown
+Please create the 1.2.3 Release Notes for GitHub. The following were the Release Notes for 1.1.0. Use a truthful and honest tone, understanding that people's time and attention span is short. Today is 2023-12-20.
+```
+
+Now paste-attachment the former release notes (or 1.5.0 which was accurate and great), including the new contributors and
some stats (# of commits, etc.), and roll it for the new release.

### Discord announcement
@@ -1,7 +1,7 @@

# BIG-AGI 🧠✨

Welcome to big-AGI 👋, the GPT application for professionals that need function, form,
-simplicity, and speed. Powered by the latest models from 7 vendors and
+simplicity, and speed. Powered by the latest models from 8 vendors and
open-source model servers, `big-AGI` offers best-in-class Voice and Chat with AI Personas,
visualizations, coding, drawing, calling, and quite more -- all in a polished UX.

@@ -11,7 +11,7 @@ Pros use big-AGI. 🚀 Developers love big-AGI. 🤖

Or fork & run on Vercel

-[](https://vercel.com/new/clone?repository-url=https%3A%2F%2Fgithub.com%2Fenricoros%2Fbig-agi&env=OPENAI_API_KEY,OPENAI_API_HOST&envDescription=OpenAI%20KEY%20for%20your%20deployment.%20Set%20HOST%20only%20if%20non-default.)
+[](https://vercel.com/new/clone?repository-url=https%3A%2F%2Fgithub.com%2Fenricoros%2Fbig-agi&env=OPENAI_API_KEY&envDescription=Backend%20API%20keys%2C%20optional%20and%20may%20be%20overridden%20by%20the%20UI.&envLink=https%3A%2F%2Fgithub.com%2Fenricoros%2Fbig-AGI%2Fblob%2Fmain%2Fdocs%2Fenvironment-variables.md&project-name=big-agi)

## 👉 [roadmap](https://github.com/users/enricoros/projects/4/views/2)

@@ -21,7 +21,19 @@ shows the current developments and future ideas.

- Got a suggestion? [_Add your roadmap ideas_](https://github.com/enricoros/big-agi/issues/new?&template=roadmap-request.md)
- Want to contribute? [_Pick up a task!_](https://github.com/users/enricoros/projects/4/views/4) - _easy_ to _pro_

-### What's New in 1.7.3 · Dec 13, 2023 · Attachment Theory 🌟
+### What's New in 1.8.0 · Dec 20, 2023 · To The Moon And Back · 🚀🌕🔙
+
+- **Google Gemini Support**: Use the newest Google models. [#275](https://github.com/enricoros/big-agi/issues/275)
+- **Mistral Platform**: Mixtral and future models support. [#273](https://github.com/enricoros/big-agi/issues/273)
+- **Diagram Instructions**. Thanks to @joriskalz! [#280](https://github.com/enricoros/big-agi/pull/280)
+- Ollama Chats: Enhanced chatting experience. [#270](https://github.com/enricoros/big-agi/issues/270)
+- Mac Shortcuts Fix: Improved UX on Mac
+- **Single-Tab Mode**: Data integrity with single window. [#268](https://github.com/enricoros/big-agi/issues/268)
+- **Updated Models**: Latest Ollama (v0.1.17) and OpenRouter models
+- Official Downloads: Easy access to the latest big-AGI on [big-AGI.com](https://big-agi.com)
+- For developers: [troubleshot networking](https://github.com/enricoros/big-AGI/issues/276#issuecomment-1858591483), fixed Vercel deployment, cleaned up the LLMs/Streaming framework

### What's New in 1.7.0 · Dec 11, 2023

- **Attachments System Overhaul**: Drag, paste, link, snap, text, images, PDFs and more. [#251](https://github.com/enricoros/big-agi/issues/251)
- **Desktop Webcam Capture**: Image capture now available as Labs feature. [#253](https://github.com/enricoros/big-agi/issues/253)

@@ -31,9 +43,6 @@ shows the current developments and future ideas.

- Optimized Voice Input and Performance
- Latest Ollama and Oobabooga models
- For developers: **Password Protection**: HTTP Basic Auth. [Learn How](https://github.com/enricoros/big-agi/blob/main/docs/deploy-authentication.md)
-- [1.7.1]: Improved Ollama chats. [#270](https://github.com/enricoros/big-agi/issues/270)
-- [1.7.2]: OpenRouter login & free models 🎁
-- [1.7.3]: Mistral Platform support. [#273](https://github.com/enricoros/big-agi/issues/273)

### What's New in 1.6.0 - Nov 28, 2023

@@ -148,7 +157,7 @@ Please refer to the [Cloudflare deployment documentation](docs/deploy-cloudflare

Create your GitHub fork, create a Vercel project over that fork, and deploy it. Or press the button below for convenience.

-[](https://vercel.com/new/clone?repository-url=https%3A%2F%2Fgithub.com%2Fenricoros%2Fbig-agi&env=OPENAI_API_KEY,OPENAI_API_HOST&envDescription=OpenAI%20KEY%20for%20your%20deployment.%20Set%20HOST%20only%20if%20non-default.)
+[](https://vercel.com/new/clone?repository-url=https%3A%2F%2Fgithub.com%2Fenricoros%2Fbig-agi&env=OPENAI_API_KEY&envDescription=Backend%20API%20keys%2C%20optional%20and%20may%20be%20overridden%20by%20the%20UI.&envLink=https%3A%2F%2Fgithub.com%2Fenricoros%2Fbig-AGI%2Fblob%2Fmain%2Fdocs%2Fenvironment-variables.md&project-name=big-agi)

## Integrations:
@@ -1,2 +1,2 @@
export const runtime = 'edge';
-export { openaiStreamingRelayHandler as POST } from '~/modules/llms/transports/server/openai/openai.streaming';
+export { llmStreamingRelayHandler as POST } from '~/modules/llms/server/llm.server.streaming';

(+1 -1)

@@ -6,7 +6,7 @@ version: '3.9'

services:
  big-agi:
-    image: ghcr.io/enricoros/big-agi:main
+    image: ghcr.io/enricoros/big-agi:latest
    ports:
      - "3000:3000"
    env_file:

(+15 -6)
@@ -5,12 +5,24 @@ by release.

- For the live roadmap, please see [the GitHub project](https://github.com/users/enricoros/projects/4/views/2)

-### 1.8.0 - Dec 2023
+### 1.9.0 - Dec 2023

- work in progress: [big-AGI open roadmap](https://github.com/users/enricoros/projects/4/views/2), [help here](https://github.com/users/enricoros/projects/4/views/4)
-- milestone: [1.8.0](https://github.com/enricoros/big-agi/milestone/8)
+- milestone: [1.9.0](https://github.com/enricoros/big-agi/milestone/9)

-### What's New in 1.7.3 · Dec 13, 2023 · Attachment Theory 🌟
+### What's New in 1.8.0 · Dec 20, 2023 · To The Moon And Back · 🚀🌕🔙
+
+- **Google Gemini Support**: Use the newest Google models. [#275](https://github.com/enricoros/big-agi/issues/275)
+- **Mistral Platform**: Mixtral and future models support. [#273](https://github.com/enricoros/big-agi/issues/273)
+- **Diagram Instructions**. Thanks to @joriskalz! [#280](https://github.com/enricoros/big-agi/pull/280)
+- Ollama Chats: Enhanced chatting experience. [#270](https://github.com/enricoros/big-agi/issues/270)
+- Mac Shortcuts Fix: Improved UX on Mac
+- **Single-Tab Mode**: Data integrity with single window. [#268](https://github.com/enricoros/big-agi/issues/268)
+- **Updated Models**: Latest Ollama (v0.1.17) and OpenRouter models
+- Official Downloads: Easy access to the latest big-AGI on [big-AGI.com](https://big-agi.com)
+- For developers: [troubleshot networking](https://github.com/enricoros/big-AGI/issues/276#issuecomment-1858591483), fixed Vercel deployment, cleaned up the LLMs/Streaming framework

### What's New in 1.7.0 · Dec 11, 2023 · Attachment Theory

- **Attachments System Overhaul**: Drag, paste, link, snap, text, images, PDFs and more. [#251](https://github.com/enricoros/big-agi/issues/251)
- **Desktop Webcam Capture**: Image capture now available as Labs feature. [#253](https://github.com/enricoros/big-agi/issues/253)

@@ -20,9 +32,6 @@ by release.

- Optimized Voice Input and Performance
- Latest Ollama and Oobabooga models
- For developers: **Password Protection**: HTTP Basic Auth. [Learn How](https://github.com/enricoros/big-agi/blob/main/docs/deploy-authentication.md)
-- [1.7.1]: Improved Ollama chats. [#270](https://github.com/enricoros/big-agi/issues/270)
-- [1.7.2]: OpenRouter login & free models 🎁
-- [1.7.3]: Mistral Platform support. [#273](https://github.com/enricoros/big-agi/issues/273)

### What's New in 1.6.0 - Nov 28, 2023 · Surf's Up

@@ -30,5 +30,5 @@ For instance with [Use luna-ai-llama2 with docker compose](https://localai.io/ba

> NOTE: LocalAI does not list details about the models. Every model is assumed to be
> capable of chatting, and with a context window of 4096 tokens.
-> Please update the [src/modules/llms/transports/server/openai/models.data.ts](../src/modules/llms/transports/server/openai/models.data.ts)
+> Please update the [src/modules/llms/transports/server/openai/models.data.ts](../src/modules/llms/server/openai/models.data.ts)
> file with the mapping information between LocalAI model IDs and names/descriptions/tokens, etc.
(+24 -12)

@@ -5,13 +5,15 @@ This guide helps you connect [Ollama](https://ollama.ai) [models](https://ollama
experience. The integration brings the popular big-AGI features to Ollama, including: voice chats,
editing tools, models switching, personas, and more.

-_Last updated Dec 11, 2023_
+_Last updated Dec 16, 2023_



## Quick Integration Guide

1. **Ensure Ollama API Server is Running**: Follow the official instructions to get Ollama up and running on your machine
+   - For detailed instructions on setting up the Ollama API server, please refer to the
+     [Ollama download page](https://ollama.ai/download) and [instructions for linux](https://github.com/jmorganca/ollama/blob/main/docs/linux.md).
2. **Add Ollama as a Model Source**: In `big-AGI`, navigate to the **Models** section, select **Add a model source**, and choose **Ollama**
3. **Enter Ollama Host URL**: Provide the Ollama Host URL where the API server is accessible (e.g., `http://localhost:11434`)
4. **Refresh Model List**: Once connected, refresh the list of available models to include the Ollama models

@@ -20,21 +22,29 @@ _Last updated Dec 11, 2023_

you'll have to press the 'Pull' button again, until a green message appears.
5. **Chat with Ollama models**: select an Ollama model and begin chatting with AI personas
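Note for step 5: at least one model must already be present on the Ollama server for it to appear in the list. A minimal sketch of preparing a server from the shell (assuming a local Ollama install; `llama2` is simply an example tag from the Ollama library):

```bash
# start the Ollama API server, if it is not already running (default port 11434)
ollama serve &

# download a model; it appears in big-AGI after refreshing the model list
ollama pull llama2

# sanity check: prints "Ollama is running"
curl http://localhost:11434
```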
-### Ollama: installation and Setup
+**Visual Configuration Guide**:

-For detailed instructions on setting up the Ollama API server, please refer to the
-[Ollama download page](https://ollama.ai/download) and [instructions for linux](https://github.com/jmorganca/ollama/blob/main/docs/linux.md).
+* After adding the `Ollama` model vendor, entering the IP address of an Ollama server, and refreshing models:<br/>
+  <img src="pixels/config-ollama-1-models.png" alt="config-local-ollama-1-models.png" width="320">

-### Visual Guide
+* The `Ollama` admin panel, with the `Pull` button highlighted, after pulling the "Yi" model:<br/>
+  <img src="pixels/config-ollama-2-admin-pull.png" alt="config-local-ollama-2-admin-pull.png" width="320">

-* After adding the `Ollama` model vendor, entering the IP address of an Ollama server, and refreshing models:
-  <img src="pixels/config-ollama-1-models.png" alt="config-local-ollama-1-models.png" style="max-width: 320px;">
+* You can now switch model/persona dynamically and text/voice chat with the models:<br/>
+  <img src="pixels/config-ollama-3-chat.png" alt="config-local-ollama-3-chat.png" width="320">

-* The `Ollama` admin panel, with the `Pull` button highlighted, after pulling the "Yi" model:
-  <img src="pixels/config-ollama-2-admin-pull.png" alt="config-local-ollama-2-admin-pull.png" style="max-width: 320px;">
+<br/>

-* You can now switch model/persona dynamically and text/voice chat with the models:
-  <img src="pixels/config-ollama-3-chat.png" alt="config-local-ollama-3-chat.png" style="max-width: 320px;">
+### ⚠️ Network Troubleshooting

+If you get errors about the server having trouble connecting with Ollama, please see
+[this message](https://github.com/enricoros/big-AGI/issues/276#issuecomment-1858591483) on Issue #276.

+And in brief, make sure the Ollama endpoint is accessible from the servers where you run big-AGI (which could
+be localhost or cloud servers).
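A quick way to confirm that accessibility is to query the Ollama API from the machine that serves big-AGI (a sketch; replace `localhost` with the Ollama host's address if it runs elsewhere):

```bash
# lists the models the endpoint exposes; if this fails, big-AGI's server
# cannot reach Ollama either (binding, firewall, or wrong host/port)
curl http://localhost:11434/api/tags
```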


<br/>

### Advanced: Model parameters

@@ -73,6 +83,8 @@ Then, edit the nginx configuration file `/etc/nginx/sites-enabled/default` and a

Reach out to our community if you need help with this.

+<br/>
+
### Community and Support

Join our community to share your experiences, get help, and discuss best practices:

@@ -83,4 +95,4 @@ Join our community to share your experiences, get help, and discuss best practic

---

`big-AGI` is committed to providing a powerful, intuitive, and privacy-respecting AI experience.
-We are excited for you to explore the possibilities with Ollama models. Happy creating!
+We are excited for you to explore the possibilities with Ollama models. Happy creating!
(+37 -20)

@@ -21,33 +21,23 @@ Docker ensures faster development cycles, easier collaboration, and seamless env
```
4. Browse to [http://localhost:3000](http://localhost:3000)

-## Documentation
+<br/>

-The big-AGI repository includes a Dockerfile and a GitHub Actions workflow for building and publishing a
-Docker image of the application.
+## Run Official Containers 📦

-### Dockerfile
+`big-AGI` is pre-built from source code and published as a Docker image on the GitHub Container Registry (ghcr).
+The build process is transparent, and happens via GitHub Actions, as described in the
+[`.github/workflows/docker-image.yml`](../.github/workflows/docker-image.yml) file.

-The [`Dockerfile`](../Dockerfile) describes how to create a Docker image. It establishes a Node.js environment,
-installs dependencies, and creates a production-ready version of the application as a local container.
+### Official Images: [ghcr.io/enricoros/big-agi](https://github.com/enricoros/big-agi/pkgs/container/big-agi)

-### Official container images
+#### Run using *docker* 🚀

-The [`.github/workflows/docker-image.yml`](../.github/workflows/docker-image.yml) file automates the
-building and publishing of the Docker images to the GitHub Container Registry (ghcr) when changes are
-pushed to the `main` branch.

-Official pre-built containers: [ghcr.io/enricoros/big-agi](https://github.com/enricoros/big-agi/pkgs/container/big-agi)

-Run official pre-built containers:

```bash
-docker run -d -p 3000:3000 ghcr.io/enricoros/big-agi
+docker run -d -p 3000:3000 ghcr.io/enricoros/big-agi:latest
```

-### Run official containers

-In addition, the repository also includes a `docker-compose.yaml` file, configured to run the pre-built
-'ghcr image'. This file is used to define the `big-agi` service, the ports to expose, and the command to run.
+#### Run using *docker-compose* 🚀

If you have Docker Compose installed, you can run the Docker container with `docker-compose up`
to pull the Docker image (if it hasn't been pulled already) and start a Docker container. If you want to

@@ -57,4 +47,31 @@ update the image to the latest version, you can run `docker-compose pull` before

```bash
docker-compose up -d
```
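Spelled out, the update cycle described above is just two commands, run from the directory containing the repository's `docker-compose.yaml`:

```bash
docker-compose pull   # fetch the newest ghcr.io/enricoros/big-agi:latest image
docker-compose up -d  # recreate the container on the updated image
```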

-Leverage Docker's capabilities for a reliable and efficient big-AGI deployment.
+### Make Local Services Visible to Docker 🌐

+To make local services running on your host machine accessible to a Docker container, such as a
+[Browseless](./config-browse.md) service or a local API, you can follow this simplified guide:

+| Operating System | Steps to Make Local Services Visible to Docker |
+|:---|:---|
+| Windows and macOS | Use the special DNS name `host.docker.internal` to refer to the host machine from within the Docker container. No additional network configuration is required. Access local services using `host.docker.internal:<PORT>`. |
+| Linux | Two options: *A*. Use <ins>--network="host"</ins> (`docker run --network="host" -d big-agi`) when running the Docker container to merge the container within the host network stack; however, this reduces container isolation. Alternatively: *B*. Connect to local services <ins>using the host's IP address</ins> directly, as host.docker.internal is not available by default on Linux. |
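On Linux, a third commonly used route (assuming Docker 20.10 or newer; not covered by the table above) is to map `host.docker.internal` to the host gateway explicitly, which keeps container isolation intact:

```bash
# makes host.docker.internal resolve inside the container, without --network="host"
docker run --add-host=host.docker.internal:host-gateway \
  -d -p 3000:3000 ghcr.io/enricoros/big-agi:latest
```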

+<br/>

+### More Information

+The [`Dockerfile`](../Dockerfile) describes how to create a Docker image. It establishes a Node.js environment,
+installs dependencies, and creates a production-ready version of the application as a local container.

+The [`docker-compose.yaml`](../docker-compose.yaml) file is configured to run the
+official image (big-agi:latest). This file is used to define the `big-agi` service, to expose
+port 3000 on the host, and launch big-AGI within the container (startup command).

+The [`.github/workflows/docker-image.yml`](../.github/workflows/docker-image.yml) file is used
+to build the Official Docker images and publish them to the GitHub Container Registry (ghcr).
+The build process is transparent and happens via GitHub Actions.

+<br/>

+Leverage Docker's capabilities for a reliable and efficient big-AGI deployment!
@@ -12,7 +12,7 @@ version: '3.9'

services:
  big-agi:
-    image: ghcr.io/enricoros/big-agi:main
+    image: ghcr.io/enricoros/big-agi:latest
    ports:
      - "3000:3000"
    env_file:
@@ -24,6 +24,7 @@ AZURE_OPENAI_API_ENDPOINT=
AZURE_OPENAI_API_KEY=
ANTHROPIC_API_KEY=
ANTHROPIC_API_HOST=
+GEMINI_API_KEY=
MISTRAL_API_KEY=
OLLAMA_API_HOST=
OPENROUTER_API_KEY=

@@ -46,7 +47,7 @@ PUPPETEER_WSS_ENDPOINT=

# Backend Analytics
BACKEND_ANALYTICS=

-# Backend HTTP Basic Authentication
+# Backend HTTP Basic Authentication (see `deploy-authentication.md` for turning on authentication)
HTTP_BASIC_AUTH_USERNAME=
HTTP_BASIC_AUTH_PASSWORD=
```
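In a containerized deployment, these server-side variables are typically handed to the official image at startup; a sketch, assuming the values above are saved in a local `.env` file:

```bash
# the backend reads its configuration from the environment at boot
docker run --env-file .env -d -p 3000:3000 ghcr.io/enricoros/big-agi:latest
```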

@@ -80,6 +81,7 @@ requiring the user to enter an API key

| `AZURE_OPENAI_API_KEY` | Azure OpenAI API key, see [config-azure-openai.md](config-azure-openai.md) | Optional, but if set `AZURE_OPENAI_API_ENDPOINT` must also be set |
| `ANTHROPIC_API_KEY` | The API key for Anthropic | Optional |
| `ANTHROPIC_API_HOST` | Changes the backend host for the Anthropic vendor, to enable platforms such as [config-aws-bedrock.md](config-aws-bedrock.md) | Optional |
+| `GEMINI_API_KEY` | The API key for Google AI's Gemini | Optional |
| `MISTRAL_API_KEY` | The API key for Mistral | Optional |
| `OLLAMA_API_HOST` | Changes the backend host for the Ollama vendor. See [config-ollama.md](config-ollama.md) | |
| `OPENROUTER_API_KEY` | The API key for OpenRouter | Optional |

@@ -115,10 +117,7 @@ Enable the app to Talk, Draw, and Google things up.

| `PUPPETEER_WSS_ENDPOINT` | Puppeteer WebSocket endpoint - used for browsing, etc. |
| **Backend** | |
| `BACKEND_ANALYTICS` | Semicolon-separated list of analytics flags (see backend.analytics.ts). Flags: `domain` logs the responding domain. |
-| `HTTP_BASIC_AUTH_USERNAME` | Username for HTTP Basic Authentication. See the [Authentication](deploy-authentication.md) guide. |
+| `HTTP_BASIC_AUTH_USERNAME` | See the [Authentication](deploy-authentication.md) guide. Username for HTTP Basic Authentication. |
| `HTTP_BASIC_AUTH_PASSWORD` | Password for HTTP Basic Authentication. |
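Once both variables are set, every request to the deployment must carry matching credentials; a sketch of the effect, with `myuser`, `mypass`, and the host name as placeholder values:

```bash
# without credentials: 401 Unauthorized
curl -i https://my-big-agi.example.com/

# with matching HTTP Basic Auth credentials: 200 OK
curl -i -u myuser:mypass https://my-big-agi.example.com/
```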

---
Binary file not shown. (After: 79 KiB)

Generated file: (+72 -68)
@@ -1,12 +1,12 @@
{
  "name": "big-agi",
-  "version": "1.7.3",
+  "version": "1.8.0",
  "lockfileVersion": 3,
  "requires": true,
  "packages": {
    "": {
      "name": "big-agi",
-      "version": "1.7.3",
+      "version": "1.8.0",
      "hasInstallScript": true,
      "dependencies": {
        "@dqbd/tiktoken": "^1.0.7",

@@ -14,13 +14,13 @@
        "@emotion/react": "^11.11.1",
        "@emotion/server": "^11.11.0",
        "@emotion/styled": "^11.11.0",
-        "@mui/icons-material": "^5.14.19",
-        "@mui/joy": "^5.0.0-beta.17",
+        "@mui/icons-material": "^5.15.0",
+        "@mui/joy": "^5.0.0-beta.18",
        "@next/bundle-analyzer": "^14.0.4",
        "@prisma/client": "^5.7.0",
        "@sanity/diff-match-patch": "^3.1.1",
        "@t3-oss/env-nextjs": "^0.7.1",
-        "@tanstack/react-query": "^4.36.1",
+        "@tanstack/react-query": "~4.36.1",
        "@trpc/client": "^10.44.1",
        "@trpc/next": "^10.44.1",
        "@trpc/react-query": "^10.44.1",

@@ -43,14 +43,14 @@
        "tesseract.js": "^5.0.3",
        "uuid": "^9.0.1",
        "zod": "^3.22.4",
-        "zustand": "~4.3.9"
+        "zustand": "^4.4.7"
      },
      "devDependencies": {
        "@cloudflare/puppeteer": "^0.0.5",
        "@types/node": "^20.10.4",
        "@types/plantuml-encoder": "^1.4.2",
        "@types/prismjs": "^1.26.3",
-        "@types/react": "^18.2.43",
+        "@types/react": "^18.2.45",
        "@types/react-dom": "^18.2.17",
        "@types/react-katex": "^3.0.4",
        "@types/react-timeago": "^4.1.6",

@@ -596,14 +596,14 @@
      }
    },
    "node_modules/@mui/base": {
-      "version": "5.0.0-beta.26",
-      "resolved": "https://registry.npmjs.org/@mui/base/-/base-5.0.0-beta.26.tgz",
-      "integrity": "sha512-gPMRKC84VRw+tjqYoyBzyrBUqHQucMXdlBpYazHa5rCXrb91fYEQk5SqQ2U5kjxx9QxZxTBvWAmZ6DblIgaGhQ==",
+      "version": "5.0.0-beta.27",
+      "resolved": "https://registry.npmjs.org/@mui/base/-/base-5.0.0-beta.27.tgz",
+      "integrity": "sha512-duL37qxihT1N0pW/gyXVezP7SttLkF+cLAs/y6g6ubEFmVadjbnZ45SeF12/vAiKzqwf5M0uFH1cczIPXFZygA==",
      "dependencies": {
-        "@babel/runtime": "^7.23.4",
+        "@babel/runtime": "^7.23.5",
        "@floating-ui/react-dom": "^2.0.4",
-        "@mui/types": "^7.2.10",
-        "@mui/utils": "^5.14.20",
+        "@mui/types": "^7.2.11",
+        "@mui/utils": "^5.15.0",
        "@popperjs/core": "^2.11.8",
        "clsx": "^2.0.0",
        "prop-types": "^15.8.1"

@@ -627,20 +627,20 @@
      }
    },
    "node_modules/@mui/core-downloads-tracker": {
-      "version": "5.14.20",
-      "resolved": "https://registry.npmjs.org/@mui/core-downloads-tracker/-/core-downloads-tracker-5.14.20.tgz",
-      "integrity": "sha512-fXoGe8VOrIYajqALysFuyal1q1YmBARqJ3tmnWYDVl0scu8f6h6tZQbS2K8BY28QwkWNGyv4WRfuUkzN5HR3Ow==",
+      "version": "5.15.0",
+      "resolved": "https://registry.npmjs.org/@mui/core-downloads-tracker/-/core-downloads-tracker-5.15.0.tgz",
+      "integrity": "sha512-NpGtlHwuyLfJtdrlERXb8qRqd279O0VnuGaZAor1ehdNhUJOD1bSxHDeXKZkbqNpvi50hasFj7lsbTpluworTQ==",
      "funding": {
        "type": "opencollective",
        "url": "https://opencollective.com/mui-org"
      }
    },
    "node_modules/@mui/icons-material": {
-      "version": "5.14.19",
-      "resolved": "https://registry.npmjs.org/@mui/icons-material/-/icons-material-5.14.19.tgz",
-      "integrity": "sha512-yjP8nluXxZGe3Y7pS+yxBV+hWZSsSBampCxkZwaw+1l+feL+rfP74vbEFbMrX/Kil9I/Y1tWfy5bs/eNvwNpWw==",
+      "version": "5.15.0",
+      "resolved": "https://registry.npmjs.org/@mui/icons-material/-/icons-material-5.15.0.tgz",
+      "integrity": "sha512-zHY6fOkaK7VfhWeyxO8MjO3IAjEYpYMXuqUhX7TkUZJ9+TSH/9dn4ClG4K2j6hdgBU5Yrq2Z/89Bo6BHHp7AdQ==",
      "dependencies": {
-        "@babel/runtime": "^7.23.4"
+        "@babel/runtime": "^7.23.5"
      },
      "engines": {
        "node": ">=12.0.0"

@@ -661,16 +661,16 @@
      }
    },
    "node_modules/@mui/joy": {
-      "version": "5.0.0-beta.17",
-      "resolved": "https://registry.npmjs.org/@mui/joy/-/joy-5.0.0-beta.17.tgz",
-      "integrity": "sha512-KQMfQe7P98jRYWcjTxLRnjAlWre0YGvZstpE+xNJyOn6aTnMomnAskMIG0s2+k5PcluyxTEZZKZZ0Usl3M5D6g==",
+      "version": "5.0.0-beta.18",
+      "resolved": "https://registry.npmjs.org/@mui/joy/-/joy-5.0.0-beta.18.tgz",
+      "integrity": "sha512-TxEo7kqEnbjB5S8cyFrytWjzhxW12UxkEJOT0QM8WpwaBN3Ie1okFuo2bnFW94vYFZperW97/H/08cqqS/2JPA==",
      "dependencies": {
-        "@babel/runtime": "^7.23.4",
-        "@mui/base": "5.0.0-beta.26",
-        "@mui/core-downloads-tracker": "^5.14.20",
-        "@mui/system": "^5.14.20",
-        "@mui/types": "^7.2.10",
-        "@mui/utils": "^5.14.20",
+        "@babel/runtime": "^7.23.5",
+        "@mui/base": "5.0.0-beta.27",
+        "@mui/core-downloads-tracker": "^5.15.0",
+        "@mui/system": "^5.15.0",
+        "@mui/types": "^7.2.11",
+        "@mui/utils": "^5.15.0",
        "clsx": "^2.0.0",
        "prop-types": "^15.8.1"
      },

@@ -701,17 +701,17 @@
      }
    },
    "node_modules/@mui/material": {
-      "version": "5.14.20",
-      "resolved": "https://registry.npmjs.org/@mui/material/-/material-5.14.20.tgz",
-      "integrity": "sha512-SUcPZnN6e0h1AtrDktEl76Dsyo/7pyEUQ+SAVe9XhHg/iliA0b4Vo+Eg4HbNkELsMbpDsUF4WHp7rgflPG7qYQ==",
+      "version": "5.15.0",
+      "resolved": "https://registry.npmjs.org/@mui/material/-/material-5.15.0.tgz",
+      "integrity": "sha512-60CDI/hQNwJv9a3vEZtFG7zz0USdQhVwpBd3fZqrzhuXSdiMdYMaZcCXeX/KMuNq0ZxQEAZd74Pv+gOb408QVA==",
      "peer": true,
      "dependencies": {
-        "@babel/runtime": "^7.23.4",
-        "@mui/base": "5.0.0-beta.26",
-        "@mui/core-downloads-tracker": "^5.14.20",
-        "@mui/system": "^5.14.20",
-        "@mui/types": "^7.2.10",
-        "@mui/utils": "^5.14.20",
+        "@babel/runtime": "^7.23.5",
+        "@mui/base": "5.0.0-beta.27",
+        "@mui/core-downloads-tracker": "^5.15.0",
+        "@mui/system": "^5.15.0",
+        "@mui/types": "^7.2.11",
+        "@mui/utils": "^5.15.0",
        "@types/react-transition-group": "^4.4.9",
        "clsx": "^2.0.0",
        "csstype": "^3.1.2",

@@ -746,12 +746,12 @@
      }
    },
    "node_modules/@mui/private-theming": {
-      "version": "5.14.20",
-      "resolved": "https://registry.npmjs.org/@mui/private-theming/-/private-theming-5.14.20.tgz",
-      "integrity": "sha512-WV560e1vhs2IHCh0pgUaWHznrcrVoW9+cDCahU1VTkuwPokWVvb71ccWQ1f8Y3tRBPPcNkU2dChkkRJChLmQlQ==",
+      "version": "5.15.0",
+      "resolved": "https://registry.npmjs.org/@mui/private-theming/-/private-theming-5.15.0.tgz",
+      "integrity": "sha512-7WxtIhXxNek0JjtsYy+ut2LtFSLpsUW5JSDehQO+jF7itJ8ehy7Bd9bSt2yIllbwGjCFowLfYpPk2Ykgvqm1tA==",
      "dependencies": {
-        "@babel/runtime": "^7.23.4",
-        "@mui/utils": "^5.14.20",
+        "@babel/runtime": "^7.23.5",
+        "@mui/utils": "^5.15.0",
        "prop-types": "^15.8.1"
      },
      "engines": {

@@ -772,11 +772,11 @@
      }
    },
    "node_modules/@mui/styled-engine": {
-      "version": "5.14.20",
-      "resolved": "https://registry.npmjs.org/@mui/styled-engine/-/styled-engine-5.14.20.tgz",
-      "integrity": "sha512-Vs4nGptd9wRslo9zeRkuWcZeIEp+oYbODy+fiZKqqr4CH1Gfi9fdP0Q1tGYk8OiJ2EPB/tZSAyOy62Hyp/iP7g==",
+      "version": "5.15.0",
+      "resolved": "https://registry.npmjs.org/@mui/styled-engine/-/styled-engine-5.15.0.tgz",
+      "integrity": "sha512-6NysIsHkuUS2lF+Lzv1jiK3UjBJk854/vKVcJQVGKlPiqNEVZJNlwaSpsaU5xYXxWEZYfbVFSAomLOS/LV/ovQ==",
      "dependencies": {
-        "@babel/runtime": "^7.23.4",
+        "@babel/runtime": "^7.23.5",
        "@emotion/cache": "^11.11.0",
        "csstype": "^3.1.2",
        "prop-types": "^15.8.1"

@@ -803,15 +803,15 @@
      }
    },
    "node_modules/@mui/system": {
-      "version": "5.14.20",
-      "resolved": "https://registry.npmjs.org/@mui/system/-/system-5.14.20.tgz",
-      "integrity": "sha512-jKOGtK4VfYZG5kdaryUHss4X6hzcfh0AihT8gmnkfqRtWP7xjY+vPaUhhuSeibE5sqA5wCtdY75z6ep9pxFnIg==",
+      "version": "5.15.0",
+      "resolved": "https://registry.npmjs.org/@mui/system/-/system-5.15.0.tgz",
+      "integrity": "sha512-8TPjfTlYBNB7/zBJRL4QOD9kImwdZObbiYNh0+hxvhXr2koezGx8USwPXj8y/JynbzGCkIybkUztCdWlMZe6OQ==",
      "dependencies": {
-        "@babel/runtime": "^7.23.4",
-        "@mui/private-theming": "^5.14.20",
-        "@mui/styled-engine": "^5.14.19",
-        "@mui/types": "^7.2.10",
-        "@mui/utils": "^5.14.20",
+        "@babel/runtime": "^7.23.5",
+        "@mui/private-theming": "^5.15.0",
+        "@mui/styled-engine": "^5.15.0",
+        "@mui/types": "^7.2.11",
+        "@mui/utils": "^5.15.0",
        "clsx": "^2.0.0",
        "csstype": "^3.1.2",
        "prop-types": "^15.8.1"

@@ -842,9 +842,9 @@
      }
    },
    "node_modules/@mui/types": {
-      "version": "7.2.10",
-      "resolved": "https://registry.npmjs.org/@mui/types/-/types-7.2.10.tgz",
-      "integrity": "sha512-wX1vbDC+lzF7FlhT6A3ffRZgEoKWPF8VqRoTu4lZwouFX2t90KyCMsgepMw5DxLak1BSp/KP86CmtZttikb/gQ==",
+      "version": "7.2.11",
+      "resolved": "https://registry.npmjs.org/@mui/types/-/types-7.2.11.tgz",
+      "integrity": "sha512-KWe/QTEsFFlFSH+qRYf3zoFEj3z67s+qAuSnMMg+gFwbxG7P96Hm6g300inQL1Wy///gSRb8juX7Wafvp93m3w==",
      "peerDependencies": {
        "@types/react": "^17.0.0 || ^18.0.0"
      },

@@ -855,11 +855,11 @@
      }
    },
    "node_modules/@mui/utils": {
-      "version": "5.14.20",
-      "resolved": "https://registry.npmjs.org/@mui/utils/-/utils-5.14.20.tgz",
-      "integrity": "sha512-Y6yL5MoFmtQml20DZnaaK1znrCEwG6/vRSzW8PKOTrzhyqKIql0FazZRUR7sA5EPASgiyKZfq0FPwISRXm5NdA==",
+      "version": "5.15.0",
+      "resolved": "https://registry.npmjs.org/@mui/utils/-/utils-5.15.0.tgz",
+      "integrity": "sha512-XSmTKStpKYamewxyJ256+srwEnsT3/6eNo6G7+WC1tj2Iq9GfUJ/6yUoB7YXjOD2jTZ3XobToZm4pVz1LBt6GA==",
      "dependencies": {
-        "@babel/runtime": "^7.23.4",
+        "@babel/runtime": "^7.23.5",
        "@types/prop-types": "^15.7.11",
        "prop-types": "^15.8.1",
        "react-is": "^18.2.0"

@@ -1377,9 +1377,9 @@
      "integrity": "sha512-ga8y9v9uyeiLdpKddhxYQkxNDrfvuPrlFb0N1qnZZByvcElJaXthF1UhvCh9TLWJBEHeNtdnbysW7Y6Uq8CVng=="
    },
    "node_modules/@types/react": {
-      "version": "18.2.43",
-      "resolved": "https://registry.npmjs.org/@types/react/-/react-18.2.43.tgz",
-      "integrity": "sha512-nvOV01ZdBdd/KW6FahSbcNplt2jCJfyWdTos61RYHV+FVv5L/g9AOX1bmbVcWcLFL8+KHQfh1zVIQrud6ihyQA==",
+      "version": "18.2.45",
+      "resolved": "https://registry.npmjs.org/@types/react/-/react-18.2.45.tgz",
+      "integrity": "sha512-TtAxCNrlrBp8GoeEp1npd5g+d/OejJHFxS3OWmrPBMFaVQMSN0OFySozJio5BHxTuTeug00AVXVAjfDSfk+lUg==",
      "dependencies": {
        "@types/prop-types": "*",
        "@types/scheduler": "*",

@@ -7263,9 +7263,9 @@
      }
    },
    "node_modules/zustand": {
-      "version": "4.3.9",
-      "resolved": "https://registry.npmjs.org/zustand/-/zustand-4.3.9.tgz",
-      "integrity": "sha512-Tat5r8jOMG1Vcsj8uldMyqYKC5IZvQif8zetmLHs9WoZlntTHmIoNM8TpLRY31ExncuUvUOXehd0kvahkuHjDw==",
+      "version": "4.4.7",
+      "resolved": "https://registry.npmjs.org/zustand/-/zustand-4.4.7.tgz",
+      "integrity": "sha512-QFJWJMdlETcI69paJwhSMJz7PPWjVP8Sjhclxmxmxv/RYI7ZOvR5BHX+ktH0we9gTWQMxcne8q1OY8xxz604gw==",
      "dependencies": {
        "use-sync-external-store": "1.2.0"
      },

@@ -7273,10 +7273,14 @@
        "node": ">=12.7.0"
      },
      "peerDependencies": {
        "@types/react": ">=16.8",
+        "immer": ">=9.0",
        "react": ">=16.8"
      },
      "peerDependenciesMeta": {
        "@types/react": {
          "optional": true
        },
+        "immer": {
+          "optional": true
+        },
(+6 -6)

@@ -1,6 +1,6 @@
{
  "name": "big-agi",
-  "version": "1.7.3",
+  "version": "1.8.0",
  "private": true,
  "scripts": {
    "dev": "next dev",

@@ -18,13 +18,13 @@
    "@emotion/react": "^11.11.1",
    "@emotion/server": "^11.11.0",
    "@emotion/styled": "^11.11.0",
-    "@mui/icons-material": "^5.14.19",
-    "@mui/joy": "^5.0.0-beta.17",
+    "@mui/icons-material": "^5.15.0",
+    "@mui/joy": "^5.0.0-beta.18",
    "@next/bundle-analyzer": "^14.0.4",
    "@prisma/client": "^5.7.0",
    "@sanity/diff-match-patch": "^3.1.1",
    "@t3-oss/env-nextjs": "^0.7.1",
-    "@tanstack/react-query": "^4.36.1",
+    "@tanstack/react-query": "~4.36.1",
    "@trpc/client": "^10.44.1",
    "@trpc/next": "^10.44.1",
    "@trpc/react-query": "^10.44.1",

@@ -47,14 +47,14 @@
    "tesseract.js": "^5.0.3",
    "uuid": "^9.0.1",
    "zod": "^3.22.4",
-    "zustand": "~4.3.9"
+    "zustand": "^4.4.7"
  },
  "devDependencies": {
    "@cloudflare/puppeteer": "^0.0.5",
    "@types/node": "^20.10.4",
    "@types/plantuml-encoder": "^1.4.2",
    "@types/prismjs": "^1.26.3",
-    "@types/react": "^18.2.43",
+    "@types/react": "^18.2.45",
    "@types/react-dom": "^18.2.17",
    "@types/react-katex": "^3.0.4",
    "@types/react-timeago": "^4.1.6",
(+10 -7)

@@ -11,6 +11,7 @@ import '~/common/styles/CodePrism.css';
import '~/common/styles/GithubMarkdown.css';

import { ProviderBackend } from '~/common/state/ProviderBackend';
+import { ProviderSingleTab } from '~/common/state/ProviderSingleTab';
import { ProviderSnacks } from '~/common/state/ProviderSnacks';
import { ProviderTRPCQueryClient } from '~/common/state/ProviderTRPCQueryClient';
import { ProviderTheming } from '~/common/state/ProviderTheming';

@@ -25,13 +26,15 @@ const MyApp = ({ Component, emotionCache, pageProps }: MyAppProps) =>
      </Head>

      <ProviderTheming emotionCache={emotionCache}>
-        <ProviderTRPCQueryClient>
-          <ProviderSnacks>
-            <ProviderBackend>
-              <Component {...pageProps} />
-            </ProviderBackend>
-          </ProviderSnacks>
-        </ProviderTRPCQueryClient>
+        <ProviderSingleTab>
+          <ProviderTRPCQueryClient>
+            <ProviderSnacks>
+              <ProviderBackend>
+                <Component {...pageProps} />
+              </ProviderBackend>
+            </ProviderSnacks>
+          </ProviderTRPCQueryClient>
+        </ProviderSingleTab>
      </ProviderTheming>

      <VercelAnalytics debug={false} />
@@ -15,8 +15,7 @@ import { useChatLLMDropdown } from '../chat/components/applayout/useLLMDropdown'

import { EXPERIMENTAL_speakTextStream } from '~/modules/elevenlabs/elevenlabs.client';
import { SystemPurposeId, SystemPurposes } from '../../data';
-import { VChatMessageIn } from '~/modules/llms/transports/chatGenerate';
-import { streamChat } from '~/modules/llms/transports/streamChat';
+import { llmStreamingChatGenerate, VChatMessageIn } from '~/modules/llms/llm.client';
import { useElevenLabsVoiceDropdown } from '~/modules/elevenlabs/useElevenLabsVoiceDropdown';

import { Link } from '~/common/components/Link';

@@ -216,7 +215,7 @@ export function CallUI(props: {
    responseAbortController.current = new AbortController();
    let finalText = '';
    let error: any | null = null;
-    streamChat(chatLLMId, callPrompt, responseAbortController.current.signal, (updatedMessage: Partial<DMessage>) => {
+    llmStreamingChatGenerate(chatLLMId, callPrompt, null, null, responseAbortController.current.signal, (updatedMessage: Partial<DMessage>) => {
      const text = updatedMessage.text?.trim();
      if (text) {
        finalText = text;
@@ -3,7 +3,7 @@ import * as React from 'react';
import { Chip, ColorPaletteProp, VariantProp } from '@mui/joy';
import { SxProps } from '@mui/joy/styles/types';

-import { VChatMessageIn } from '~/modules/llms/transports/chatGenerate';
+import type { VChatMessageIn } from '~/modules/llms/llm.client';


export function CallMessage(props: {

@@ -167,6 +167,8 @@ function explainErrorInMessage(text: string, isAssistant: boolean, modelId?: str
      make sure the usage is under <Link noLinkStyle href='https://platform.openai.com/account/billing/limits' target='_blank'>the limits</Link>.
    </>;
  }
+  // else
+  //   errorMessage = <>{text || 'Unknown error'}</>;

  return { errorMessage, isAssistantError };
}
@@ -2,8 +2,8 @@ import { DLLMId } from '~/modules/llms/store-llms';
import { SystemPurposeId } from '../../../data';
import { autoSuggestions } from '~/modules/aifn/autosuggestions/autoSuggestions';
import { autoTitle } from '~/modules/aifn/autotitle/autoTitle';
+import { llmStreamingChatGenerate } from '~/modules/llms/llm.client';
import { speakText } from '~/modules/elevenlabs/elevenlabs.client';
-import { streamChat } from '~/modules/llms/transports/streamChat';

import { DMessage, useChatStore } from '~/common/state/store-chats';

@@ -63,7 +63,7 @@ async function streamAssistantMessage(
  const messages = history.map(({ role, text }) => ({ role, content: text }));

  try {
-    await streamChat(llmId, messages, abortSignal,
+    await llmStreamingChatGenerate(llmId, messages, null, null, abortSignal,
      (updatedMessage: Partial<DMessage>) => {
        // update the message in the store (and thus schedule a re-render)
        editMessage(updatedMessage);
@@ -78,14 +78,14 @@ export function AppNews() {

      {!!news && <Container disableGutters maxWidth='sm'>
        {news?.map((ni, idx) => {
-          const firstCard = idx === 0;
+          // const firstCard = idx === 0;
          const hasCardAfter = news.length < NewsItems.length;
          const showExpander = hasCardAfter && (idx === news.length - 1);
          const addPadding = false; //!firstCard; // || showExpander;
          return <Card key={'news-' + idx} sx={{ mb: 2, minHeight: 32 }}>
            <CardContent sx={{ position: 'relative', pr: addPadding ? 4 : 0 }}>
-              <Box sx={{ display: 'flex', alignItems: 'center', justifyContent: 'space-between', gap: 1 }}>
-                <GoodTooltip title={ni.versionName || null} placement='top-start'>
+              <Box sx={{ display: 'flex', alignItems: 'center', justifyContent: 'space-between', gap: 0 }}>
+                <GoodTooltip title={ni.versionName ? `${ni.versionName} ${ni.versionMoji || ''}` : null} placement='top-start'>
                  <Typography level='title-sm' component='div' sx={{ flexGrow: 1 }}>
                    {ni.text ? ni.text : ni.versionName ? `${ni.versionCode} · ${ni.versionName}` : `Version ${ni.versionCode}:`}
                  </Typography>
(+28 -12)

@@ -10,10 +10,10 @@ import { platformAwareKeystrokes } from '~/common/components/KeyStroke';

// update this variable every time you want to broadcast a new version to clients
-export const incrementalVersion: number = 8;
+export const incrementalVersion: number = 9;

const B = (props: { href?: string, children: React.ReactNode }) => {
-  const boldText = <Typography color={!!props.href ? 'primary' : 'warning'} sx={{ fontWeight: 600 }}>{props.children}</Typography>;
+  const boldText = <Typography color={!!props.href ? 'primary' : 'neutral'} sx={{ fontWeight: 600 }}>{props.children}</Typography>;
  return props.href ?
    <Link href={props.href + clientUtmSource()} target='_blank' sx={{ /*textDecoration: 'underline'*/ }}>{boldText} <LaunchIcon sx={{ ml: 1 }} /></Link> :
    boldText;

@@ -27,11 +27,12 @@ const RIssues = `${OpenRepo}/issues`;

export const newsCallout =
  <Card>
    <CardContent sx={{ gap: 2 }}>
-      <Typography level='h4'>
+      <Typography level='title-lg'>
        Open Roadmap
      </Typography>
-      <Typography>
-        The roadmap is officially out. For the first time you get a look at what's brewing, up and coming, and get a chance to pick up cool features!
+      <Typography level='body-md'>
+        Take a peek at our roadmap to see what's in the pipeline.
+        Discover upcoming features and let us know what excites you the most!
      </Typography>
      <Grid container spacing={1}>
        <Grid xs={12} sm={7}>

@@ -39,7 +40,7 @@ export const newsCallout =
          fullWidth variant='soft' color='primary' endDecorator={<LaunchIcon />}
          component={Link} href={OpenProject} noLinkStyle target='_blank'
        >
-          Explore the Roadmap
+          Explore
        </Button>
      </Grid>
      <Grid xs={12} sm={5} sx={{ display: 'flex', flexAlign: 'center', justifyContent: 'center' }}>

@@ -67,10 +68,27 @@ export const NewsItems: NewsItem[] = [
    ],
  },*/
  {
-    versionCode: '1.7.3',
+    versionCode: '1.8.0',
+    versionName: 'To The Moon And Back',
+    versionMoji: '🚀🌕🔙❤️',
+    versionDate: new Date('2023-12-20T09:30:00Z'),
+    items: [
+      { text: <><B href={RIssues + '/275'}>Google Gemini</B> models support</> },
+      { text: <><B href={RIssues + '/273'}>Mistral Platform</B> support</> },
+      { text: <><B href={RIssues + '/270'}>Ollama chats</B> perfection</> },
+      { text: <>Custom <B href={RIssues + '/280'}>diagrams instructions</B> (@joriskalz)</> },
+      { text: <><B>Single-Tab</B> mode, enhances data integrity and prevents DB corruption</> },
+      { text: <>Updated Ollama (v0.1.17) and OpenRouter models</> },
+      { text: <>More: fixed ⌘ shortcuts on Mac</> },
+      { text: <><Link href='https://big-agi.com'>Website</Link>: official downloads</> },
+      { text: <>Easier Vercel deployment, documented <Link href='https://github.com/enricoros/big-AGI/issues/276#issuecomment-1858591483'>network troubleshooting</Link></>, dev: true },
+    ],
+  },
+  {
+    versionCode: '1.7.0',
    versionName: 'Attachment Theory',
-    versionDate: new Date('2023-12-11T06:00:00Z'), // new Date().toISOString()
    // versionDate: new Date('2023-12-10T12:00:00Z'), // 1.7.0
    // versionDate: new Date('2023-12-11T06:00:00Z'), // 1.7.3
+    versionDate: new Date('2023-12-10T12:00:00Z'), // 1.7.0
    items: [
      { text: <>Redesigned <B href={RIssues + '/251'}>attachments system</B>: drag, paste, link, snap, images, text, pdfs</> },
      { text: <>Desktop <B href={RIssues + '/253'}>webcam access</B> for direct image capture (Labs option)</> },

@@ -80,9 +98,6 @@ export const NewsItems: NewsItem[] = [
      { text: <>{platformAwareKeystrokes('Ctrl+Shift+O')}: quick access to model options</> },
      { text: <>Optimized voice input and performance</> },
      { text: <>Latest Ollama and Oobabooga models</> },
-      { text: <>1.7.1: Improved <B href={RIssues + '/270'}>Ollama chats</B></> },
-      { text: <>1.7.2: Updated OpenRouter models 🎁</> },
-      { text: <>1.7.3: <B href={RIssues + '/273'}>Mistral Platform</B> support</> },
    ],
  },
  {

@@ -162,6 +177,7 @@ export const NewsItems: NewsItem[] = [

interface NewsItem {
  versionCode: string;
  versionName?: string;
+  versionMoji?: string;
  versionDate?: Date;
  text?: string | React.JSX.Element;
  items?: {
@@ -1,14 +1,13 @@
import * as React from 'react';
import { shallow } from 'zustand/shallow';
-import { useRouter } from 'next/router';

+import { navigateToNews } from '~/common/app.routes';
import { useAppStateStore } from '~/common/state/store-appstate';

import { incrementalVersion } from './news.data';


export function useShowNewsOnUpdate() {
-  const { push: routerPush } = useRouter();
  const { usageCount, lastSeenNewsVersion } = useAppStateStore(state => ({
    usageCount: state.usageCount,
    lastSeenNewsVersion: state.lastSeenNewsVersion,

@@ -17,9 +16,9 @@ export function useShowNewsOnUpdate() {
    const isNewsOutdated = (lastSeenNewsVersion || 0) < incrementalVersion;
    if (isNewsOutdated && usageCount > 2) {
      // Disable for now
-      void routerPush('/news');
+      void navigateToNews();
    }
-  }, [lastSeenNewsVersion, routerPush, usageCount]);
+  }, [lastSeenNewsVersion, usageCount]);
}

export function useMarkNewsAsSeen() {
@@ -1,7 +1,7 @@
import * as React from 'react';

import { DLLMId, useModelsStore } from '~/modules/llms/store-llms';
-import { callChatGenerate, VChatMessageIn } from '~/modules/llms/transports/chatGenerate';
+import { llmChatGenerateOrThrow, VChatMessageIn } from '~/modules/llms/llm.client';


export interface LLMChainStep {

@@ -80,7 +80,7 @@ export function useLLMChain(steps: LLMChainStep[], llmId: DLLMId | undefined, ch
    _chainAbortController.signal.addEventListener('abort', globalToStepListener);

    // LLM call
-    callChatGenerate(llmId, llmChatInput, chain.overrideResponseTokens)
+    llmChatGenerateOrThrow(llmId, llmChatInput, null, null, chain.overrideResponseTokens)
      .then(({ content }) => {
        stepDone = true;
        if (!stepAbortController.signal.aborted)
@@ -7,6 +7,7 @@

import Router from 'next/router';

import type { DConversationId } from '~/common/state/store-chats';
+import { isBrowser } from './util/pwaUtils';


export const ROUTE_INDEX = '/';

@@ -15,7 +16,8 @@ export const ROUTE_APP_LINK_CHAT = '/link/chat/:linkId';
export const ROUTE_APP_NEWS = '/news';
const ROUTE_CALLBACK_OPENROUTER = '/link/callback_openrouter';

export const getIndexLink = () => ROUTE_INDEX;

+// Get Paths

export const getCallbackUrl = (source: 'openrouter') => {
  const callbackUrl = new URL(window.location.href);

@@ -31,10 +33,11 @@ export const getCallbackUrl = (source: 'openrouter') => {

export const getChatLinkRelativePath = (chatLinkId: string) => ROUTE_APP_LINK_CHAT.replace(':linkId', chatLinkId);

-const navigateFn = (path: string) => (replace?: boolean): Promise<boolean> =>
-  Router[replace ? 'replace' : 'push'](path);

/// Simple Navigation

export const navigateToIndex = navigateFn(ROUTE_INDEX);

export const navigateToChat = async (conversationId?: DConversationId) => {
  if (conversationId) {
    await Router.push(

@@ -54,6 +57,15 @@ export const navigateToNews = navigateFn(ROUTE_APP_NEWS);

export const navigateBack = Router.back;

+export const reloadPage = () => isBrowser && window.location.reload();
+
+function navigateFn(path: string) {
+  return (replace?: boolean): Promise<boolean> => Router[replace ? 'replace' : 'push'](path);
+}


/// Launch Apps

export interface AppCallQueryParams {
  conversationId: string;
  personaId: string;
@@ -21,8 +21,13 @@ export const useGlobalShortcut = (shortcutKey: string | false, useCtrl: boolean,
    if (!shortcutKey) return;
    const lcShortcut = shortcutKey.toLowerCase();
    const handleKeyDown = (event: KeyboardEvent) => {
-      if ((useCtrl === event.ctrlKey) && (useShift === event.shiftKey) && (useAlt === event.altKey)
-        && event.key.toLowerCase() === lcShortcut) {
+      const isCtrlOrCmd = (event.ctrlKey && !event.metaKey) || (event.metaKey && !event.ctrlKey);
+      if (
+        (useCtrl === isCtrlOrCmd) &&
+        (useShift === event.shiftKey) &&
+        (useAlt === event.altKey) &&
+        event.key.toLowerCase() === lcShortcut
+      ) {
        event.preventDefault();
        event.stopPropagation();
        callback();

@@ -46,9 +51,10 @@ export const useGlobalShortcuts = (shortcuts: GlobalShortcutItem[]) => {
  React.useEffect(() => {
    const handleKeyDown = (event: KeyboardEvent) => {
      for (const [key, useCtrl, useShift, useAlt, action] of shortcuts) {
+        const isCtrlOrCmd = (event.ctrlKey && !event.metaKey) || (event.metaKey && !event.ctrlKey);
        if (
          key &&
-          (useCtrl === event.ctrlKey) &&
+          (useCtrl === isCtrlOrCmd) &&
          (useShift === event.shiftKey) &&
          (useAlt === event.altKey) &&
          event.key.toLowerCase() === key.toLowerCase()
@@ -0,0 +1,95 @@
import * as React from 'react';

/**
 * The AloneDetector class checks if the current client is the only one present for a given app. It uses
 * BroadcastChannel to talk to other clients. If no other clients reply within a short time, it assumes it's
 * the only one and tells the caller.
 */
class AloneDetector {

  private readonly clientId: string;
  private readonly broadcastChannel: BroadcastChannel;

  private aloneCallback: ((isAlone: boolean) => void) | null;
  private aloneTimerId: number | undefined;

  constructor(channelName: string, onAlone: (isAlone: boolean) => void) {
    this.clientId = Math.random().toString(36).substring(2, 10);
    this.aloneCallback = onAlone;
    this.broadcastChannel = new BroadcastChannel(channelName);
    this.broadcastChannel.onmessage = this.handleIncomingMessage;
  }

  public onUnmount(): void {
    // close channel
    this.broadcastChannel.onmessage = null;
    this.broadcastChannel.close();

    // clear timeout
    if (this.aloneTimerId)
      clearTimeout(this.aloneTimerId);

    this.aloneTimerId = undefined;
    this.aloneCallback = null;
  }

  public checkIfAlone(): void {
    // triggers other clients
    this.broadcastChannel.postMessage({ type: 'CHECK', sender: this.clientId });

    // if no response within 500ms, assume this client is alone
    this.aloneTimerId = window.setTimeout(() => {
      this.aloneTimerId = undefined;
      this.aloneCallback?.(true);
    }, 500);
  }

  private handleIncomingMessage = (event: MessageEvent): void => {
    // ignore self messages
    if (event.data.sender === this.clientId) return;

    switch (event.data.type) {

      case 'CHECK':
        this.broadcastChannel.postMessage({ type: 'ALIVE', sender: this.clientId });
        break;

      case 'ALIVE':
        // received an ALIVE message, tell the client they're not alone
        if (this.aloneTimerId) {
          clearTimeout(this.aloneTimerId);
          this.aloneTimerId = undefined;
        }
        this.aloneCallback?.(false);
        this.aloneCallback = null;
        break;

    }
  };
}


/**
 * React hook that checks whether the current tab is the only one open for a specific channel.
 *
 * @param {string} channelName - The name of the BroadcastChannel to communicate on.
 * @returns {boolean | null} - True if the current tab is alone, false if not, or null before the check completes.
 */
export function useSingleTabEnforcer(channelName: string): boolean | null {
  const [isAlone, setIsAlone] = React.useState<boolean | null>(null);

  React.useEffect(() => {
    const tabManager = new AloneDetector(channelName, setIsAlone);
    tabManager.checkIfAlone();
    return () => {
      tabManager.onUnmount();
    };
  }, [channelName]);

  return isAlone;
}
@@ -3,7 +3,7 @@ import { shallow } from 'zustand/shallow';

import { Box, Container } from '@mui/joy';

-import { ModelsModal } from '../../apps/models-modal/ModelsModal';
+import { ModelsModal } from '~/modules/llms/models-modal/ModelsModal';
import { SettingsModal } from '../../apps/settings-modal/SettingsModal';
import { ShortcutsModal } from '../../apps/settings-modal/ShortcutsModal';
@@ -0,0 +1,42 @@
import * as React from 'react';

import { Button, Sheet, Typography } from '@mui/joy';

import { Brand } from '../app.config';
import { reloadPage } from '../app.routes';
import { useSingleTabEnforcer } from '../components/useSingleTabEnforcer';


export const ProviderSingleTab = (props: { children: React.ReactNode }) => {

  // state
  const isSingleTab = useSingleTabEnforcer('big-agi-tabs');

  // pass-through until we know for sure that other tabs are open
  if (isSingleTab === null || isSingleTab)
    return props.children;

  return (
    <Sheet
      variant='solid'
      invertedColors
      sx={{
        flexGrow: 1,
        display: 'flex', flexDirection: { xs: 'column', md: 'row' }, justifyContent: 'center', alignItems: 'center', gap: 2,
        p: 3,
      }}
    >

      <Typography>
        It looks like {Brand.Title.Base} is already running in another tab or window.
        To continue here, please close the other instance first.
      </Typography>

      <Button onClick={reloadPage}>
        Reload
      </Button>

    </Sheet>
  );
};
@@ -1,4 +1,4 @@
-import { callChatGenerateWithFunctions, VChatFunctionIn } from '~/modules/llms/transports/chatGenerate';
+import { llmChatGenerateOrThrow, VChatFunctionIn } from '~/modules/llms/llm.client';
 import { useModelsStore } from '~/modules/llms/store-llms';

 import { useChatStore } from '~/common/state/store-chats';
@@ -71,7 +71,7 @@ export function autoSuggestions(conversationId: string, assistantMessageId: stri

   // Follow-up: Question
   if (suggestQuestions) {
-    // callChatGenerateWithFunctions(funcLLMId, [
+    // llmChatGenerateOrThrow(funcLLMId, [
     //   { role: 'system', content: systemMessage.text },
     //   { role: 'user', content: userMessage.text },
     //   { role: 'assistant', content: assistantMessageText },
@@ -83,15 +83,18 @@ export function autoSuggestions(conversationId: string, assistantMessageId: stri

   // Follow-up: Auto-Diagrams
   if (suggestDiagrams) {
-    void callChatGenerateWithFunctions(funcLLMId, [
+    void llmChatGenerateOrThrow(funcLLMId, [
       { role: 'system', content: systemMessage.text },
       { role: 'user', content: userMessage.text },
       { role: 'assistant', content: assistantMessageText },
     ], [suggestPlantUMLFn], 'draw_plantuml_diagram',
     ).then(chatResponse => {

+      if (!('function_arguments' in chatResponse))
+        return;
+
       // parse the output PlantUML string, if any
-      const functionArguments = chatResponse?.function_arguments ?? null;
+      const functionArguments = chatResponse.function_arguments ?? null;
       if (functionArguments) {
         const { code, type }: { code: string, type: string } = functionArguments as any;
         if (code && type) {
@@ -105,6 +108,8 @@ export function autoSuggestions(conversationId: string, assistantMessageId: stri
           editMessage(conversationId, assistantMessageId, { text: assistantMessageText }, false);
         }
       }
+    }).catch(err => {
+      console.error('autoSuggestions::diagram:', err);
     });
   }
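A note on the new guard above: `llmChatGenerateOrThrow` (introduced later in this changeset) returns a union of a plain-message result and a function-call result, and the `in` operator is what narrows it. A minimal sketch, using the two interfaces from llm.client:

```typescript
import type { VChatMessageOut, VChatMessageOrFunctionCallOut } from '~/modules/llms/llm.client';

// Only the function-call variant carries `function_name` / `function_arguments`,
// so the `in` check narrows the union before those fields are accessed.
function handle(chatResponse: VChatMessageOut | VChatMessageOrFunctionCallOut): void {
  if (!('function_arguments' in chatResponse))
    return; // plain assistant message, no function call to parse
  // narrowed to VChatMessageOrFunctionCallOut here
  console.log(chatResponse.function_name, chatResponse.function_arguments);
}
```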
@@ -1,4 +1,4 @@
-import { callChatGenerate } from '~/modules/llms/transports/chatGenerate';
+import { llmChatGenerateOrThrow } from '~/modules/llms/llm.client';
 import { useModelsStore } from '~/modules/llms/store-llms';

 import { useChatStore } from '~/common/state/store-chats';
@@ -27,7 +27,7 @@ export function autoTitle(conversationId: string) {
   });

   // LLM
-  void callChatGenerate(fastLLMId, [
+  void llmChatGenerateOrThrow(fastLLMId, [
     { role: 'system', content: `You are an AI conversation titles assistant who specializes in creating expressive yet few-words chat titles.` },
     {
       role: 'user', content:
@@ -39,7 +39,7 @@ export function autoTitle(conversationId: string) {
       historyLines.join('\n') +
       '```\n',
     },
-  ]).then(chatResponse => {
+  ], null, null).then(chatResponse => {

     const title = chatResponse?.content
       ?.trim()
@@ -1,6 +1,6 @@
 import * as React from 'react';

-import { Box, Button, ButtonGroup, CircularProgress, Divider, Grid, IconButton } from '@mui/joy';
+import { Box, Button, ButtonGroup, CircularProgress, Divider, Grid, IconButton, Input, FormControl, FormLabel } from '@mui/joy';
 import AccountTreeIcon from '@mui/icons-material/AccountTree';
 import ExpandLessIcon from '@mui/icons-material/ExpandLess';
 import ExpandMoreIcon from '@mui/icons-material/ExpandMore';
@@ -8,8 +8,9 @@ import ReplayIcon from '@mui/icons-material/Replay';
 import StopOutlinedIcon from '@mui/icons-material/StopOutlined';
 import TelegramIcon from '@mui/icons-material/Telegram';

+import { llmStreamingChatGenerate } from '~/modules/llms/llm.client';
+
 import { ChatMessage } from '../../../apps/chat/components/message/ChatMessage';
-import { streamChat } from '~/modules/llms/transports/streamChat';

 import { GoodModal } from '~/common/components/GoodModal';
 import { InlineError } from '~/common/components/InlineError';
@@ -48,6 +49,7 @@ export function DiagramsModal(props: { config: DiagramConfig, onClose: () => voi
   const [message, setMessage] = React.useState<DMessage | null>(null);
   const [diagramType, diagramComponent] = useFormRadio<DiagramType>('auto', diagramTypes, 'Visualize');
   const [diagramLanguage, languageComponent] = useFormRadio<DiagramLanguage>('plantuml', diagramLanguages, 'Style');
+  const [customInstruction, setCustomInstruction] = React.useState<string>('');
   const [errorMessage, setErrorMessage] = React.useState<string | null>(null);
   const [abortController, setAbortController] = React.useState<AbortController | null>(null);
@@ -81,10 +83,10 @@ export function DiagramsModal(props: { config: DiagramConfig, onClose: () => voi
     const stepAbortController = new AbortController();
     setAbortController(stepAbortController);

-    const diagramPrompt = bigDiagramPrompt(diagramType, diagramLanguage, systemMessage.text, subject);
+    const diagramPrompt = bigDiagramPrompt(diagramType, diagramLanguage, systemMessage.text, subject, customInstruction);

     try {
-      await streamChat(diagramLlm.id, diagramPrompt, stepAbortController.signal,
+      await llmStreamingChatGenerate(diagramLlm.id, diagramPrompt, null, null, stepAbortController.signal,
         (update: Partial<{ text: string, typing: boolean, originLLM: string }>) => {
           assistantMessage = { ...assistantMessage, ...update };
           setMessage(assistantMessage);
@@ -103,7 +105,7 @@ export function DiagramsModal(props: { config: DiagramConfig, onClose: () => voi
       setAbortController(null);
     }

-  }, [abortController, conversationId, diagramLanguage, diagramLlm, diagramType, subject]);
+  }, [abortController, conversationId, diagramLanguage, diagramLlm, diagramType, subject, customInstruction]);


  // [Effect] Auto-abort on unmount
@@ -149,6 +151,12 @@ export function DiagramsModal(props: { config: DiagramConfig, onClose: () => voi
         <Grid xs={12} xl={6}>
           {llmComponent}
         </Grid>
+        <Grid xs={12} md={6}>
+          <FormControl>
+            <FormLabel>Custom Instruction</FormLabel>
+            <Input title="Custom Instruction" placeholder='e.g. visualize as state' value={customInstruction} onChange={(e) => setCustomInstruction(e.target.value)} />
+          </FormControl>
+        </Grid>
       </Grid>
     )}
@@ -1,6 +1,5 @@
-import type { VChatMessageIn } from '~/modules/llms/transports/chatGenerate';
-
 import type { FormRadioOption } from '~/common/components/forms/FormRadioControl';
+import type { VChatMessageIn } from '~/modules/llms/llm.client';


 export type DiagramType = 'auto' | 'mind';
@@ -60,12 +59,15 @@ function plantumlDiagramPrompt(diagramType: DiagramType): { sys: string, usr: st
   }
 }

-export function bigDiagramPrompt(diagramType: DiagramType, diagramLanguage: DiagramLanguage, chatSystemPrompt: string, subject: string): VChatMessageIn[] {
+export function bigDiagramPrompt(diagramType: DiagramType, diagramLanguage: DiagramLanguage, chatSystemPrompt: string, subject: string, customInstruction: string): VChatMessageIn[] {
   const { sys, usr } = diagramLanguage === 'mermaid' ? mermaidDiagramPrompt(diagramType) : plantumlDiagramPrompt(diagramType);
+  if (customInstruction) {
+    customInstruction = 'Also consider the following instructions: ' + customInstruction;
+  }
   return [
     { role: 'system', content: sys },
     { role: 'system', content: chatSystemPrompt },
     { role: 'assistant', content: subject },
-    { role: 'user', content: usr },
+    { role: 'user', content: `${usr} ${customInstruction}` },
   ];
 }
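To make the new parameter concrete, here is roughly what the updated `bigDiagramPrompt` returns for a hypothetical call; the `sys`/`usr` strings come from `plantumlDiagramPrompt` and are elided:

```typescript
// bigDiagramPrompt('auto', 'plantuml', 'You are a helpful assistant.',
//                  'How TCP works', 'visualize as state')
// yields, approximately:
// [
//   { role: 'system',    content: sys },                             // diagram-language system prompt
//   { role: 'system',    content: 'You are a helpful assistant.' },  // chat system prompt
//   { role: 'assistant', content: 'How TCP works' },                 // the subject to visualize
//   { role: 'user',      content: usr + ' Also consider the following instructions: visualize as state' },
// ]
```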
@@ -1,4 +1,4 @@
-import { callChatGenerate } from '~/modules/llms/transports/chatGenerate';
+import { llmChatGenerateOrThrow } from '~/modules/llms/llm.client';
 import { useModelsStore } from '~/modules/llms/store-llms';

@@ -14,10 +14,10 @@ export async function imaginePromptFromText(messageText: string): Promise<string
   const { fastLLMId } = useModelsStore.getState();
   if (!fastLLMId) return null;
   try {
-    const chatResponse = await callChatGenerate(fastLLMId, [
+    const chatResponse = await llmChatGenerateOrThrow(fastLLMId, [
      { role: 'system', content: simpleImagineSystemPrompt },
      { role: 'user', content: 'Write a prompt, based on the following input.\n\n```\n' + messageText.slice(0, 1000) + '\n```\n' },
-    ]);
+    ], null, null);
     return chatResponse.content?.trim() ?? null;
   } catch (error: any) {
     console.error('imaginePromptFromText: fetch request error:', error);
@@ -5,7 +5,7 @@
 import { DLLMId } from '~/modules/llms/store-llms';
 import { callApiSearchGoogle } from '~/modules/google/search.client';
 import { callBrowseFetchPage } from '~/modules/browse/browse.client';
-import { callChatGenerate, VChatMessageIn } from '~/modules/llms/transports/chatGenerate';
+import { llmChatGenerateOrThrow, VChatMessageIn } from '~/modules/llms/llm.client';


 // prompt to implement the ReAct paradigm: https://arxiv.org/abs/2210.03629
@@ -128,7 +128,7 @@ export class Agent {
     S.messages.push({ role: 'user', content: prompt });
     let content: string;
     try {
-      content = (await callChatGenerate(llmId, S.messages, 500)).content;
+      content = (await llmChatGenerateOrThrow(llmId, S.messages, null, null, 500)).content;
     } catch (error: any) {
       content = `Error in callChat: ${error}`;
     }
@@ -1,5 +1,5 @@
|
||||
import { DLLMId, findLLMOrThrow } from '~/modules/llms/store-llms';
|
||||
import { callChatGenerate } from '~/modules/llms/transports/chatGenerate';
|
||||
import { llmChatGenerateOrThrow } from '~/modules/llms/llm.client';
|
||||
|
||||
|
||||
// prompt to be tried when doing recursive summerization.
|
||||
@@ -80,10 +80,10 @@ async function cleanUpContent(chunk: string, llmId: DLLMId, _ignored_was_targetW
|
||||
const autoResponseTokensSize = Math.floor(contextTokens * outputTokenShare);
|
||||
|
||||
try {
|
||||
const chatResponse = await callChatGenerate(llmId, [
|
||||
const chatResponse = await llmChatGenerateOrThrow(llmId, [
|
||||
{ role: 'system', content: cleanupPrompt },
|
||||
{ role: 'user', content: chunk },
|
||||
], autoResponseTokensSize);
|
||||
], null, null, autoResponseTokensSize);
|
||||
return chatResponse?.content ?? '';
|
||||
} catch (error: any) {
|
||||
return '';
|
||||
|
||||
@@ -1,8 +1,7 @@
|
||||
import * as React from 'react';
|
||||
|
||||
import type { DLLMId } from '~/modules/llms/store-llms';
|
||||
import type { VChatMessageIn } from '~/modules/llms/transports/chatGenerate';
|
||||
import { streamChat } from '~/modules/llms/transports/streamChat';
|
||||
import { llmStreamingChatGenerate, VChatMessageIn } from '~/modules/llms/llm.client';
|
||||
|
||||
|
||||
export function useStreamChatText() {
|
||||
@@ -25,7 +24,7 @@ export function useStreamChatText() {
|
||||
|
||||
try {
|
||||
let lastText = '';
|
||||
await streamChat(llmId, prompt, abortControllerRef.current.signal, (update) => {
|
||||
await llmStreamingChatGenerate(llmId, prompt, null, null, abortControllerRef.current.signal, (update) => {
|
||||
if (update.text) {
|
||||
lastText = update.text;
|
||||
setPartialText(lastText);
|
||||
|
||||
@@ -28,6 +28,7 @@ export const backendRouter = createTRPCRouter({
     hasImagingProdia: !!env.PRODIA_API_KEY,
     hasLlmAnthropic: !!env.ANTHROPIC_API_KEY,
     hasLlmAzureOpenAI: !!env.AZURE_OPENAI_API_KEY && !!env.AZURE_OPENAI_API_ENDPOINT,
+    hasLlmGemini: !!env.GEMINI_API_KEY,
     hasLlmMistral: !!env.MISTRAL_API_KEY,
     hasLlmOllama: !!env.OLLAMA_API_HOST,
     hasLlmOpenAI: !!env.OPENAI_API_KEY || !!env.OPENAI_API_HOST,
@@ -42,7 +43,7 @@ export const backendRouter = createTRPCRouter({
   /* Exchange the OpenRouter 'code' (from PKCE) for an OpenRouter API Key */
   exchangeOpenRouterKey: publicProcedure
     .input(z.object({ code: z.string() }))
-    .query(async ({ ctx, input }) => {
+    .query(async ({ input }) => {
       // Documented here: https://openrouter.ai/docs#oauth
       return await fetchJsonOrTRPCError<{ key: string }, { code: string }>('https://openrouter.ai/api/v1/auth/keys', 'POST', {}, {
         code: input.code,
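For clarity, the exchange above is a single unauthenticated POST. A plain-fetch equivalent sketch, outside tRPC; the `code` value is a placeholder obtained from OpenRouter's OAuth PKCE redirect:

```typescript
// Sketch of the same code-for-key exchange against the documented endpoint.
async function exchangeOpenRouterCode(code: string): Promise<string> {
  const res = await fetch('https://openrouter.ai/api/v1/auth/keys', {
    method: 'POST',
    headers: { 'Content-Type': 'application/json' },
    body: JSON.stringify({ code }), // the PKCE code from the OAuth redirect
  });
  const { key } = await res.json() as { key: string };
  return key; // the OpenRouter API key to hand back to the client
}
```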
@@ -9,6 +9,7 @@ export interface BackendCapabilities {
   hasImagingProdia: boolean;
   hasLlmAnthropic: boolean;
   hasLlmAzureOpenAI: boolean;
+  hasLlmGemini: boolean;
   hasLlmMistral: boolean;
   hasLlmOllama: boolean;
   hasLlmOpenAI: boolean;
@@ -31,6 +32,7 @@ const useBackendStore = create<BackendStore>()(
     hasImagingProdia: false,
     hasLlmAnthropic: false,
     hasLlmAzureOpenAI: false,
+    hasLlmGemini: false,
     hasLlmMistral: false,
     hasLlmOllama: false,
     hasLlmOpenAI: false,
@@ -1,4 +1,4 @@
-import create from 'zustand';
+import { create } from 'zustand';
 import { persist } from 'zustand/middleware';

 import { CapabilityBrowsing } from '~/common/components/useCapabilities';
@@ -0,0 +1,74 @@
import type { DLLMId } from './store-llms';
import type { OpenAIWire } from './server/openai/openai.wiretypes';
import { findVendorForLlmOrThrow } from './vendors/vendors.registry';


// LLM Client Types
// NOTE: Model List types in '../server/llm.server.types';

export interface VChatMessageIn {
  role: 'assistant' | 'system' | 'user'; // | 'function';
  content: string;
  //name?: string; // when role: 'function'
}

export type VChatFunctionIn = OpenAIWire.ChatCompletion.RequestFunctionDef;

export interface VChatMessageOut {
  role: 'assistant' | 'system' | 'user';
  content: string;
  finish_reason: 'stop' | 'length' | null;
}

export interface VChatMessageOrFunctionCallOut extends VChatMessageOut {
  function_name: string;
  function_arguments: object | null;
}


// LLM Client Functions

export async function llmChatGenerateOrThrow<TSourceSetup = unknown, TAccess = unknown, TLLMOptions = unknown>(
  llmId: DLLMId,
  messages: VChatMessageIn[],
  functions: VChatFunctionIn[] | null, forceFunctionName: string | null,
  maxTokens?: number,
): Promise<VChatMessageOut | VChatMessageOrFunctionCallOut> {

  // id to DLLM and vendor
  const { llm, vendor } = findVendorForLlmOrThrow<TSourceSetup, TAccess, TLLMOptions>(llmId);

  // FIXME: relax the forced cast
  const options = llm.options as TLLMOptions;

  // get the access
  const partialSourceSetup = llm._source.setup;
  const access = vendor.getTransportAccess(partialSourceSetup);

  // execute via the vendor
  return await vendor.rpcChatGenerateOrThrow(access, options, messages, functions, forceFunctionName, maxTokens);
}


export async function llmStreamingChatGenerate<TSourceSetup = unknown, TAccess = unknown, TLLMOptions = unknown>(
  llmId: DLLMId,
  messages: VChatMessageIn[],
  functions: VChatFunctionIn[] | null,
  forceFunctionName: string | null,
  abortSignal: AbortSignal,
  onUpdate: (update: Partial<{ text: string, typing: boolean, originLLM: string }>, done: boolean) => void,
): Promise<void> {

  // id to DLLM and vendor
  const { llm, vendor } = findVendorForLlmOrThrow<TSourceSetup, TAccess, TLLMOptions>(llmId);

  // FIXME: relax the forced cast
  const llmOptions = llm.options as TLLMOptions;

  // get the access
  const partialSourceSetup = llm._source.setup;
  const access = vendor.getTransportAccess(partialSourceSetup); // as ChatStreamInputSchema['access'];

  // execute via the vendor
  return await vendor.streamingChatGenerateOrThrow(access, llmId, llmOptions, messages, functions, forceFunctionName, abortSignal, onUpdate);
}
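The call sites updated throughout this changeset (autoTitle, imagine, summarize, the ReAct agent) all follow the same shape; a condensed sketch of the non-streaming path, with a hypothetical caller providing the `llmId`:

```typescript
import type { DLLMId } from '~/modules/llms/store-llms';
import { llmChatGenerateOrThrow } from '~/modules/llms/llm.client';

// Non-streaming call without function-calling: pass null for both `functions`
// and `forceFunctionName`, with an optional max-tokens cap as the last argument.
async function demo(llmId: DLLMId): Promise<void> {
  const chatResponse = await llmChatGenerateOrThrow(llmId, [
    { role: 'system', content: 'You are a helpful assistant.' },
    { role: 'user', content: 'Say hello in five words.' },
  ], null, null, 128);
  console.log(chatResponse.content);
}
```

With function definitions (and optionally a forced function name) the result may instead be a `VChatMessageOrFunctionCallOut`, hence the `'function_arguments' in chatResponse` checks seen earlier.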
+3 -3
@@ -117,9 +117,9 @@ export function LLMOptionsModal(props: { id: DLLMId }) {
       <FormLabelStart title='Details' sx={{ minWidth: 80 }} onClick={() => setShowDetails(!showDetails)} />
       {showDetails && <Typography level='body-sm' sx={{ display: 'block' }}>
         [{llm.id}]: {llm.options.llmRef && `${llm.options.llmRef} · `}
-        {llm.contextTokens && `context tokens: ${llm.contextTokens.toLocaleString()} · `}
-        {llm.maxOutputTokens && `max output tokens: ${llm.maxOutputTokens.toLocaleString()} · `}
-        {llm.created && `created: ${(new Date(llm.created * 1000)).toLocaleString()} · `}
+        {!!llm.contextTokens && `context tokens: ${llm.contextTokens.toLocaleString()} · `}
+        {!!llm.maxOutputTokens && `max output tokens: ${llm.maxOutputTokens.toLocaleString()} · `}
+        {!!llm.created && `created: ${(new Date(llm.created * 1000)).toLocaleString()} · `}
         description: {llm.description}
         {/*· tags: {llm.tags.join(', ')}*/}
       </Typography>}
@@ -111,7 +111,13 @@ export function ModelsList(props: {
       pl: { xs: 0, md: 1 },
       overflowY: 'auto',
     }}>
-      {items}
+      {items.length > 0 ? items : (
+        <ListItem>
+          <Typography level='body-sm'>
+            Please configure the service and update the list of models.
+          </Typography>
+        </ListItem>
+      )}
     </List>
   );
 }
+1 -1
@@ -65,7 +65,7 @@ export function ModelsModal(props: { suspendAutoModelsSetup?: boolean }) {
       title={<>Configure <b>AI Models</b></>}
       startButton={
         multiSource ? <Checkbox
-          label='all vendors' sx={{ my: 'auto' }}
+          label='All Services' sx={{ my: 'auto' }}
           checked={showAllSources} onChange={() => setShowAllSources(all => !all)}
         /> : undefined
       }
+3 -3
@@ -5,9 +5,9 @@ import { Avatar, Badge, Box, Button, IconButton, ListItemDecorator, MenuItem, Op
 import AddIcon from '@mui/icons-material/Add';
 import DeleteOutlineIcon from '@mui/icons-material/DeleteOutline';

-import { type DModelSourceId, useModelsStore } from '~/modules/llms/store-llms';
-import { type IModelVendor, type ModelVendorId } from '~/modules/llms/vendors/IModelVendor';
-import { createModelSourceForVendor, findAllVendors, findVendorById } from '~/modules/llms/vendors/vendors.registry';
+import type { IModelVendor } from '~/modules/llms/vendors/IModelVendor';
+import { DModelSourceId, useModelsStore } from '~/modules/llms/store-llms';
+import { createModelSourceForVendor, findAllVendors, findVendorById, ModelVendorId } from '~/modules/llms/vendors/vendors.registry';

 import { CloseableMenu } from '~/common/components/CloseableMenu';
 import { ConfirmationModal } from '~/common/components/ConfirmationModal';
+2 -2
@@ -1,6 +1,6 @@
-import type { ModelDescriptionSchema } from '../server.schemas';
+import type { ModelDescriptionSchema } from '../llm.server.types';

-import { LLM_IF_OAI_Chat } from '../../../store-llms';
+import { LLM_IF_OAI_Chat } from '../../store-llms';

 const roundTime = (date: string) => Math.round(new Date(date).getTime() / 1000);
+1 -1
@@ -6,7 +6,7 @@ import { env } from '~/server/env.mjs';
 import { fetchJsonOrTRPCError } from '~/server/api/trpc.serverutils';

 import { fixupHost, openAIChatGenerateOutputSchema, OpenAIHistorySchema, openAIHistorySchema, OpenAIModelSchema, openAIModelSchema } from '../openai/openai.router';
-import { listModelsOutputSchema } from '../server.schemas';
+import { listModelsOutputSchema } from '../llm.server.types';

 import { AnthropicWire } from './anthropic.wiretypes';
 import { hardcodedAnthropicModels } from './anthropic.models';
@@ -0,0 +1,216 @@
import { z } from 'zod';
import { TRPCError } from '@trpc/server';
import { env } from '~/server/env.mjs';

import packageJson from '../../../../../package.json';

import { createTRPCRouter, publicProcedure } from '~/server/api/trpc.server';
import { fetchJsonOrTRPCError } from '~/server/api/trpc.serverutils';

import { LLM_IF_OAI_Chat, LLM_IF_OAI_Vision } from '../../store-llms';
import { listModelsOutputSchema, ModelDescriptionSchema } from '../llm.server.types';

import { fixupHost, openAIChatGenerateOutputSchema, OpenAIHistorySchema, openAIHistorySchema, OpenAIModelSchema, openAIModelSchema } from '../openai/openai.router';

import { GeminiBlockSafetyLevel, geminiBlockSafetyLevelSchema, GeminiContentSchema, GeminiGenerateContentRequest, geminiGeneratedContentResponseSchema, geminiModelsGenerateContentPath, geminiModelsListOutputSchema, geminiModelsListPath } from './gemini.wiretypes';


// Default hosts
const DEFAULT_GEMINI_HOST = 'https://generativelanguage.googleapis.com';


// Mappers

export function geminiAccess(access: GeminiAccessSchema, modelRefId: string | null, apiPath: string): { headers: HeadersInit, url: string } {

  const geminiKey = access.geminiKey || env.GEMINI_API_KEY || '';
  const geminiHost = fixupHost(DEFAULT_GEMINI_HOST, apiPath);

  // update model-dependent paths
  if (apiPath.includes('{model=models/*}')) {
    if (!modelRefId)
      throw new Error(`geminiAccess: modelRefId is required for ${apiPath}`);
    apiPath = apiPath.replace('{model=models/*}', modelRefId);
  }

  return {
    headers: {
      'Content-Type': 'application/json',
      'x-goog-api-client': `big-agi/${packageJson['version'] || '1.0.0'}`,
      'x-goog-api-key': geminiKey,
    },
    url: geminiHost + apiPath,
  };
}

/**
 * We specially encode the history to match the Gemini API requirements.
 * Gemini does not want 2 consecutive messages from the same role, so we alternate.
 * - System messages = [User, Model'Ok']
 * - User and Assistant messages are coalesced into a single message (e.g. [User, User, Assistant, Assistant, User] -> [User[2], Assistant[2], User[1]])
 */
export const geminiGenerateContentTextPayload = (model: OpenAIModelSchema, history: OpenAIHistorySchema, safety: GeminiBlockSafetyLevel, n: number): GeminiGenerateContentRequest => {

  // convert the history to a Gemini format
  const contents: GeminiContentSchema[] = [];
  for (const _historyElement of history) {

    const { role: msgRole, content: msgContent } = _historyElement;

    // System message - we treat it as per the example in https://ai.google.dev/tutorials/ai-studio_quickstart#chat_example
    if (msgRole === 'system') {
      contents.push({ role: 'user', parts: [{ text: msgContent }] });
      contents.push({ role: 'model', parts: [{ text: 'Ok' }] });
      continue;
    }

    // User or Assistant message
    const nextRole: GeminiContentSchema['role'] = msgRole === 'assistant' ? 'model' : 'user';
    if (contents.length && contents[contents.length - 1].role === nextRole) {
      // coalesce with the previous message
      contents[contents.length - 1].parts.push({ text: msgContent });
    } else {
      // create a new message
      contents.push({ role: nextRole, parts: [{ text: msgContent }] });
    }
  }

  return {
    contents,
    generationConfig: {
      ...(n >= 2 && { candidateCount: n }),
      ...(model.maxTokens && { maxOutputTokens: model.maxTokens }),
      temperature: model.temperature,
    },
    safetySettings: safety !== 'HARM_BLOCK_THRESHOLD_UNSPECIFIED' ? [
      { category: 'HARM_CATEGORY_SEXUALLY_EXPLICIT', threshold: safety },
      { category: 'HARM_CATEGORY_HATE_SPEECH', threshold: safety },
      { category: 'HARM_CATEGORY_HARASSMENT', threshold: safety },
      { category: 'HARM_CATEGORY_DANGEROUS_CONTENT', threshold: safety },
    ] : undefined,
  };
};


async function geminiGET<TOut extends object>(access: GeminiAccessSchema, modelRefId: string | null, apiPath: string /*, signal?: AbortSignal*/): Promise<TOut> {
  const { headers, url } = geminiAccess(access, modelRefId, apiPath);
  return await fetchJsonOrTRPCError<TOut>(url, 'GET', headers, undefined, 'Gemini');
}

async function geminiPOST<TOut extends object, TPostBody extends object>(access: GeminiAccessSchema, modelRefId: string | null, body: TPostBody, apiPath: string /*, signal?: AbortSignal*/): Promise<TOut> {
  const { headers, url } = geminiAccess(access, modelRefId, apiPath);
  return await fetchJsonOrTRPCError<TOut, TPostBody>(url, 'POST', headers, body, 'Gemini');
}


// Input/Output Schemas

export const geminiAccessSchema = z.object({
  dialect: z.enum(['gemini']),
  geminiKey: z.string(),
  minSafetyLevel: geminiBlockSafetyLevelSchema,
});
export type GeminiAccessSchema = z.infer<typeof geminiAccessSchema>;


const accessOnlySchema = z.object({
  access: geminiAccessSchema,
});

const chatGenerateInputSchema = z.object({
  access: geminiAccessSchema,
  model: openAIModelSchema, history: openAIHistorySchema,
  // functions: openAIFunctionsSchema.optional(), forceFunctionName: z.string().optional(),
});


/**
 * See https://github.com/google/generative-ai-js/tree/main/packages/main/src for
 * the official Google implementation.
 */
export const llmGeminiRouter = createTRPCRouter({

  /* [Gemini] models.list = /v1beta/models */
  listModels: publicProcedure
    .input(accessOnlySchema)
    .output(listModelsOutputSchema)
    .query(async ({ input }) => {

      // get the models
      const wireModels = await geminiGET(input.access, null, geminiModelsListPath);
      const detailedModels = geminiModelsListOutputSchema.parse(wireModels).models;

      // NOTE: no need to retrieve info for each of the models (e.g. /v1beta/model/gemini-pro),
      // as the List API already returns all the info on all the models

      // map to our output schema
      return {
        models: detailedModels.map((geminiModel) => {
          const { description, displayName, inputTokenLimit, name, outputTokenLimit, supportedGenerationMethods } = geminiModel;

          const contextWindow = inputTokenLimit + outputTokenLimit;
          const hidden = !supportedGenerationMethods.includes('generateContent');

          const { version, topK, topP, temperature } = geminiModel;
          const descriptionLong = description + ` (Version: ${version}, Defaults: temperature=${temperature}, topP=${topP}, topK=${topK}, interfaces=[${supportedGenerationMethods.join(',')}])`;

          // const isGeminiPro = name.includes('gemini-pro');
          const isGeminiProVision = name.includes('gemini-pro-vision');

          const interfaces: ModelDescriptionSchema['interfaces'] = [];
          if (supportedGenerationMethods.includes('generateContent')) {
            interfaces.push(LLM_IF_OAI_Chat);
            if (isGeminiProVision)
              interfaces.push(LLM_IF_OAI_Vision);
          }

          return {
            id: name,
            label: displayName,
            // created: ...
            // updated: ...
            description: descriptionLong,
            contextWindow: contextWindow,
            maxCompletionTokens: outputTokenLimit,
            // pricing: isGeminiPro ? { needs per-character and per-image pricing } : undefined,
            // rateLimits: isGeminiPro ? { reqPerMinute: 60 } : undefined,
            interfaces: supportedGenerationMethods.includes('generateContent') ? [LLM_IF_OAI_Chat] : [],
            hidden,
          } satisfies ModelDescriptionSchema;
        }),
      };
    }),


  /* [Gemini] models.generateContent = /v1/{model=models/*}:generateContent */
  chatGenerate: publicProcedure
    .input(chatGenerateInputSchema)
    .output(openAIChatGenerateOutputSchema)
    .mutation(async ({ input: { access, history, model } }) => {

      // generate the content
      const wireGeneration = await geminiPOST(access, model.id, geminiGenerateContentTextPayload(model, history, access.minSafetyLevel, 1), geminiModelsGenerateContentPath);
      const generation = geminiGeneratedContentResponseSchema.parse(wireGeneration);

      // only use the first result (and there should be only one)
      const singleCandidate = generation.candidates?.[0] ?? null;
      if (!singleCandidate || !singleCandidate.content?.parts.length)
        throw new TRPCError({
          code: 'INTERNAL_SERVER_ERROR',
          message: `Gemini chat-generation API issue: ${JSON.stringify(wireGeneration)}`,
        });

      if (!('text' in singleCandidate.content.parts[0]))
        throw new TRPCError({
          code: 'INTERNAL_SERVER_ERROR',
          message: `Gemini non-text chat-generation API issue: ${JSON.stringify(wireGeneration)}`,
        });

      return {
        role: 'assistant',
        content: singleCandidate.content.parts[0].text || '',
        finish_reason: singleCandidate.finishReason === 'STOP' ? 'stop' : null,
      };
    }),

});
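To make the coalescing rule in `geminiGenerateContentTextPayload` concrete, a worked example with hypothetical messages:

```typescript
// Input (OpenAI-style history):
//   { role: 'system',    content: 'Be terse' }
//   { role: 'user',      content: 'Hi' }
//   { role: 'user',      content: 'Still there?' }
//   { role: 'assistant', content: 'Yes' }
//
// Output `contents` (system expanded to a user/model pair, same-role runs merged):
//   { role: 'user',  parts: [{ text: 'Be terse' }] }
//   { role: 'model', parts: [{ text: 'Ok' }] }
//   { role: 'user',  parts: [{ text: 'Hi' }, { text: 'Still there?' }] }
//   { role: 'model', parts: [{ text: 'Yes' }] }
```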
@@ -0,0 +1,188 @@
import { z } from 'zod';

// PATHS

export const geminiModelsListPath = '/v1beta/models?pageSize=1000';
export const geminiModelsGenerateContentPath = '/v1beta/{model=models/*}:generateContent';
// see alt=sse on https://cloud.google.com/apis/docs/system-parameters#definitions
export const geminiModelsStreamGenerateContentPath = '/v1beta/{model=models/*}:streamGenerateContent?alt=sse';


// models.list = /v1beta/models

export const geminiModelsListOutputSchema = z.object({
  models: z.array(z.object({
    name: z.string(),
    version: z.string(),
    displayName: z.string(),
    description: z.string(),
    inputTokenLimit: z.number().int().min(1),
    outputTokenLimit: z.number().int().min(1),
    supportedGenerationMethods: z.array(z.enum([
      'countMessageTokens',
      'countTextTokens',
      'countTokens',
      'createTunedTextModel',
      'embedContent',
      'embedText',
      'generateAnswer',
      'generateContent',
      'generateMessage',
      'generateText',
    ])),
    temperature: z.number().optional(),
    topP: z.number().optional(),
    topK: z.number().optional(),
  })),
});


// /v1/{model=models/*}:generateContent, /v1beta/{model=models/*}:streamGenerateContent

// Request

const geminiContentPartSchema = z.union([

  // TextPart
  z.object({
    text: z.string().optional(),
  }),

  // InlineDataPart
  z.object({
    inlineData: z.object({
      mimeType: z.string(),
      data: z.string(), // base64-encoded string
    }),
  }),

  // A predicted FunctionCall returned from the model
  z.object({
    functionCall: z.object({
      name: z.string(),
      args: z.record(z.any()), // JSON object format
    }),
  }),

  // The result output of a FunctionCall
  z.object({
    functionResponse: z.object({
      name: z.string(),
      response: z.record(z.any()), // JSON object format
    }),
  }),
]);

const geminiToolSchema = z.object({
  functionDeclarations: z.array(z.object({
    name: z.string(),
    description: z.string(),
    parameters: z.record(z.any()).optional(), // Schema object format
  })).optional(),
});

const geminiHarmCategorySchema = z.enum([
  'HARM_CATEGORY_UNSPECIFIED',
  'HARM_CATEGORY_DEROGATORY',
  'HARM_CATEGORY_TOXICITY',
  'HARM_CATEGORY_VIOLENCE',
  'HARM_CATEGORY_SEXUAL',
  'HARM_CATEGORY_MEDICAL',
  'HARM_CATEGORY_DANGEROUS',
  'HARM_CATEGORY_HARASSMENT',
  'HARM_CATEGORY_HATE_SPEECH',
  'HARM_CATEGORY_SEXUALLY_EXPLICIT',
  'HARM_CATEGORY_DANGEROUS_CONTENT',
]);

export const geminiBlockSafetyLevelSchema = z.enum([
  'HARM_BLOCK_THRESHOLD_UNSPECIFIED',
  'BLOCK_LOW_AND_ABOVE',
  'BLOCK_MEDIUM_AND_ABOVE',
  'BLOCK_ONLY_HIGH',
  'BLOCK_NONE',
]);

export type GeminiBlockSafetyLevel = z.infer<typeof geminiBlockSafetyLevelSchema>;

const geminiSafetySettingSchema = z.object({
  'category': geminiHarmCategorySchema,
  'threshold': geminiBlockSafetyLevelSchema,
});

const geminiGenerationConfigSchema = z.object({
  stopSequences: z.array(z.string()).optional(),
  candidateCount: z.number().int().optional(),
  maxOutputTokens: z.number().int().optional(),
  temperature: z.number().optional(),
  topP: z.number().optional(),
  topK: z.number().int().optional(),
});

const geminiContentSchema = z.object({
  // Must be either 'user' or 'model'. Optional but must be set if there are multiple "Content" objects in the parent array.
  role: z.enum(['user', 'model']).optional(),
  // Ordered Parts that constitute a single message. Parts may have different MIME types.
  parts: z.array(geminiContentPartSchema),
});

export type GeminiContentSchema = z.infer<typeof geminiContentSchema>;

export const geminiGenerateContentRequest = z.object({
  contents: z.array(geminiContentSchema),
  tools: z.array(geminiToolSchema).optional(),
  safetySettings: z.array(geminiSafetySettingSchema).optional(),
  generationConfig: geminiGenerationConfigSchema.optional(),
});

export type GeminiGenerateContentRequest = z.infer<typeof geminiGenerateContentRequest>;


// Response

const geminiHarmProbabilitySchema = z.enum([
  'HARM_PROBABILITY_UNSPECIFIED',
  'NEGLIGIBLE',
  'LOW',
  'MEDIUM',
  'HIGH',
]);

const geminiSafetyRatingSchema = z.object({
  'category': geminiHarmCategorySchema,
  'probability': geminiHarmProbabilitySchema,
  'blocked': z.boolean().optional(),
});

const geminiFinishReasonSchema = z.enum([
  'FINISH_REASON_UNSPECIFIED',
  'STOP',
  'MAX_TOKENS',
  'SAFETY',
  'RECITATION',
  'OTHER',
]);

export const geminiGeneratedContentResponseSchema = z.object({
  // either all requested candidates are returned or no candidates at all
  // no candidates are returned only if there was something wrong with the prompt (see promptFeedback)
  candidates: z.array(z.object({
    index: z.number(),
    content: geminiContentSchema,
    finishReason: geminiFinishReasonSchema.optional(),
    safetyRatings: z.array(geminiSafetyRatingSchema),
    citationMetadata: z.object({
      startIndex: z.number().optional(),
      endIndex: z.number().optional(),
      uri: z.string().optional(),
      license: z.string().optional(),
    }).optional(),
    tokenCount: z.number().optional(),
    // groundingAttributions: z.array(GroundingAttribution).optional(), // This field is populated for GenerateAnswer calls.
  })).optional(),
  // NOTE: promptFeedback is only sent in the first chunk in a streaming response
  promptFeedback: z.object({
    blockReason: z.enum(['BLOCK_REASON_UNSPECIFIED', 'SAFETY', 'OTHER']).optional(),
    safetyRatings: z.array(geminiSafetyRatingSchema).optional(),
  }).optional(),
});
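For reference, a request object that satisfies the schemas above; all values are illustrative only, and the import path is the one this file declares:

```typescript
import type { GeminiGenerateContentRequest } from './gemini.wiretypes';

// Minimal well-formed generateContent payload per the zod schemas above.
const exampleRequest: GeminiGenerateContentRequest = {
  contents: [
    { role: 'user', parts: [{ text: 'Explain SSE in one sentence.' }] },
  ],
  generationConfig: { temperature: 0.5, maxOutputTokens: 1024 },
  safetySettings: [
    { category: 'HARM_CATEGORY_HARASSMENT', threshold: 'BLOCK_ONLY_HIGH' },
  ],
};
```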
+235
-165
@@ -4,12 +4,30 @@ import { createParser as createEventsourceParser, EventSourceParseCallback, Even
|
||||
|
||||
import { createEmptyReadableStream, debugGenerateCurlCommand, safeErrorString, SERVER_DEBUG_WIRE, serverFetchOrThrow } from '~/server/wire';
|
||||
|
||||
import type { AnthropicWire } from '../anthropic/anthropic.wiretypes';
|
||||
import type { OpenAIWire } from './openai.wiretypes';
|
||||
import { OLLAMA_PATH_CHAT, ollamaAccess, ollamaAccessSchema, ollamaChatCompletionPayload } from '../ollama/ollama.router';
|
||||
import { anthropicAccess, anthropicAccessSchema, anthropicChatCompletionPayload } from '../anthropic/anthropic.router';
|
||||
import { openAIAccess, openAIAccessSchema, openAIChatCompletionPayload, openAIHistorySchema, openAIModelSchema } from './openai.router';
|
||||
import { wireOllamaChunkedOutputSchema } from '../ollama/ollama.wiretypes';
|
||||
|
||||
// Anthropic server imports
|
||||
import type { AnthropicWire } from './anthropic/anthropic.wiretypes';
|
||||
import { anthropicAccess, anthropicAccessSchema, anthropicChatCompletionPayload } from './anthropic/anthropic.router';
|
||||
|
||||
// Gemini server imports
|
||||
import { geminiAccess, geminiAccessSchema, geminiGenerateContentTextPayload } from './gemini/gemini.router';
|
||||
import { geminiGeneratedContentResponseSchema, geminiModelsStreamGenerateContentPath } from './gemini/gemini.wiretypes';
|
||||
|
||||
// Ollama server imports
|
||||
import { wireOllamaChunkedOutputSchema } from './ollama/ollama.wiretypes';
|
||||
import { OLLAMA_PATH_CHAT, ollamaAccess, ollamaAccessSchema, ollamaChatCompletionPayload } from './ollama/ollama.router';
|
||||
|
||||
// OpenAI server imports
|
||||
import type { OpenAIWire } from './openai/openai.wiretypes';
|
||||
import { openAIAccess, openAIAccessSchema, openAIChatCompletionPayload, openAIHistorySchema, openAIModelSchema } from './openai/openai.router';
|
||||
|
||||
|
||||
/**
|
||||
* Event stream formats
|
||||
* - 'sse' is the default format, and is used by all vendors except Ollama
|
||||
* - 'json-nl' is used by Ollama
|
||||
*/
|
||||
type MuxingFormat = 'sse' | 'json-nl';
|
||||
|
||||
|
||||
/**
|
||||
@@ -20,49 +38,58 @@ import { wireOllamaChunkedOutputSchema } from '../ollama/ollama.wiretypes';
|
||||
* The peculiarity of our parser is the injection of a JSON structure at the beginning of the stream, to
|
||||
* communicate parameters before the text starts flowing to the client.
|
||||
*/
|
||||
export type AIStreamParser = (data: string) => { text: string, close: boolean };
|
||||
|
||||
type EventStreamFormat = 'sse' | 'json-nl';
|
||||
type AIStreamParser = (data: string) => { text: string, close: boolean };
|
||||
|
||||
|
||||
const chatStreamInputSchema = z.object({
|
||||
access: z.union([anthropicAccessSchema, ollamaAccessSchema, openAIAccessSchema]),
|
||||
model: openAIModelSchema, history: openAIHistorySchema,
|
||||
const chatStreamingInputSchema = z.object({
|
||||
access: z.union([anthropicAccessSchema, geminiAccessSchema, ollamaAccessSchema, openAIAccessSchema]),
|
||||
model: openAIModelSchema,
|
||||
history: openAIHistorySchema,
|
||||
});
|
||||
export type ChatStreamInputSchema = z.infer<typeof chatStreamInputSchema>;
|
||||
export type ChatStreamingInputSchema = z.infer<typeof chatStreamingInputSchema>;
|
||||
|
||||
const chatStreamFirstPacketSchema = z.object({
|
||||
const chatStreamingFirstOutputPacketSchema = z.object({
|
||||
model: z.string(),
|
||||
});
|
||||
export type ChatStreamFirstPacketSchema = z.infer<typeof chatStreamFirstPacketSchema>;
|
||||
export type ChatStreamingFirstOutputPacketSchema = z.infer<typeof chatStreamingFirstOutputPacketSchema>;
|
||||
|
||||
|
||||
export async function openaiStreamingRelayHandler(req: NextRequest): Promise<Response> {
|
||||
export async function llmStreamingRelayHandler(req: NextRequest): Promise<Response> {
|
||||
|
||||
// inputs - reuse the tRPC schema
|
||||
const { access, model, history } = chatStreamInputSchema.parse(await req.json());
|
||||
const body = await req.json();
|
||||
const { access, model, history } = chatStreamingInputSchema.parse(body);
|
||||
|
||||
// begin event streaming from the OpenAI API
|
||||
let headersUrl: { headers: HeadersInit, url: string } = { headers: {}, url: '' };
|
||||
// access/dialect dependent setup:
|
||||
// - requestAccess: the headers and URL to use for the upstream API call
|
||||
// - muxingFormat: the format of the event stream (sse or json-nl)
|
||||
// - vendorStreamParser: the parser to use for the event stream
|
||||
let upstreamResponse: Response;
|
||||
let requestAccess: { headers: HeadersInit, url: string } = { headers: {}, url: '' };
|
||||
let muxingFormat: MuxingFormat = 'sse';
|
||||
let vendorStreamParser: AIStreamParser;
|
||||
let eventStreamFormat: EventStreamFormat = 'sse';
|
||||
try {
|
||||
|
||||
// prepare the API request data
|
||||
let body: object;
|
||||
switch (access.dialect) {
|
||||
case 'anthropic':
|
||||
headersUrl = anthropicAccess(access, '/v1/complete');
|
||||
requestAccess = anthropicAccess(access, '/v1/complete');
|
||||
body = anthropicChatCompletionPayload(model, history, true);
|
||||
vendorStreamParser = createAnthropicStreamParser();
|
||||
vendorStreamParser = createStreamParserAnthropic();
|
||||
break;
|
||||
|
||||
case 'gemini':
|
||||
requestAccess = geminiAccess(access, model.id, geminiModelsStreamGenerateContentPath);
|
||||
body = geminiGenerateContentTextPayload(model, history, access.minSafetyLevel, 1);
|
||||
vendorStreamParser = createStreamParserGemini(model.id.replace('models/', ''));
|
||||
break;
|
||||
|
||||
case 'ollama':
|
||||
headersUrl = ollamaAccess(access, OLLAMA_PATH_CHAT);
|
||||
requestAccess = ollamaAccess(access, OLLAMA_PATH_CHAT);
|
||||
body = ollamaChatCompletionPayload(model, history, true);
|
||||
eventStreamFormat = 'json-nl';
|
||||
vendorStreamParser = createOllamaChatCompletionStreamParser();
|
||||
muxingFormat = 'json-nl';
|
||||
vendorStreamParser = createStreamParserOllama();
|
||||
break;
|
||||
|
||||
case 'azure':
|
||||
@@ -71,27 +98,27 @@ export async function openaiStreamingRelayHandler(req: NextRequest): Promise<Res
|
||||
case 'oobabooga':
|
||||
case 'openai':
|
||||
case 'openrouter':
|
||||
headersUrl = openAIAccess(access, model.id, '/v1/chat/completions');
|
||||
requestAccess = openAIAccess(access, model.id, '/v1/chat/completions');
|
||||
body = openAIChatCompletionPayload(model, history, null, null, 1, true);
|
||||
vendorStreamParser = createOpenAIStreamParser();
|
||||
vendorStreamParser = createStreamParserOpenAI();
|
||||
break;
|
||||
}
|
||||
|
||||
if (SERVER_DEBUG_WIRE)
|
||||
console.log('-> streaming:', debugGenerateCurlCommand('POST', headersUrl.url, headersUrl.headers, body));
|
||||
console.log('-> streaming:', debugGenerateCurlCommand('POST', requestAccess.url, requestAccess.headers, body));
|
||||
|
||||
// POST to our API route
|
||||
upstreamResponse = await serverFetchOrThrow(headersUrl.url, 'POST', headersUrl.headers, body);
|
||||
upstreamResponse = await serverFetchOrThrow(requestAccess.url, 'POST', requestAccess.headers, body);
|
||||
|
||||
} catch (error: any) {
|
||||
const fetchOrVendorError = safeErrorString(error) + (error?.cause ? ' · ' + error.cause : '');
|
||||
|
||||
// server-side admins message
|
||||
console.error(`/api/llms/stream: fetch issue:`, access.dialect, fetchOrVendorError, headersUrl?.url);
|
||||
console.error(`/api/llms/stream: fetch issue:`, access.dialect, fetchOrVendorError, requestAccess?.url);
|
||||
|
||||
// client-side users visible message
|
||||
return new NextResponse(`[Issue] ${access.dialect}: ${fetchOrVendorError}`
|
||||
+ (process.env.NODE_ENV === 'development' ? ` · [URL: ${headersUrl?.url}]` : ''), { status: 500 });
|
||||
+ (process.env.NODE_ENV === 'development' ? ` · [URL: ${requestAccess?.url}]` : ''), { status: 500 });
|
||||
}
|
||||
|
||||
/* The following code is heavily inspired by the Vercel AI SDK, but simplified to our needs and in full control.
|
||||
@@ -103,8 +130,12 @@ export async function openaiStreamingRelayHandler(req: NextRequest): Promise<Res
|
||||
* NOTE: we have not benchmarked to see if there is performance impact by using this approach - we do want to have
|
||||
* a 'healthy' level of inventory (i.e., pre-buffering) on the pipe to the client.
|
||||
*/
|
||||
const chatResponseStream = (upstreamResponse.body || createEmptyReadableStream())
|
||||
.pipeThrough(createEventStreamTransformer(vendorStreamParser, eventStreamFormat, access.dialect));
|
||||
const transformUpstreamToBigAgiClient = createEventStreamTransformer(
|
||||
muxingFormat, vendorStreamParser, access.dialect,
|
||||
);
|
||||
const chatResponseStream =
|
||||
(upstreamResponse.body || createEmptyReadableStream())
|
||||
.pipeThrough(transformUpstreamToBigAgiClient);
|
||||
|
||||
return new NextResponse(chatResponseStream, {
|
||||
status: 200,
|
||||
@@ -115,114 +146,44 @@ export async function openaiStreamingRelayHandler(req: NextRequest): Promise<Res
|
||||
}
|
||||
|
||||
|
||||
/// Event Parsers
|
||||
|
||||
function createAnthropicStreamParser(): AIStreamParser {
|
||||
let hasBegun = false;
|
||||
|
||||
return (data: string) => {
|
||||
|
||||
const json: AnthropicWire.Complete.Response = JSON.parse(data);
|
||||
let text = json.completion;
|
||||
|
||||
// hack: prepend the model name to the first packet
|
||||
if (!hasBegun) {
|
||||
hasBegun = true;
|
||||
const firstPacket: ChatStreamFirstPacketSchema = { model: json.model };
|
||||
text = JSON.stringify(firstPacket) + text;
|
||||
}
|
||||
|
||||
return { text, close: false };
|
||||
};
|
||||
}
|
||||
|
||||
function createOllamaChatCompletionStreamParser(): AIStreamParser {
|
||||
let hasBegun = false;
|
||||
|
||||
return (data: string) => {
|
||||
|
||||
// parse the JSON chunk
|
||||
let wireJsonChunk: any;
|
||||
try {
|
||||
wireJsonChunk = JSON.parse(data);
|
||||
} catch (error: any) {
|
||||
// log the malformed data to the console, and rethrow to transmit as 'error'
|
||||
console.log(`/api/llms/stream: Ollama parsing issue: ${error?.message || error}`, data);
|
||||
throw error;
|
||||
}
|
||||
|
||||
// validate chunk
|
||||
const chunk = wireOllamaChunkedOutputSchema.parse(wireJsonChunk);
|
||||
|
||||
// pass through errors from Ollama
|
||||
if ('error' in chunk)
|
||||
throw new Error(chunk.error);
|
||||
|
||||
// process output
|
||||
let text = chunk.message?.content || /*chunk.response ||*/ '';
|
||||
|
||||
// hack: prepend the model name to the first packet
|
||||
if (!hasBegun && chunk.model) {
|
||||
hasBegun = true;
|
||||
const firstPacket: ChatStreamFirstPacketSchema = { model: chunk.model };
|
||||
text = JSON.stringify(firstPacket) + text;
|
||||
}
|
||||
|
||||
return { text, close: chunk.done };
|
||||
};
|
||||
}
|
||||
|
||||
function createOpenAIStreamParser(): AIStreamParser {
|
||||
let hasBegun = false;
|
||||
let hasWarned = false;
|
||||
|
||||
return (data: string) => {
|
||||
|
||||
const json: OpenAIWire.ChatCompletion.ResponseStreamingChunk = JSON.parse(data);
|
||||
|
||||
// [OpenAI] an upstream error will be handled gracefully and transmitted as text (throw to transmit as 'error')
|
||||
if (json.error)
|
||||
return { text: `[OpenAI Issue] ${safeErrorString(json.error)}`, close: true };
|
||||
|
||||
// [OpenAI] if there's a warning, log it once
|
||||
if (json.warning && !hasWarned) {
|
||||
hasWarned = true;
|
||||
console.log('/api/llms/stream: OpenAI upstream warning:', json.warning);
|
||||
}
|
||||
|
||||
if (json.choices.length !== 1) {
|
||||
// [Azure] we seem to 'prompt_annotations' or 'prompt_filter_results' objects - which we will ignore to suppress the error
|
||||
if (json.id === '' && json.object === '' && json.model === '')
|
||||
return { text: '', close: false };
|
||||
throw new Error(`Expected 1 completion, got ${json.choices.length}`);
|
||||
}
|
||||
|
||||
const index = json.choices[0].index;
|
||||
if (index !== 0 && index !== undefined /* LocalAI hack/workaround until https://github.com/go-skynet/LocalAI/issues/788 */)
|
||||
throw new Error(`Expected completion index 0, got ${index}`);
|
||||
let text = json.choices[0].delta?.content /*|| json.choices[0]?.text*/ || '';
|
||||
|
||||
// hack: prepend the model name to the first packet
|
||||
if (!hasBegun) {
|
||||
hasBegun = true;
|
||||
const firstPacket: ChatStreamFirstPacketSchema = { model: json.model };
|
||||
text = JSON.stringify(firstPacket) + text;
|
||||
}
|
||||
|
||||
// [LocalAI] workaround: LocalAI doesn't send the [DONE] event, but similarly to OpenAI, it sends a "finish_reason" delta update
|
||||
const close = !!json.choices[0].finish_reason;
|
||||
return { text, close };
|
||||
};
|
||||
}
|
||||
|
||||
|
||||
// Event Stream Transformers
|
||||
|
||||
/**
|
||||
* Creates a parser for a 'JSON\n' non-event stream, to be swapped with an EventSource parser.
|
||||
* Ollama is the only vendor that uses this format.
|
||||
*/
|
||||
function createDemuxerJsonNewline(onParse: EventSourceParseCallback): EventSourceParser {
|
||||
let accumulator: string = '';
|
||||
return {
|
||||
// feeds a new chunk to the parser - we accumulate in case of partial data, and only execute on full lines
|
||||
feed: (chunk: string): void => {
|
||||
accumulator += chunk;
|
||||
if (accumulator.endsWith('\n')) {
|
||||
for (const jsonString of accumulator.split('\n').filter(line => !!line)) {
|
||||
const mimicEvent: ParsedEvent = {
|
||||
type: 'event',
|
||||
id: undefined,
|
||||
event: undefined,
|
||||
data: jsonString,
|
||||
};
|
||||
onParse(mimicEvent);
|
||||
}
|
||||
accumulator = '';
|
||||
}
|
||||
},
|
||||
|
||||
// resets the parser state - not useful with our driving of the parser
|
||||
reset: (): void => {
|
||||
console.error('createDemuxerJsonNewline.reset() not implemented');
|
||||
},
|
||||
};
|
||||
}
|
||||
|
||||
/**
|
||||
* Creates a TransformStream that parses events from an EventSource stream using a custom parser.
|
||||
* @returns {TransformStream<Uint8Array, string>} TransformStream parsing events.
|
||||
*/
|
||||
function createEventStreamTransformer(vendorTextParser: AIStreamParser, inputFormat: EventStreamFormat, dialectLabel: string): TransformStream<Uint8Array, Uint8Array> {
|
||||
function createEventStreamTransformer(muxingFormat: MuxingFormat, vendorTextParser: AIStreamParser, dialectLabel: string): TransformStream<Uint8Array, Uint8Array> {
|
||||
const textDecoder = new TextDecoder();
|
||||
const textEncoder = new TextEncoder();
|
||||
let eventSourceParser: EventSourceParser;
|
||||
@@ -265,10 +226,10 @@ function createEventStreamTransformer(vendorTextParser: AIStreamParser, inputFor
|
||||
}
|
||||
};
|
||||
|
||||
if (inputFormat === 'sse')
|
||||
if (muxingFormat === 'sse')
|
||||
eventSourceParser = createEventsourceParser(onNewEvent);
|
||||
else if (inputFormat === 'json-nl')
|
||||
eventSourceParser = createJsonNewlineParser(onNewEvent);
|
||||
else if (muxingFormat === 'json-nl')
|
||||
eventSourceParser = createDemuxerJsonNewline(onNewEvent);
|
||||
},
|
||||
|
||||
// stream=true is set because the data is not guaranteed to be final and un-chunked
|
||||
@@ -278,33 +239,142 @@ function createEventStreamTransformer(vendorTextParser: AIStreamParser, inputFor
|
||||
});
|
||||
}
|
||||
|
||||
/**
|
||||
* Creates a parser for a 'JSON\n' non-event stream, to be swapped with an EventSource parser.
|
||||
* Ollama is the only vendor that uses this format.
|
||||
*/
|
||||
function createJsonNewlineParser(onParse: EventSourceParseCallback): EventSourceParser {
|
||||
let accumulator: string = '';
|
||||
return {
|
||||
// feeds a new chunk to the parser - we accumulate in case of partial data, and only execute on full lines
|
||||
feed: (chunk: string): void => {
|
||||
accumulator += chunk;
|
||||
if (accumulator.endsWith('\n')) {
|
||||
for (const jsonString of accumulator.split('\n').filter(line => !!line)) {
|
||||
const mimicEvent: ParsedEvent = {
|
||||
type: 'event',
|
||||
id: undefined,
|
||||
event: undefined,
|
||||
data: jsonString,
|
||||
};
|
||||
onParse(mimicEvent);
|
||||
}
|
||||
accumulator = '';
|
||||
}
|
||||
},
|
||||
|
||||
// resets the parser state - not useful with our driving of the parser
|
||||
reset: (): void => {
|
||||
console.error('createJsonNewlineParser.reset() not implemented');
|
||||
},
|
||||
/// Stream Parsers
|
||||
|
||||
function createStreamParserAnthropic(): AIStreamParser {
|
||||
let hasBegun = false;
|
||||
|
||||
return (data: string) => {
|
||||
|
||||
const json: AnthropicWire.Complete.Response = JSON.parse(data);
|
||||
let text = json.completion;
|
||||
|
||||
// hack: prepend the model name to the first packet
|
||||
if (!hasBegun) {
|
||||
hasBegun = true;
|
||||
const firstPacket: ChatStreamingFirstOutputPacketSchema = { model: json.model };
|
||||
text = JSON.stringify(firstPacket) + text;
|
||||
}
|
||||
|
||||
return { text, close: false };
|
||||
};
|
||||
}
|
||||
|
||||
function createStreamParserGemini(modelName: string): AIStreamParser {
|
||||
let hasBegun = false;
|
||||
|
||||
// this can throw, it's catched upstream
|
||||
return (data: string) => {
|
||||
|
||||
// parse the JSON chunk
|
||||
const wireGenerationChunk = JSON.parse(data);
|
||||
const generationChunk = geminiGeneratedContentResponseSchema.parse(wireGenerationChunk);
|
||||
|
||||
// Prompt Safety Errors: pass through errors from Gemini
|
||||
if (generationChunk.promptFeedback?.blockReason) {
|
||||
const { blockReason, safetyRatings } = generationChunk.promptFeedback;
|
||||
return { text: `[Gemini Prompt Blocked] ${blockReason}: ${JSON.stringify(safetyRatings || 'Unknown Safety Ratings', null, 2)}`, close: true };
|
||||
}
|
||||
|
||||
// expect a single completion
|
||||
const singleCandidate = generationChunk.candidates?.[0] ?? null;
|
||||
if (!singleCandidate || !singleCandidate.content?.parts.length)
|
||||
throw new Error(`Gemini: expected 1 completion, got ${generationChunk.candidates?.length}`);
|
||||
|
||||
// expect a single part
|
||||
if (singleCandidate.content.parts.length !== 1 || !('text' in singleCandidate.content.parts[0]))
|
||||
throw new Error(`Gemini: expected 1 text part, got ${singleCandidate.content.parts.length}`);
|
||||
|
||||
// expect a single text in the part
|
||||
let text = singleCandidate.content.parts[0].text || '';
|
||||
|
||||
// hack: prepend the model name to the first packet
|
||||
if (!hasBegun) {
|
||||
hasBegun = true;
|
||||
const firstPacket: ChatStreamingFirstOutputPacketSchema = { model: modelName };
|
||||
text = JSON.stringify(firstPacket) + text;
|
||||
}
|
||||
|
||||
return { text, close: false };
|
||||
};
|
||||
}
|
||||
|
||||
function createStreamParserOllama(): AIStreamParser {
|
||||
let hasBegun = false;
|
||||
|
||||
return (data: string) => {
|
||||
|
||||
// parse the JSON chunk
|
||||
let wireJsonChunk: any;
|
||||
try {
|
||||
wireJsonChunk = JSON.parse(data);
|
||||
} catch (error: any) {
|
||||
// log the malformed data to the console, and rethrow to transmit as 'error'
|
||||
console.log(`/api/llms/stream: Ollama parsing issue: ${error?.message || error}`, data);
|
||||
throw error;
|
||||
}
|
||||
|
||||
// validate chunk
|
||||
const chunk = wireOllamaChunkedOutputSchema.parse(wireJsonChunk);
|
||||
|
||||
// pass through errors from Ollama
|
||||
if ('error' in chunk)
|
||||
throw new Error(chunk.error);
|
||||
|
||||
// process output
|
||||
let text = chunk.message?.content || /*chunk.response ||*/ '';
|
||||
|
||||
// hack: prepend the model name to the first packet
|
||||
if (!hasBegun && chunk.model) {
|
||||
hasBegun = true;
|
||||
const firstPacket: ChatStreamingFirstOutputPacketSchema = { model: chunk.model };
|
||||
text = JSON.stringify(firstPacket) + text;
|
||||
}
|
||||
|
||||
return { text, close: chunk.done };
|
||||
};
|
||||
}
function createStreamParserOpenAI(): AIStreamParser {
  let hasBegun = false;
  let hasWarned = false;

  return (data: string) => {

    const json: OpenAIWire.ChatCompletion.ResponseStreamingChunk = JSON.parse(data);

    // [OpenAI] an upstream error will be handled gracefully and transmitted as text (throw to transmit as 'error')
    if (json.error)
      return { text: `[OpenAI Issue] ${safeErrorString(json.error)}`, close: true };

    // [OpenAI] if there's a warning, log it once
    if (json.warning && !hasWarned) {
      hasWarned = true;
      console.log('/api/llms/stream: OpenAI upstream warning:', json.warning);
    }

    if (json.choices.length !== 1) {
      // [Azure] we seem to receive 'prompt_annotations' or 'prompt_filter_results' objects, which we ignore to suppress the error
      if (json.id === '' && json.object === '' && json.model === '')
        return { text: '', close: false };
      throw new Error(`Expected 1 completion, got ${json.choices.length}`);
    }

    const index = json.choices[0].index;
    if (index !== 0 && index !== undefined /* LocalAI hack/workaround until https://github.com/go-skynet/LocalAI/issues/788 */)
      throw new Error(`Expected completion index 0, got ${index}`);
    let text = json.choices[0].delta?.content /*|| json.choices[0]?.text*/ || '';

    // hack: prepend the model name to the first packet
    if (!hasBegun) {
      hasBegun = true;
      const firstPacket: ChatStreamingFirstOutputPacketSchema = { model: json.model };
      text = JSON.stringify(firstPacket) + text;
    }

    // [LocalAI] workaround: LocalAI doesn't send the [DONE] event, but similarly to OpenAI, it sends a "finish_reason" delta update
    const close = !!json.choices[0].finish_reason;
    return { text, close };
  };
}
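OpenAI-compatible endpoints frame their chunks as server-sent events, so a driver has to strip the `data: ` prefix and swallow the `[DONE]` sentinel before the JSON payload reaches this parser. A simplified sketch of that pre-processing step (real SSE also allows multi-line data fields, which this ignores):

```ts
// Sketch: unwrap SSE events into the JSON payloads createStreamParserOpenAI() expects.
// Note: LocalAI never sends the '[DONE]' sentinel, which is why the parser above
// also closes on a non-empty finish_reason.
function* sseEventsToJsonPayloads(sseText: string): Generator<string> {
  for (const rawLine of sseText.split('\n')) {
    const line = rawLine.trim();
    if (!line.startsWith('data:')) continue; // ignore comments, event:, id:, retry:
    const payload = line.slice('data:'.length).trim();
    if (payload === '[DONE]') return; // OpenAI end-of-stream sentinel
    yield payload;
  }
}
```
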
+11
-1
@@ -1,11 +1,18 @@
import { z } from 'zod';
import { LLM_IF_OAI_Chat, LLM_IF_OAI_Complete, LLM_IF_OAI_Fn, LLM_IF_OAI_Vision } from '../../store-llms';
import { LLM_IF_OAI_Chat, LLM_IF_OAI_Complete, LLM_IF_OAI_Fn, LLM_IF_OAI_Vision } from '../store-llms';


// Model Description: a superset of LLM model descriptors

const pricingSchema = z.object({
  cpmPrompt: z.number().optional(), // Cost per thousand prompt tokens
  cpmCompletion: z.number().optional(), // Cost per thousand completion tokens
});

// const rateLimitsSchema = z.object({
//   reqPerMinute: z.number().optional(),
// });

const modelDescriptionSchema = z.object({
  id: z.string(),
  label: z.string(),
@@ -15,9 +22,12 @@ const modelDescriptionSchema = z.object({
  contextWindow: z.number(),
  maxCompletionTokens: z.number().optional(),
  pricing: pricingSchema.optional(),
  // rateLimits: rateLimitsSchema.optional(),
  interfaces: z.array(z.enum([LLM_IF_OAI_Chat, LLM_IF_OAI_Fn, LLM_IF_OAI_Complete, LLM_IF_OAI_Vision])),
  hidden: z.boolean().optional(),
});

// this is also used by the Client
export type ModelDescriptionSchema = z.infer<typeof modelDescriptionSchema>;
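A quick sketch of how the `cpm*` pricing fields translate into a dollar figure: "cpm" is cost per mille (thousand) tokens, so token counts are divided by 1,000. The numbers below are illustrative, not real price data, and the fields elided by the diff hunk (between `label` and `contextWindow`) are not shown:

```ts
// Illustrative only: pricing in $ per 1,000 tokens, per the cpm* fields above.
const examplePricing = { cpmPrompt: 0.001, cpmCompletion: 0.002 };

const promptTokens = 1200, completionTokens = 300;
const costUsd =
  (examplePricing.cpmPrompt ?? 0) * promptTokens / 1000 +
  (examplePricing.cpmCompletion ?? 0) * completionTokens / 1000;
console.log(costUsd.toFixed(4)); // "0.0018"
```
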
export const listModelsOutputSchema = z.object({
+55
-50
@@ -6,54 +6,59 @@
 * from: https://ollama.ai/library?sort=featured
 */
export const OLLAMA_BASE_MODELS: { [key: string]: { description: string, pulls: number, added?: string } } = {
  'starling-lm': { description: 'Starling is a large language model trained by reinforcement learning from AI feedback focused on improving chatbot helpfulness.', pulls: 2353, added: '20231129' },
  'neural-chat': { description: 'A fine-tuned model based on Mistral with good coverage of domain and language.', pulls: 3089, added: '20231129' },
  'mistral': { description: 'The Mistral 7B model released by Mistral AI', pulls: 70300 },
  'yi': { description: 'A high-performing, bilingual base model.', pulls: 2673 },
  'llama2': { description: 'The most popular model for general use.', pulls: 141000 },
  'codellama': { description: 'A large language model that can use text prompts to generate and discuss code.', pulls: 71400 },
  'llama2-uncensored': { description: 'Uncensored Llama 2 model by George Sung and Jarrad Hope.', pulls: 30900 },
  'orca-mini': { description: 'A general-purpose model ranging from 3 billion parameters to 70 billion, suitable for entry-level hardware.', pulls: 26000 },
  'vicuna': { description: 'General use chat model based on Llama and Llama 2 with 2K to 16K context sizes.', pulls: 21800 },
  'wizard-vicuna-uncensored': { description: 'Wizard Vicuna Uncensored is a 7B, 13B, and 30B parameter model based on Llama 2 uncensored by Eric Hartford.', pulls: 13700 },
  'phind-codellama': { description: 'Code generation model based on CodeLlama.', pulls: 10600 },
  'zephyr': { description: 'Zephyr beta is a fine-tuned 7B version of mistral that was trained on on a mix of publicly available, synthetic datasets.', pulls: 10200 },
  'wizardcoder': { description: 'Llama based code generation model focused on Python.', pulls: 9895 },
  'mistral-openorca': { description: 'Mistral OpenOrca is a 7 billion parameter model, fine-tuned on top of the Mistral 7B model using the OpenOrca dataset.', pulls: 9256 },
  'nous-hermes': { description: 'General use models based on Llama and Llama 2 from Nous Research.', pulls: 8827 },
  'wizard-math': { description: 'Model focused on math and logic problems', pulls: 7849 },
  'llama2-chinese': { description: 'Llama 2 based model fine tuned to improve Chinese dialogue ability.', pulls: 7375 },
  'deepseek-coder': { description: 'DeepSeek Coder is trained from scratch on both 87% code and 13% natural language in English and Chinese. Each of the models are pre-trained on 2 trillion tokens.', pulls: 7335, added: '20231129' },
  'falcon': { description: 'A large language model built by the Technology Innovation Institute (TII) for use in summarization, text generation, and chat bots.', pulls: 6726 },
  'stable-beluga': { description: 'Llama 2 based model fine tuned on an Orca-style dataset. Originally called Free Willy.', pulls: 6272 },
  'codeup': { description: 'Great code generation model based on Llama2.', pulls: 5978 },
  'orca2': { description: 'Orca 2 is built by Microsoft research, and are a fine-tuned version of Meta\'s Llama 2 models. The model is designed to excel particularly in reasoning.', pulls: 5854, added: '20231129' },
  'everythinglm': { description: 'Uncensored Llama2 based model with 16k context size.', pulls: 5040 },
  'medllama2': { description: 'Fine-tuned Llama 2 model to answer medical questions based on an open source medical dataset.', pulls: 4648 },
  'wizardlm-uncensored': { description: 'Uncensored version of Wizard LM model.', pulls: 4536 },
  'dolphin2.2-mistral': { description: 'An instruct-tuned model based on Mistral. Version 2.2 is fine-tuned for improved conversation and empathy.', pulls: 3638 },
  'starcoder': { description: 'StarCoder is a code generation model trained on 80+ programming languages.', pulls: 3638 },
  'wizard-vicuna': { description: 'Wizard Vicuna is a 13B parameter model based on Llama 2 trained by MelodysDreamj.', pulls: 3485 },
  'openchat': { description: 'A family of open-source models trained on a wide variety of data, surpassing ChatGPT on various benchmarks.', pulls: 3438, added: '20231129' },
  'open-orca-platypus2': { description: 'Merge of the Open Orca OpenChat model and the Garage-bAInd Platypus 2 model. Designed for chat and code generation.', pulls: 3145 },
  'openhermes2.5-mistral': { description: 'OpenHermes 2.5 Mistral 7B is a Mistral 7B fine-tune, a continuation of OpenHermes 2 model, which trained on additional code datasets.', pulls: 3023 },
  'yarn-mistral': { description: 'An extension of Mistral to support a context of up to 128k tokens.', pulls: 2775 },
  'samantha-mistral': { description: 'A companion assistant trained in philosophy, psychology, and personal relationships. Based on Mistral.', pulls: 2192 },
  'sqlcoder': { description: 'SQLCoder is a code completion model fined-tuned on StarCoder for SQL generation tasks', pulls: 1973 },
  'yarn-llama2': { description: 'An extension of Llama 2 that supports a context of up to 128k tokens.', pulls: 1915 },
  'openhermes2-mistral': { description: 'OpenHermes 2 Mistral is a 7B model fine-tuned on Mistral with 900,000 entries of primarily GPT-4 generated data from open datasets.', pulls: 1690 },
  'meditron': { description: 'Open-source medical large language model adapted from Llama 2 to the medical domain.', pulls: 1667, added: '20231129' },
  'wizardlm': { description: 'General use 70 billion parameter model based on Llama 2.', pulls: 1379 },
  'mistrallite': { description: 'MistralLite is a fine-tuned model based on Mistral with enhanced capabilities of processing long contexts.', pulls: 1345 },
  'deepseek-llm': { description: 'An advanced language model crafted with 2 trillion bilingual tokens.', pulls: 1318, added: '20231129' },
  'dolphin2.1-mistral': { description: 'An instruct-tuned model based on Mistral and trained on a dataset filtered to remove alignment and bias.', pulls: 1302 },
  'codebooga': { description: 'A high-performing code instruct model created by merging two existing code models.', pulls: 1254 },
  'goliath': { description: 'A language model created by combining two fine-tuned Llama 2 70B models into one.', pulls: 946, added: '20231129' },
  'stablelm-zephyr': { description: 'A lightweight chat model allowing accurate, and responsive output without requiring high-end hardware.', pulls: 945, added: '20231210' },
  'nexusraven': { description: 'Nexus Raven is a 13B instruction tuned model for function calling tasks.', pulls: 860 },
  'magicoder': { description: '🎩 Magicoder is a family of 7B parameter models trained on 75K synthetic instruction data using OSS-Instruct, a novel approach to enlightening LLMs with open-source code snippets.', pulls: 816, added: '20231210' },
  'alfred': { description: 'A robust conversational model designed to be used for both chat and instruct use cases.', pulls: 804, added: '20231129' },
  'xwinlm': { description: 'Conversational model based on Llama 2 that performs competitively on various benchmarks.', pulls: 706 },
  'llama2': { description: 'The most popular model for general use.', pulls: 165600 },
  'mistral': { description: 'The 7B model released by Mistral AI, updated to version 0.2', pulls: 92200 },
  'llava': { description: '🌋 A novel end-to-end trained large multimodal model that combines a vision encoder and Vicuna for general-purpose visual and language understanding.', pulls: 3563, added: '20231215' },
  'mixtral': { description: 'A high-quality Mixture of Experts (MoE) model with open weights by Mistral AI.', pulls: 8277, added: '20231215' },
  'starling-lm': { description: 'Starling is a large language model trained by reinforcement learning from AI feedback focused on improving chatbot helpfulness.', pulls: 3657, added: '20231129' },
  'neural-chat': { description: 'A fine-tuned model based on Mistral with good coverage of domain and language.', pulls: 4647, added: '20231129' },
  'codellama': { description: 'A large language model that can use text prompts to generate and discuss code.', pulls: 79800 },
  'dolphin-mixtral': { description: 'An uncensored, fine-tuned model based on the Mixtral mixture of experts model that excels at coding tasks. Created by Eric Hartford.', pulls: 48400, added: '20231215' },
  'llama2-uncensored': { description: 'Uncensored Llama 2 model by George Sung and Jarrad Hope.', pulls: 36600 },
  'orca-mini': { description: 'A general-purpose model ranging from 3 billion parameters to 70 billion, suitable for entry-level hardware.', pulls: 30000 },
  'vicuna': { description: 'General use chat model based on Llama and Llama 2 with 2K to 16K context sizes.', pulls: 22700 },
  'wizard-vicuna-uncensored': { description: 'Wizard Vicuna Uncensored is a 7B, 13B, and 30B parameter model based on Llama 2 uncensored by Eric Hartford.', pulls: 15300 },
  'zephyr': { description: 'Zephyr beta is a fine-tuned 7B version of mistral that was trained on on a mix of publicly available, synthetic datasets.', pulls: 11500 },
  'phind-codellama': { description: 'Code generation model based on CodeLlama.', pulls: 11200 },
  'wizardcoder': { description: 'Llama based code generation model focused on Python.', pulls: 10700 },
  'deepseek-coder': { description: 'DeepSeek Coder is trained from scratch on both 87% code and 13% natural language in English and Chinese. Each of the models are pre-trained on 2 trillion tokens.', pulls: 10200 },
  'mistral-openorca': { description: 'Mistral OpenOrca is a 7 billion parameter model, fine-tuned on top of the Mistral 7B model using the OpenOrca dataset.', pulls: 9842 },
  'nous-hermes': { description: 'General use models based on Llama and Llama 2 from Nous Research.', pulls: 9071 },
  'wizard-math': { description: 'Model focused on math and logic problems', pulls: 8328 },
  'llama2-chinese': { description: 'Llama 2 based model fine tuned to improve Chinese dialogue ability.', pulls: 8111 },
  'orca2': { description: 'Orca 2 is built by Microsoft research, and are a fine-tuned version of Meta\'s Llama 2 models. The model is designed to excel particularly in reasoning.', pulls: 7492, added: '20231129' },
  'falcon': { description: 'A large language model built by the Technology Innovation Institute (TII) for use in summarization, text generation, and chat bots.', pulls: 7468 },
  'stable-beluga': { description: 'Llama 2 based model fine tuned on an Orca-style dataset. Originally called Free Willy.', pulls: 6468 },
  'codeup': { description: 'Great code generation model based on Llama2.', pulls: 6397 },
  'everythinglm': { description: 'Uncensored Llama2 based model with 16k context size.', pulls: 5347 },
  'medllama2': { description: 'Fine-tuned Llama 2 model to answer medical questions based on an open source medical dataset.', pulls: 5034 },
  'wizardlm-uncensored': { description: 'Uncensored version of Wizard LM model.', pulls: 4874 },
  'dolphin2.2-mistral': { description: 'An instruct-tuned model based on Mistral. Version 2.2 is fine-tuned for improved conversation and empathy.', pulls: 4686 },
  'openchat': { description: 'A family of open-source models trained on a wide variety of data, surpassing ChatGPT on various benchmarks. Updated to version 3.5-1210.', pulls: 4496, added: '20231129' },
  'starcoder': { description: 'StarCoder is a code generation model trained on 80+ programming languages.', pulls: 4331 },
  'openhermes2.5-mistral': { description: 'OpenHermes 2.5 Mistral 7B is a Mistral 7B fine-tune, a continuation of OpenHermes 2 model, which trained on additional code datasets.', pulls: 3722 },
  'wizard-vicuna': { description: 'Wizard Vicuna is a 13B parameter model based on Llama 2 trained by MelodysDreamj.', pulls: 3668 },
  'yi': { description: 'A high-performing, bilingual base model.', pulls: 3335 },
  'open-orca-platypus2': { description: 'Merge of the Open Orca OpenChat model and the Garage-bAInd Platypus 2 model. Designed for chat and code generation.', pulls: 3219 },
  'yarn-mistral': { description: 'An extension of Mistral to support a context of up to 128k tokens.', pulls: 3087 },
  'samantha-mistral': { description: 'A companion assistant trained in philosophy, psychology, and personal relationships. Based on Mistral.', pulls: 2518 },
  'sqlcoder': { description: 'SQLCoder is a code completion model fined-tuned on StarCoder for SQL generation tasks', pulls: 2338 },
  'meditron': { description: 'Open-source medical large language model adapted from Llama 2 to the medical domain.', pulls: 2216, added: '20231129' },
  'yarn-llama2': { description: 'An extension of Llama 2 that supports a context of up to 128k tokens.', pulls: 2201 },
  'stablelm-zephyr': { description: 'A lightweight chat model allowing accurate, and responsive output without requiring high-end hardware.', pulls: 1983, added: '20231210' },
  'openhermes2-mistral': { description: 'OpenHermes 2 Mistral is a 7B model fine-tuned on Mistral with 900,000 entries of primarily GPT-4 generated data from open datasets.', pulls: 1790 },
  'deepseek-llm': { description: 'An advanced language model crafted with 2 trillion bilingual tokens.', pulls: 1732, added: '20231129' },
  'dolphin2.1-mistral': { description: 'An instruct-tuned model based on Mistral and trained on a dataset filtered to remove alignment and bias.', pulls: 1598 },
  'mistrallite': { description: 'MistralLite is a fine-tuned model based on Mistral with enhanced capabilities of processing long contexts.', pulls: 1534 },
  'wizardlm': { description: 'General use 70 billion parameter model based on Llama 2.', pulls: 1454 },
  'codebooga': { description: 'A high-performing code instruct model created by merging two existing code models.', pulls: 1418 },
  'phi': { description: 'Phi-2: a 2.7B language model by Microsoft Research that demonstrates outstanding reasoning and language understanding capabilities.', pulls: 1304, added: '20231220' },
  'bakllava': { description: 'BakLLaVA is a multimodal model consisting of the Mistral 7B base model augmented with the LLaVA architecture.', pulls: 1189, added: '20231215' },
  'goliath': { description: 'A language model created by combining two fine-tuned Llama 2 70B models into one.', pulls: 1140, added: '20231129' },
  'nexusraven': { description: 'Nexus Raven is a 13B instruction tuned model for function calling tasks.', pulls: 1060 },
  'solar': { description: 'A compact, yet powerful 10.7B large language model designed for single-turn conversation.', pulls: 934 },
  'alfred': { description: 'A robust conversational model designed to be used for both chat and instruct use cases.', pulls: 902, added: '20231129' },
  'xwinlm': { description: 'Conversational model based on Llama 2 that performs competitively on various benchmarks.', pulls: 868 },
};
// export const OLLAMA_LAST_UPDATE: string = '20231210';
export const OLLAMA_PREV_UPDATE: string = '20231129';
// export const OLLAMA_LAST_UPDATE: string = '20231220';
export const OLLAMA_PREV_UPDATE: string = '20231210';
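The `added` fields and `OLLAMA_PREV_UPDATE` are `YYYYMMDD` strings, so plain lexicographic comparison is enough to decide which models deserve the NEW badge in the admin UI. A small sketch — the helper name is made up for illustration:

```ts
// Hypothetical helper: a model is 'new' if it was added after the previous refresh date.
// YYYYMMDD strings compare correctly with ordinary string comparison.
function isNewModel(modelId: string): boolean {
  const added = OLLAMA_BASE_MODELS[modelId]?.added;
  return !!added && added > OLLAMA_PREV_UPDATE;
}

console.log(isNewModel('phi'));    // true  ('20231220' > '20231210')
console.log(isNewModel('orca2'));  // false ('20231129' is not after '20231210')
console.log(isNewModel('llama2')); // false (no 'added' date)
```
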
+2
-2
@@ -5,12 +5,12 @@ import { createTRPCRouter, publicProcedure } from '~/server/api/trpc.server';
import { env } from '~/server/env.mjs';
import { fetchJsonOrTRPCError, fetchTextOrTRPCError } from '~/server/api/trpc.serverutils';

import { LLM_IF_OAI_Chat } from '../../../store-llms';
import { LLM_IF_OAI_Chat } from '../../store-llms';

import { capitalizeFirstLetter } from '~/common/util/textUtils';

import { fixupHost, openAIChatGenerateOutputSchema, OpenAIHistorySchema, openAIHistorySchema, OpenAIModelSchema, openAIModelSchema } from '../openai/openai.router';
import { listModelsOutputSchema, ModelDescriptionSchema } from '../server.schemas';
import { listModelsOutputSchema, ModelDescriptionSchema } from '../llm.server.types';

import { OLLAMA_BASE_MODELS, OLLAMA_PREV_UPDATE } from './ollama.models';
import { WireOllamaChatCompletionInput, wireOllamaChunkedOutputSchema } from './ollama.wiretypes';
+19
-17
@@ -1,8 +1,8 @@
import { SERVER_DEBUG_WIRE } from '~/server/wire';

import { LLM_IF_OAI_Chat, LLM_IF_OAI_Complete, LLM_IF_OAI_Fn, LLM_IF_OAI_Vision } from '../../../store-llms';
import { LLM_IF_OAI_Chat, LLM_IF_OAI_Complete, LLM_IF_OAI_Fn, LLM_IF_OAI_Vision } from '../../store-llms';

import type { ModelDescriptionSchema } from '../server.schemas';
import type { ModelDescriptionSchema } from '../llm.server.types';
import { wireMistralModelsListOutputSchema } from './mistral.wiretypes';


@@ -313,16 +313,16 @@ const orModelMap: { [id: string]: { name: string; cw: number; cp?: number; cc?:
  'huggingfaceh4/zephyr-7b-beta': { name: 'Hugging Face: Zephyr 7B', cw: 4096, cp: 0, cc: 0, unfilt: true },
  'openchat/openchat-7b': { name: 'OpenChat 3.5', cw: 8192, cp: 0, cc: 0, unfilt: true },
  'gryphe/mythomist-7b': { name: 'MythoMist 7B', cw: 32768, cp: 0, cc: 0, unfilt: true },
  'openrouter/cinematika-7b': { name: 'Cinematika 7B (alpha)', cw: 32000, cp: 0, cc: 0, unfilt: true },
  'mistralai/mixtral-8x7b-instruct': { name: 'Mistral: Mixtral 8x7B Instruct (beta)', cw: 32000, cp: 0, cc: 0, unfilt: true },
  'openrouter/cinematika-7b': { name: 'Cinematika 7B (alpha)', cw: 32768, cp: 0, cc: 0, unfilt: true },
  'rwkv/rwkv-5-world-3b': { name: 'RWKV v5 World 3B (beta)', cw: 10000, cp: 0, cc: 0, unfilt: true },
  'recursal/rwkv-5-3b-ai-town': { name: 'RWKV v5 3B AI Town (beta)', cw: 10000, cp: 0, cc: 0, unfilt: true },
  'jebcarter/psyfighter-13b': { name: 'Psyfighter 13B', cw: 4096, cp: 0.001, cc: 0.001, unfilt: true },
  'koboldai/psyfighter-13b-2': { name: 'Psyfighter v2 13B', cw: 4096, cp: 0.001, cc: 0.001, unfilt: true },
  'jebcarter/psyfighter-13b': { name: 'Psyfighter 13B', cw: 4096, cp: 0.0001, cc: 0.0001, unfilt: true },
  'koboldai/psyfighter-13b-2': { name: 'Psyfighter v2 13B', cw: 4096, cp: 0.0001, cc: 0.0001, unfilt: true },
  'nousresearch/nous-hermes-llama2-13b': { name: 'Nous: Hermes 13B', cw: 4096, cp: 0.000075, cc: 0.000075, unfilt: true },
  'meta-llama/codellama-34b-instruct': { name: 'Meta: CodeLlama 34B Instruct', cw: 8192, cp: 0.0002, cc: 0.0002, unfilt: true },
  'phind/phind-codellama-34b': { name: 'Phind: CodeLlama 34B v2', cw: 4096, cp: 0.0002, cc: 0.0002, unfilt: true },
  'intel/neural-chat-7b': { name: 'Neural Chat 7B v3.1', cw: 4096, cp: 0.0025, cc: 0.0025, unfilt: true },
  'mistralai/mixtral-8x7b-instruct': { name: 'Mistral: Mixtral 8x7B Instruct (beta)', cw: 32768, cp: 0.0003, cc: 0.0003, unfilt: true },
  'haotian-liu/llava-13b': { name: 'Llava 13B', cw: 2048, cp: 0.0025, cc: 0.0025, unfilt: true },
  'nousresearch/nous-hermes-2-vision-7b': { name: 'Nous: Hermes 2 Vision 7B (alpha)', cw: 4096, cp: 0.0025, cc: 0.0025, unfilt: true },
  'meta-llama/llama-2-13b-chat': { name: 'Meta: Llama v2 13B Chat', cw: 4096, cp: 0.000156755, cc: 0.000156755, unfilt: true },
@@ -334,10 +334,12 @@ const orModelMap: { [id: string]: { name: string; cw: number; cp?: number; cc?:
  'openai/gpt-4-32k': { name: 'OpenAI: GPT-4 32k', cw: 32767, cp: 0.06, cc: 0.12, unfilt: false },
  'openai/gpt-4-vision-preview': { name: 'OpenAI: GPT-4 Vision (preview)', cw: 128000, cp: 0.01, cc: 0.03, unfilt: false },
  'openai/gpt-3.5-turbo-instruct': { name: 'OpenAI: GPT-3.5 Turbo Instruct', cw: 4095, cp: 0.0015, cc: 0.002, unfilt: false },
  'google/palm-2-chat-bison': { name: 'Google: PaLM 2 Chat', cw: 9216, cp: 0.0005, cc: 0.0005, unfilt: true },
  'google/palm-2-codechat-bison': { name: 'Google: PaLM 2 Code Chat', cw: 7168, cp: 0.0005, cc: 0.0005, unfilt: true },
  'google/palm-2-chat-bison-32k': { name: 'Google: PaLM 2 Chat 32k', cw: 32000, cp: 0.0005, cc: 0.0005, unfilt: true },
  'google/palm-2-codechat-bison-32k': { name: 'Google: PaLM 2 Code Chat 32k', cw: 32000, cp: 0.0005, cc: 0.0005, unfilt: true },
  'google/palm-2-chat-bison': { name: 'Google: PaLM 2 Chat', cw: 36864, cp: 0.00025, cc: 0.0005, unfilt: true },
  'google/palm-2-codechat-bison': { name: 'Google: PaLM 2 Code Chat', cw: 28672, cp: 0.00025, cc: 0.0005, unfilt: true },
  'google/palm-2-chat-bison-32k': { name: 'Google: PaLM 2 Chat 32k', cw: 131072, cp: 0.00025, cc: 0.0005, unfilt: true },
  'google/palm-2-codechat-bison-32k': { name: 'Google: PaLM 2 Code Chat 32k', cw: 131072, cp: 0.00025, cc: 0.0005, unfilt: true },
  'google/gemini-pro': { name: 'Google: Gemini Pro (preview)', cw: 131040, cp: 0.00025, cc: 0.0005, unfilt: true },
  'google/gemini-pro-vision': { name: 'Google: Gemini Pro Vision (preview)', cw: 65536, cp: 0.00025, cc: 0.0005, unfilt: true },
  'perplexity/pplx-70b-online': { name: 'Perplexity: PPLX 70B Online', cw: 4096, cp: 0, cc: 0.0028, unfilt: true },
  'perplexity/pplx-7b-online': { name: 'Perplexity: PPLX 7B Online', cw: 4096, cp: 0, cc: 0.00028, unfilt: true },
  'perplexity/pplx-7b-chat': { name: 'Perplexity: PPLX 7B Chat', cw: 8192, cp: 0.00007, cc: 0.00028, unfilt: true },
@@ -347,7 +349,7 @@ const orModelMap: { [id: string]: { name: string; cw: number; cp?: number; cc?:
  'nousresearch/nous-capybara-34b': { name: 'Nous: Capybara 34B', cw: 32000, cp: 0.0007, cc: 0.0028, unfilt: true },
  'jondurbin/airoboros-l2-70b': { name: 'Airoboros 70B', cw: 4096, cp: 0.0007, cc: 0.00095, unfilt: true },
  'migtissera/synthia-70b': { name: 'Synthia 70B', cw: 8192, cp: 0.00375, cc: 0.00375, unfilt: true },
  'open-orca/mistral-7b-openorca': { name: 'Mistral OpenOrca 7B', cw: 8192, cp: 0.0002, cc: 0.0002, unfilt: true },
  'open-orca/mistral-7b-openorca': { name: 'Mistral OpenOrca 7B', cw: 8192, cp: 0.0001425006, cc: 0.0001425006, unfilt: true },
  'teknium/openhermes-2-mistral-7b': { name: 'OpenHermes 2 Mistral 7B', cw: 4096, cp: 0.0002, cc: 0.0002, unfilt: true },
  'teknium/openhermes-2.5-mistral-7b': { name: 'OpenHermes 2.5 Mistral 7B', cw: 4096, cp: 0.0002, cc: 0.0002, unfilt: true },
  'pygmalionai/mythalion-13b': { name: 'Pygmalion: Mythalion 13B', cw: 8192, cp: 0.001125, cc: 0.001125, unfilt: true },
@@ -361,9 +363,9 @@ const orModelMap: { [id: string]: { name: string; cw: number; cp?: number; cc?:
  '01-ai/yi-34b-chat': { name: 'Yi 34B Chat', cw: 4096, cp: 0.0008, cc: 0.0008, unfilt: true },
  '01-ai/yi-34b': { name: 'Yi 34B (base)', cw: 4096, cp: 0.0008, cc: 0.0008, unfilt: true },
  '01-ai/yi-6b': { name: 'Yi 6B (base)', cw: 4096, cp: 0.00014, cc: 0.00014, unfilt: true },
  'togethercomputer/stripedhyena-nous-7b': { name: 'StripedHyena Nous 7B', cw: 32000, cp: 0.0002, cc: 0.0002, unfilt: true },
  'togethercomputer/stripedhyena-hessian-7b': { name: 'StripedHyena Hessian 7B (base)', cw: 32000, cp: 0.0002, cc: 0.0002, unfilt: true },
  'mistralai/mixtral-8x7b': { name: 'Mistral: Mixtral 8x7B (base) (beta)', cw: 32000, cp: 0.0006, cc: 0.0006, unfilt: true },
  'togethercomputer/stripedhyena-nous-7b': { name: 'StripedHyena Nous 7B', cw: 32768, cp: 0.0002, cc: 0.0002, unfilt: true },
  'togethercomputer/stripedhyena-hessian-7b': { name: 'StripedHyena Hessian 7B (base)', cw: 32768, cp: 0.0002, cc: 0.0002, unfilt: true },
  'mistralai/mixtral-8x7b': { name: 'Mistral: Mixtral 8x7B (base) (beta)', cw: 32768, cp: 0.0006, cc: 0.0006, unfilt: true },
  'anthropic/claude-2': { name: 'Anthropic: Claude v2.1', cw: 200000, cp: 0.008, cc: 0.024, unfilt: false },
  'anthropic/claude-2.0': { name: 'Anthropic: Claude v2.0', cw: 100000, cp: 0.008, cc: 0.024, unfilt: false },
  'anthropic/claude-instant-v1': { name: 'Anthropic: Claude Instant v1', cw: 100000, cp: 0.00163, cc: 0.00551, unfilt: false },
@@ -382,10 +384,10 @@ const orModelMap: { [id: string]: { name: string; cw: number; cp?: number; cc?:
};

const orModelFamilyOrder = [
  // great models
  'mistralai/mixtral-8x7b-instruct', 'mistralai/mistral-7b-instruct', 'nousresearch/nous-capybara-7b',
  // great models (picked by hand, they're free)
  'mistralai/mistral-7b-instruct', 'nousresearch/nous-capybara-7b',
  // great orgs
  'huggingfaceh4/', 'openchat/', 'anthropic/', 'google/', 'openai/', 'meta-llama/', 'phind/',
  'huggingfaceh4/', 'openchat/', 'anthropic/', 'google/', 'mistralai/', 'openai/', 'meta-llama/', 'phind/',
];

export function openRouterModelFamilySortFn(a: { id: string }, b: { id: string }): number {
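The hunk cuts off at the signature. A plausible sketch of how a family-order comparator like this one could work, based purely on the `orModelFamilyOrder` prefix list above — an assumption, not the project's actual function body:

```ts
// Sketch: rank ids by the first matching prefix in orModelFamilyOrder;
// unlisted families sort after all listed ones, ties broken alphabetically.
function familyRank(id: string): number {
  const idx = orModelFamilyOrder.findIndex(prefix => id.startsWith(prefix));
  return idx === -1 ? orModelFamilyOrder.length : idx;
}

function sortFnSketch(a: { id: string }, b: { id: string }): number {
  return familyRank(a.id) - familyRank(b.id) || a.id.localeCompare(b.id);
}
```
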
+1
-1
@@ -8,7 +8,7 @@ import { fetchJsonOrTRPCError } from '~/server/api/trpc.serverutils';
import { Brand } from '~/common/app.config';

import type { OpenAIWire } from './openai.wiretypes';
import { listModelsOutputSchema, ModelDescriptionSchema } from '../server.schemas';
import { listModelsOutputSchema, ModelDescriptionSchema } from '../llm.server.types';
import { localAIModelToModelDescription, mistralModelsSort, mistralModelToModelDescription, oobaboogaModelToModelDescription, openAIModelToModelDescription, openRouterModelFamilySortFn, openRouterModelToModelDescription } from './models.data';


@@ -2,7 +2,7 @@ import { create } from 'zustand';
import { shallow } from 'zustand/shallow';
import { persist } from 'zustand/middleware';

import type { IModelVendor, ModelVendorId } from './vendors/IModelVendor';
import type { ModelVendorId } from './vendors/vendors.registry';
import type { SourceSetupOpenRouter } from './vendors/openrouter/openrouter.vendor';


@@ -16,6 +16,7 @@ export interface DLLM<TSourceSetup = unknown, TLLMOptions = unknown> {
  updated?: number | 0;
  description: string;
  tags: string[]; // UNUSED for now
  // modelcaps: DModelCapability[];
  contextTokens: number;
  maxOutputTokens: number;
  hidden: boolean;
@@ -30,6 +31,17 @@ export interface DLLM<TSourceSetup = unknown, TLLMOptions = unknown> {

export type DLLMId = string;

// export type DModelCapability =
//   | 'input-text'
//   | 'input-image-data'
//   | 'input-multipart'
//   | 'output-text'
//   | 'output-function'
//   | 'output-image-data'
//   | 'if-chat'
//   | 'if-fast-chat'
//   ;

// Model interfaces (chat, and function calls) - here as a preview, will be used more broadly in the future
export const LLM_IF_OAI_Chat = 'oai-chat';
export const LLM_IF_OAI_Vision = 'oai-vision';
@@ -269,32 +281,3 @@ export function useChatLLM() {
  }, shallow);
}

/**
 * Source-specific read/write - great time saver
 */
export function useSourceSetup<TSourceSetup, TAccess>(sourceId: DModelSourceId, vendor: IModelVendor<TSourceSetup, TAccess>) {

  // invalidates only when the setup changes
  const { updateSourceSetup, ...rest } = useModelsStore(state => {

    // find the source (or null)
    const source: DModelSource<TSourceSetup> | null = state.sources.find(source => source.id === sourceId) as DModelSource<TSourceSetup> ?? null;

    // (safe) source-derived properties
    const sourceSetupValid = (source?.setup && vendor?.validateSetup) ? vendor.validateSetup(source.setup as TSourceSetup) : false;
    const sourceLLMs = source ? state.llms.filter(llm => llm._source === source) : [];
    const access = vendor.getTransportAccess(source?.setup);

    return {
      source,
      access,
      sourceHasLLMs: !!sourceLLMs.length,
      sourceSetupValid,
      updateSourceSetup: state.updateSourceSetup,
    };
  }, shallow);

  // convenience function for this source
  const updateSetup = (partialSetup: Partial<TSourceSetup>) => updateSourceSetup<TSourceSetup>(sourceId, partialSetup);
  return { ...rest, updateSetup };
}
@@ -1,34 +0,0 @@
import type { DLLMId } from '../store-llms';
import type { OpenAIWire } from './server/openai/openai.wiretypes';
import { findVendorForLlmOrThrow } from '../vendors/vendors.registry';


export interface VChatMessageIn {
  role: 'assistant' | 'system' | 'user'; // | 'function';
  content: string;
  //name?: string; // when role: 'function'
}

export type VChatFunctionIn = OpenAIWire.ChatCompletion.RequestFunctionDef;

export interface VChatMessageOut {
  role: 'assistant' | 'system' | 'user';
  content: string;
  finish_reason: 'stop' | 'length' | null;
}

export interface VChatMessageOrFunctionCallOut extends VChatMessageOut {
  function_name: string;
  function_arguments: object | null;
}


export async function callChatGenerate(llmId: DLLMId, messages: VChatMessageIn[], maxTokens?: number): Promise<VChatMessageOut> {
  const { llm, vendor } = findVendorForLlmOrThrow(llmId);
  return await vendor.callChatGenerate(llm, messages, maxTokens);
}

export async function callChatGenerateWithFunctions(llmId: DLLMId, messages: VChatMessageIn[], functions: VChatFunctionIn[], forceFunctionName: string | null, maxTokens?: number): Promise<VChatMessageOrFunctionCallOut> {
  const { llm, vendor } = findVendorForLlmOrThrow(llmId);
  return await vendor.callChatGenerateWF(llm, messages, functions, forceFunctionName, maxTokens);
}
+29
-9
@@ -1,13 +1,12 @@
import type React from 'react';
import type { TRPCClientErrorBase } from '@trpc/client';

import type { DLLM, DModelSourceId } from '../store-llms';
import type { VChatFunctionIn, VChatMessageIn, VChatMessageOrFunctionCallOut, VChatMessageOut } from '../transports/chatGenerate';
import type { DLLM, DLLMId, DModelSourceId } from '../store-llms';
import type { ModelDescriptionSchema } from '../server/llm.server.types';
import type { ModelVendorId } from './vendors.registry';
import type { VChatFunctionIn, VChatMessageIn, VChatMessageOrFunctionCallOut, VChatMessageOut } from '~/modules/llms/llm.client';


export type ModelVendorId = 'anthropic' | 'azure' | 'localai' | 'mistral' | 'ollama' | 'oobabooga' | 'openai' | 'openrouter';

export type ModelVendorRegistryType = Record<ModelVendorId, IModelVendor>;

export interface IModelVendor<TSourceSetup = unknown, TAccess = unknown, TLLMOptions = unknown, TDLLM = DLLM<TSourceSetup, TLLMOptions>> {
  readonly id: ModelVendorId;
  readonly name: string;
@@ -30,7 +29,28 @@ export interface IModelVendor<TSourceSetup = unknown, TAccess = unknown, TLLMOpt

  getTransportAccess(setup?: Partial<TSourceSetup>): TAccess;

  callChatGenerate(llm: TDLLM, messages: VChatMessageIn[], maxTokens?: number): Promise<VChatMessageOut>;
  rpcUpdateModelsQuery: (
    access: TAccess,
    enabled: boolean,
    onSuccess: (data: { models: ModelDescriptionSchema[] }) => void,
  ) => { isFetching: boolean, refetch: () => void, isError: boolean, error: TRPCClientErrorBase<any> | null };

  callChatGenerateWF(llm: TDLLM, messages: VChatMessageIn[], functions: null | VChatFunctionIn[], forceFunctionName: null | string, maxTokens?: number): Promise<VChatMessageOrFunctionCallOut>;
}
  rpcChatGenerateOrThrow: (
    access: TAccess,
    llmOptions: TLLMOptions,
    messages: VChatMessageIn[],
    functions: VChatFunctionIn[] | null, forceFunctionName: string | null,
    maxTokens?: number,
  ) => Promise<VChatMessageOut | VChatMessageOrFunctionCallOut>;

  streamingChatGenerateOrThrow: (
    access: TAccess,
    llmId: DLLMId,
    llmOptions: TLLMOptions,
    messages: VChatMessageIn[],
    functions: VChatFunctionIn[] | null, forceFunctionName: string | null,
    abortSignal: AbortSignal,
    onUpdate: (update: Partial<{ text: string, typing: boolean, originLLM: string }>, done: boolean) => void,
  ) => Promise<void>;

}
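
A sketch of how a caller could drive the new streaming entry point, accumulating text through `onUpdate` and cancelling via the abort signal. The `vendor`, `access`, and `llmOptions` values are placeholders, and whether `update.text` is cumulative or incremental is not visible in this diff — the sketch assumes cumulative:

```ts
// Sketch: consume streamingChatGenerateOrThrow with an accumulator and cancellation.
// Assumption: update.text carries the full text-so-far; vendor/access/llmOptions
// stand in for real vendor-specific values.
async function streamOnce(vendor: IModelVendor, access: unknown, llmId: DLLMId, llmOptions: unknown): Promise<string> {
  const controller = new AbortController();
  let lastText = '';
  await vendor.streamingChatGenerateOrThrow(
    access,
    llmId,
    llmOptions,
    [{ role: 'user', content: 'Hello!' }],
    null, null,        // no functions, no forced function name
    controller.signal, // call controller.abort() to cancel mid-stream
    (update, done) => {
      if (update.text !== undefined) lastText = update.text;
      if (done) console.log('stream finished, model:', update.originLLM);
    },
  );
  return lastText;
}
```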

+5
-11
@@ -7,11 +7,11 @@ import { FormTextField } from '~/common/components/forms/FormTextField';
import { InlineError } from '~/common/components/InlineError';
import { Link } from '~/common/components/Link';
import { SetupFormRefetchButton } from '~/common/components/forms/SetupFormRefetchButton';
import { apiQuery } from '~/common/util/trpc.client';
import { useToggleableBoolean } from '~/common/util/useToggleableBoolean';

import { DModelSourceId, useModelsStore, useSourceSetup } from '../../store-llms';
import { modelDescriptionToDLLM } from '../openai/OpenAISourceSetup';
import { DModelSourceId } from '../../store-llms';
import { useLlmUpdateModels } from '../useLlmUpdateModels';
import { useSourceSetup } from '../useSourceSetup';

import { isValidAnthropicApiKey, ModelVendorAnthropic } from './anthropic.vendor';

@@ -34,14 +34,8 @@ export function AnthropicSourceSetup(props: { sourceId: DModelSourceId }) {
  const shallFetchSucceed = anthropicKey ? keyValid : (!needsUserKey || !!anthropicHost);

  // fetch models
  const { isFetching, refetch, isError, error } = apiQuery.llmAnthropic.listModels.useQuery({ access }, {
    enabled: !sourceHasLLMs && shallFetchSucceed,
    onSuccess: models => source && useModelsStore.getState().setLLMs(
      models.models.map(model => modelDescriptionToDLLM(model, source)),
      props.sourceId,
    ),
    staleTime: Infinity,
  });
  const { isFetching, refetch, isError, error } =
    useLlmUpdateModels(ModelVendorAnthropic, access, !sourceHasLLMs && shallFetchSucceed, source);

  return <>

+41
-35
@@ -1,11 +1,12 @@
import { backendCaps } from '~/modules/backend/state-backend';

import { AnthropicIcon } from '~/common/components/icons/AnthropicIcon';
import { apiAsync } from '~/common/util/trpc.client';
import { apiAsync, apiQuery } from '~/common/util/trpc.client';

import type { AnthropicAccessSchema } from '../../server/anthropic/anthropic.router';
import type { IModelVendor } from '../IModelVendor';
import type { AnthropicAccessSchema } from '../../transports/server/anthropic/anthropic.router';
import type { VChatMessageIn, VChatMessageOrFunctionCallOut, VChatMessageOut } from '../../transports/chatGenerate';
import type { VChatMessageOut } from '../../llm.client';
import { unifiedStreamingClient } from '../unifiedStreamingClient';

import { LLMOptionsOpenAI } from '../openai/openai.vendor';
import { OpenAILLMOptions } from '../openai/OpenAILLMOptions';
@@ -14,7 +15,7 @@ import { AnthropicSourceSetup } from './AnthropicSourceSetup';


// special symbols
export const isValidAnthropicApiKey = (apiKey?: string) => !!apiKey && (apiKey.startsWith('sk-') ? apiKey.length > 40 : apiKey.length >= 40);
export const isValidAnthropicApiKey = (apiKey?: string) => !!apiKey && (apiKey.startsWith('sk-') ? apiKey.length >= 39 : apiKey.length >= 40);

export interface SourceSetupAnthropic {
  anthropicKey: string;
@@ -42,37 +43,42 @@ export const ModelVendorAnthropic: IModelVendor<SourceSetupAnthropic, AnthropicA
    anthropicHost: partialSetup?.anthropicHost || null,
    heliconeKey: partialSetup?.heliconeKey || null,
  }),
  callChatGenerate(llm, messages: VChatMessageIn[], maxTokens?: number): Promise<VChatMessageOut> {
    return anthropicCallChatGenerate(this.getTransportAccess(llm._source.setup), llm.options, messages, /*null, null,*/ maxTokens);

  // List Models
  rpcUpdateModelsQuery: (access, enabled, onSuccess) => {
    return apiQuery.llmAnthropic.listModels.useQuery({ access }, {
      enabled: enabled,
      onSuccess: onSuccess,
      refetchOnWindowFocus: false,
      staleTime: Infinity,
    });
  },
  callChatGenerateWF(): Promise<VChatMessageOrFunctionCallOut> {
    throw new Error('Anthropic does not support "Functions" yet');

  // Chat Generate (non-streaming) with Functions
  rpcChatGenerateOrThrow: async (access, llmOptions, messages, functions, forceFunctionName, maxTokens) => {
    if (functions?.length || forceFunctionName)
      throw new Error('Anthropic does not support functions');

    const { llmRef, llmTemperature = 0.5, llmResponseTokens } = llmOptions;
    try {
      return await apiAsync.llmAnthropic.chatGenerate.mutate({
        access,
        model: {
          id: llmRef!,
          temperature: llmTemperature,
          maxTokens: maxTokens || llmResponseTokens || 1024,
        },
        history: messages,
      }) as VChatMessageOut;
    } catch (error: any) {
      const errorMessage = error?.message || error?.toString() || 'Anthropic Chat Generate Error';
      console.error(`anthropic.rpcChatGenerateOrThrow: ${errorMessage}`);
      throw new Error(errorMessage);
    }
  },

  // Chat Generate (streaming) with Functions
  streamingChatGenerateOrThrow: unifiedStreamingClient,

};
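
For the non-streaming path, the same vendor object can be called directly. A short sketch — the access fields and `llmRef` are placeholders (the full `AnthropicAccessSchema` shape is only partially visible in this diff), not working credentials:

```ts
// Sketch: one-shot (non-streaming) generation through the vendor object above.
// Every literal here is a placeholder, including the 'dialect' field, which is
// an assumption about the access schema.
async function askClaudeOnce(): Promise<string> {
  const out = await ModelVendorAnthropic.rpcChatGenerateOrThrow(
    { dialect: 'anthropic', anthropicKey: 'sk-ant-...', anthropicHost: null, heliconeKey: null },
    { llmRef: 'claude-2.1', llmTemperature: 0.5, llmResponseTokens: 1024 },
    [{ role: 'user', content: 'Summarize this changeset in one line.' }],
    null, null, // functions must stay null: the vendor throws otherwise
  );
  return out.content;
}
```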

/**
 * This function either returns the LLM message, or function calls, or throws a descriptive error string
 */
async function anthropicCallChatGenerate<TOut = VChatMessageOut>(
  access: AnthropicAccessSchema, llmOptions: Partial<LLMOptionsOpenAI>, messages: VChatMessageIn[],
  // functions: VChatFunctionIn[] | null, forceFunctionName: string | null,
  maxTokens?: number,
): Promise<TOut> {
  const { llmRef, llmTemperature = 0.5, llmResponseTokens } = llmOptions;
  try {
    return await apiAsync.llmAnthropic.chatGenerate.mutate({
      access,
      model: {
        id: llmRef!,
        temperature: llmTemperature,
        maxTokens: maxTokens || llmResponseTokens || 1024,
      },
      history: messages,
    }) as TOut;
  } catch (error: any) {
    const errorMessage = error?.message || error?.toString() || 'Anthropic Chat Generate Error';
    console.error(`anthropicCallChatGenerate: ${errorMessage}`);
    throw new Error(errorMessage);
  }
}
+5
-11
@@ -5,11 +5,11 @@ import { FormTextField } from '~/common/components/forms/FormTextField';
import { InlineError } from '~/common/components/InlineError';
import { Link } from '~/common/components/Link';
import { SetupFormRefetchButton } from '~/common/components/forms/SetupFormRefetchButton';
import { apiQuery } from '~/common/util/trpc.client';
import { asValidURL } from '~/common/util/urlUtils';

import { DModelSourceId, useModelsStore, useSourceSetup } from '../../store-llms';
import { modelDescriptionToDLLM } from '../openai/OpenAISourceSetup';
import { DModelSourceId } from '../../store-llms';
import { useLlmUpdateModels } from '../useLlmUpdateModels';
import { useSourceSetup } from '../useSourceSetup';

import { isValidAzureApiKey, ModelVendorAzure } from './azure.vendor';

@@ -31,14 +31,8 @@ export function AzureSourceSetup(props: { sourceId: DModelSourceId }) {
  const shallFetchSucceed = azureKey ? keyValid : !needsUserKey;

  // fetch models
  const { isFetching, refetch, isError, error } = apiQuery.llmOpenAI.listModels.useQuery({ access }, {
    enabled: !sourceHasLLMs && shallFetchSucceed,
    onSuccess: models => source && useModelsStore.getState().setLLMs(
      models.models.map(model => modelDescriptionToDLLM(model, source)),
      props.sourceId,
    ),
    staleTime: Infinity,
  });
  const { isFetching, refetch, isError, error } =
    useLlmUpdateModels(ModelVendorAzure, access, !sourceHasLLMs && shallFetchSucceed, source);

  return <>

+7
-9
@@ -3,10 +3,9 @@ import { backendCaps } from '~/modules/backend/state-backend';
import { AzureIcon } from '~/common/components/icons/AzureIcon';

import type { IModelVendor } from '../IModelVendor';
import type { OpenAIAccessSchema } from '../../transports/server/openai/openai.router';
import type { VChatFunctionIn, VChatMessageIn, VChatMessageOrFunctionCallOut, VChatMessageOut } from '../../transports/chatGenerate';
import type { OpenAIAccessSchema } from '../../server/openai/openai.router';

import { LLMOptionsOpenAI, openAICallChatGenerate } from '../openai/openai.vendor';
import { LLMOptionsOpenAI, ModelVendorOpenAI } from '../openai/openai.vendor';
import { OpenAILLMOptions } from '../openai/OpenAILLMOptions';

import { AzureSourceSetup } from './AzureSourceSetup';
@@ -58,10 +57,9 @@ export const ModelVendorAzure: IModelVendor<SourceSetupAzure, OpenAIAccessSchema
    heliKey: '',
    moderationCheck: false,
  }),
  callChatGenerate(llm, messages: VChatMessageIn[], maxTokens?: number): Promise<VChatMessageOut> {
    return openAICallChatGenerate(this.getTransportAccess(llm._source.setup), llm.options, messages, null, null, maxTokens);
  },
  callChatGenerateWF(llm, messages: VChatMessageIn[], functions: VChatFunctionIn[] | null, forceFunctionName: string | null, maxTokens?: number): Promise<VChatMessageOrFunctionCallOut> {
    return openAICallChatGenerate(this.getTransportAccess(llm._source.setup), llm.options, messages, functions, forceFunctionName, maxTokens);
  },

  // OpenAI transport ('azure' dialect in 'access')
  rpcUpdateModelsQuery: ModelVendorOpenAI.rpcUpdateModelsQuery,
  rpcChatGenerateOrThrow: ModelVendorOpenAI.rpcChatGenerateOrThrow,
  streamingChatGenerateOrThrow: ModelVendorOpenAI.streamingChatGenerateOrThrow,
};
@@ -0,0 +1,96 @@
import * as React from 'react';

import { FormControl, FormHelperText, Option, Select } from '@mui/joy';
import HealthAndSafetyIcon from '@mui/icons-material/HealthAndSafety';

import { FormInputKey } from '~/common/components/forms/FormInputKey';
import { FormLabelStart } from '~/common/components/forms/FormLabelStart';
import { InlineError } from '~/common/components/InlineError';
import { Link } from '~/common/components/Link';
import { SetupFormRefetchButton } from '~/common/components/forms/SetupFormRefetchButton';

import type { DModelSourceId } from '../../store-llms';
import type { GeminiBlockSafetyLevel } from '../../server/gemini/gemini.wiretypes';
import { useLlmUpdateModels } from '../useLlmUpdateModels';
import { useSourceSetup } from '../useSourceSetup';

import { ModelVendorGemini } from './gemini.vendor';


const GEMINI_API_KEY_LINK = 'https://makersuite.google.com/app/apikey';

const SAFETY_OPTIONS: { value: GeminiBlockSafetyLevel, label: string }[] = [
  { value: 'HARM_BLOCK_THRESHOLD_UNSPECIFIED', label: 'Default' },
  { value: 'BLOCK_LOW_AND_ABOVE', label: 'Low and above' },
  { value: 'BLOCK_MEDIUM_AND_ABOVE', label: 'Medium and above' },
  { value: 'BLOCK_ONLY_HIGH', label: 'Only high' },
  { value: 'BLOCK_NONE', label: 'None' },
];


export function GeminiSourceSetup(props: { sourceId: DModelSourceId }) {

  // external state
  const { source, sourceSetupValid, access, updateSetup } =
    useSourceSetup(props.sourceId, ModelVendorGemini);

  // derived state
  const { geminiKey, minSafetyLevel } = access;

  const needsUserKey = !ModelVendorGemini.hasBackendCap?.();
  const shallFetchSucceed = !needsUserKey || (!!geminiKey && sourceSetupValid);
  const showKeyError = !!geminiKey && !sourceSetupValid;

  // fetch models
  const { isFetching, refetch, isError, error } =
    useLlmUpdateModels(ModelVendorGemini, access, shallFetchSucceed, source);

  return <>

    <FormInputKey
      id='gemini-key' label='Gemini API Key'
      rightLabel={<>{needsUserKey
        ? !geminiKey && <Link level='body-sm' href={GEMINI_API_KEY_LINK} target='_blank'>request Key</Link>
        : '✔️ already set in server'}
      </>}
      value={geminiKey} onChange={value => updateSetup({ geminiKey: value.trim() })}
      required={needsUserKey} isError={showKeyError}
      placeholder='...'
    />

    <FormControl orientation='horizontal' sx={{ justifyContent: 'space-between', alignItems: 'center' }}>
      <FormLabelStart title='Safety Settings'
                      description='Threshold' />
      <Select
        variant='outlined'
        value={minSafetyLevel} onChange={(_event, value) => value && updateSetup({ minSafetyLevel: value })}
        startDecorator={<HealthAndSafetyIcon sx={{ display: { xs: 'none', sm: 'inherit' } }} />}
        // indicator={<KeyboardArrowDownIcon />}
        slotProps={{
          root: { sx: { width: '100%' } },
          indicator: { sx: { opacity: 0.5 } },
          button: { sx: { whiteSpace: 'inherit' } },
        }}
      >
        {SAFETY_OPTIONS.map(option => (
          <Option key={'gemini-safety-' + option.value} value={option.value}>{option.label}</Option>
        ))}
      </Select>
    </FormControl>

    <FormHelperText sx={{ display: 'block' }}>
      Gemini has <Link href='https://ai.google.dev/docs/safety_setting_gemini' target='_blank' noLinkStyle>
      adjustable safety settings</Link> on four categories: Harassment, Hate speech,
      Sexually explicit, and Dangerous content, in addition to non-adjustable built-in filters.
      By default, the model will block content with <em>medium and above</em> probability
      of being unsafe.
    </FormHelperText>

    <SetupFormRefetchButton
      refetch={refetch} disabled={!shallFetchSucceed || isFetching} error={isError}
    />

    {isError && <InlineError error={error} />}

  </>;
}
@@ -0,0 +1,97 @@
import GoogleIcon from '@mui/icons-material/Google';

import { backendCaps } from '~/modules/backend/state-backend';

import { apiAsync, apiQuery } from '~/common/util/trpc.client';

import type { GeminiAccessSchema } from '../../server/gemini/gemini.router';
import type { GeminiBlockSafetyLevel } from '../../server/gemini/gemini.wiretypes';
import type { IModelVendor } from '../IModelVendor';
import type { VChatMessageOut } from '../../llm.client';
import { unifiedStreamingClient } from '../unifiedStreamingClient';

import { OpenAILLMOptions } from '../openai/OpenAILLMOptions';

import { GeminiSourceSetup } from './GeminiSourceSetup';


export interface SourceSetupGemini {
  geminiKey: string;
  minSafetyLevel: GeminiBlockSafetyLevel;
}

export interface LLMOptionsGemini {
  llmRef: string;
  stopSequences: string[];  // up to 5 sequences that will stop generation (optional)
  candidateCount: number;   // 1...8 number of generated responses to return (optional)
  maxOutputTokens: number;  // if unset, this will default to outputTokenLimit (optional)
  temperature: number;      // 0...1 Controls the randomness of the output. (optional)
  topP: number;             // 0...1 The maximum cumulative probability of tokens to consider when sampling (optional)
  topK: number;             // 1...100 The maximum number of tokens to consider when sampling (optional)
}
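
A sketch of a fully-populated options object for this interface, with each value inside the ranges documented in the comments above; the numbers and the `llmRef` value are illustrative, not recommendations:

```ts
// Illustrative values only - each field stays within the documented ranges.
const geminiOptionsExample: LLMOptionsGemini = {
  llmRef: 'models/gemini-pro', // assumed id format, for illustration
  stopSequences: ['\n\nHuman:'], // up to 5
  candidateCount: 1,             // 1...8
  maxOutputTokens: 1024,
  temperature: 0.5,              // 0...1
  topP: 0.95,                    // 0...1
  topK: 40,                      // 1...100
};
```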
|
||||
|
||||
|
||||
export const ModelVendorGemini: IModelVendor<SourceSetupGemini, GeminiAccessSchema, LLMOptionsGemini> = {
|
||||
id: 'googleai',
|
||||
name: 'Gemini',
|
||||
rank: 11,
|
||||
location: 'cloud',
|
||||
instanceLimit: 1,
|
||||
hasBackendCap: () => backendCaps().hasLlmGemini,
|
||||
|
||||
// components
|
||||
Icon: GoogleIcon,
|
||||
SourceSetupComponent: GeminiSourceSetup,
|
||||
LLMOptionsComponent: OpenAILLMOptions,
|
||||
|
||||
// functions
|
||||
initializeSetup: () => ({
|
||||
geminiKey: '',
|
||||
minSafetyLevel: 'HARM_BLOCK_THRESHOLD_UNSPECIFIED',
|
||||
}),
|
||||
validateSetup: (setup) => {
|
||||
return setup.geminiKey?.length > 0;
|
||||
},
|
||||
getTransportAccess: (partialSetup): GeminiAccessSchema => ({
|
||||
dialect: 'gemini',
|
||||
geminiKey: partialSetup?.geminiKey || '',
|
||||
minSafetyLevel: partialSetup?.minSafetyLevel || 'HARM_BLOCK_THRESHOLD_UNSPECIFIED',
|
||||
}),
|
||||
|
||||
// List Models
|
||||
rpcUpdateModelsQuery: (access, enabled, onSuccess) => {
|
||||
return apiQuery.llmGemini.listModels.useQuery({ access }, {
|
||||
enabled: enabled,
|
||||
onSuccess: onSuccess,
|
||||
refetchOnWindowFocus: false,
|
||||
staleTime: Infinity,
|
||||
});
|
||||
},
|
||||
|
||||
// Chat Generate (non-streaming) with Functions
|
||||
rpcChatGenerateOrThrow: async (access, llmOptions, messages, functions, forceFunctionName, maxTokens) => {
|
||||
if (functions?.length || forceFunctionName)
|
||||
throw new Error('Gemini does not support functions');
|
||||
|
||||
const { llmRef, temperature = 0.5, maxOutputTokens } = llmOptions;
|
||||
try {
|
||||
return await apiAsync.llmGemini.chatGenerate.mutate({
|
||||
access,
|
||||
model: {
|
||||
id: llmRef!,
|
||||
temperature: temperature,
|
||||
maxTokens: maxTokens || maxOutputTokens || 1024,
|
||||
},
|
||||
history: messages,
|
||||
}) as VChatMessageOut;
|
||||
} catch (error: any) {
|
||||
const errorMessage = error?.message || error?.toString() || 'Gemini Chat Generate Error';
|
||||
console.error(`gemini.rpcChatGenerateOrThrow: ${errorMessage}`);
|
||||
throw new Error(errorMessage);
|
||||
}
|
||||
},
|
||||
|
||||
// Chat Generate (streaming) with Functions
|
||||
streamingChatGenerateOrThrow: unifiedStreamingClient,
|
||||
|
||||
};
|
||||
+5
-11
@@ -7,10 +7,10 @@ import { FormInputKey } from '~/common/components/forms/FormInputKey';
|
||||
import { InlineError } from '~/common/components/InlineError';
|
||||
import { Link } from '~/common/components/Link';
|
||||
import { SetupFormRefetchButton } from '~/common/components/forms/SetupFormRefetchButton';
|
||||
import { apiQuery } from '~/common/util/trpc.client';
|
||||
|
||||
import { DModelSourceId, useModelsStore, useSourceSetup } from '../../store-llms';
|
||||
import { modelDescriptionToDLLM } from '../openai/OpenAISourceSetup';
|
||||
import { DModelSourceId } from '../../store-llms';
|
||||
import { useLlmUpdateModels } from '../useLlmUpdateModels';
|
||||
import { useSourceSetup } from '../useSourceSetup';
|
||||
|
||||
import { ModelVendorLocalAI } from './localai.vendor';
|
||||
|
||||
@@ -30,14 +30,8 @@ export function LocalAISourceSetup(props: { sourceId: DModelSourceId }) {
|
||||
const shallFetchSucceed = isValidHost;
|
||||
|
||||
// fetch models - the OpenAI way
|
||||
const { isFetching, refetch, isError, error } = apiQuery.llmOpenAI.listModels.useQuery({ access }, {
|
||||
enabled: false, // !sourceHasLLMs && shallFetchSucceed,
|
||||
onSuccess: models => source && useModelsStore.getState().setLLMs(
|
||||
models.models.map(model => modelDescriptionToDLLM(model, source)),
|
||||
props.sourceId,
|
||||
),
|
||||
staleTime: Infinity,
|
||||
});
|
||||
const { isFetching, refetch, isError, error } =
|
||||
useLlmUpdateModels(ModelVendorLocalAI, access, false /* !sourceHasLLMs && shallFetchSucceed */, source);
|
||||
|
||||
return <>
|
||||
|
||||
|
||||
+8
-10
@@ -1,10 +1,9 @@
|
||||
import DevicesIcon from '@mui/icons-material/Devices';
|
||||
|
||||
import type { IModelVendor } from '../IModelVendor';
|
||||
import type { OpenAIAccessSchema } from '../../transports/server/openai/openai.router';
|
||||
import type { VChatFunctionIn, VChatMessageIn, VChatMessageOrFunctionCallOut, VChatMessageOut } from '../../transports/chatGenerate';
|
||||
import type { OpenAIAccessSchema } from '../../server/openai/openai.router';
|
||||
|
||||
import { LLMOptionsOpenAI, openAICallChatGenerate } from '../openai/openai.vendor';
|
||||
import { LLMOptionsOpenAI, ModelVendorOpenAI } from '../openai/openai.vendor';
|
||||
import { OpenAILLMOptions } from '../openai/OpenAILLMOptions';
|
||||
|
||||
import { LocalAISourceSetup } from './LocalAISourceSetup';
|
||||
@@ -38,10 +37,9 @@ export const ModelVendorLocalAI: IModelVendor<SourceSetupLocalAI, OpenAIAccessSc
|
||||
heliKey: '',
|
||||
moderationCheck: false,
|
||||
}),
|
||||
callChatGenerate(llm, messages: VChatMessageIn[], maxTokens?: number): Promise<VChatMessageOut> {
|
||||
return openAICallChatGenerate(this.getTransportAccess(llm._source.setup), llm.options, messages, null, null, maxTokens);
|
||||
},
|
||||
callChatGenerateWF(llm, messages: VChatMessageIn[], functions: VChatFunctionIn[] | null, forceFunctionName: string | null, maxTokens?: number): Promise<VChatMessageOrFunctionCallOut> {
|
||||
return openAICallChatGenerate(this.getTransportAccess(llm._source.setup), llm.options, messages, functions, forceFunctionName, maxTokens);
|
||||
},
|
||||
};
|
||||
|
||||
// OpenAI transport ('localai' dialect in 'access')
|
||||
rpcUpdateModelsQuery: ModelVendorOpenAI.rpcUpdateModelsQuery,
|
||||
rpcChatGenerateOrThrow: ModelVendorOpenAI.rpcChatGenerateOrThrow,
|
||||
streamingChatGenerateOrThrow: ModelVendorOpenAI.streamingChatGenerateOrThrow,
|
||||
};
|
||||
|
||||
MistralSourceSetup.tsx (+6 -12)
@@ -4,10 +4,10 @@ import { FormInputKey } from '~/common/components/forms/FormInputKey';
 import { InlineError } from '~/common/components/InlineError';
 import { Link } from '~/common/components/Link';
 import { SetupFormRefetchButton } from '~/common/components/forms/SetupFormRefetchButton';
-import { apiQuery } from '~/common/util/trpc.client';

-import { DModelSourceId, useModelsStore, useSourceSetup } from '../../store-llms';
-import { modelDescriptionToDLLM } from '../openai/OpenAISourceSetup';
+import { DModelSourceId } from '../../store-llms';
+import { useLlmUpdateModels } from '../useLlmUpdateModels';
+import { useSourceSetup } from '../useSourceSetup';

 import { ModelVendorMistral } from './mistral.vendor';

@@ -18,7 +18,7 @@ const MISTRAL_REG_LINK = 'https://console.mistral.ai/';
 export function MistralSourceSetup(props: { sourceId: DModelSourceId }) {

   // external state
-  const { source, sourceSetupValid, sourceHasLLMs, access, updateSetup } =
+  const { source, sourceSetupValid, access, updateSetup } =
     useSourceSetup(props.sourceId, ModelVendorMistral);

   // derived state
@@ -29,14 +29,8 @@ export function MistralSourceSetup(props: { sourceId: DModelSourceId }) {
   const showKeyError = !!mistralKey && !sourceSetupValid;

   // fetch models
-  const { isFetching, refetch, isError, error } = apiQuery.llmOpenAI.listModels.useQuery({ access }, {
-    enabled: false,
-    onSuccess: models => source && useModelsStore.getState().setLLMs(
-      models.models.map(model => modelDescriptionToDLLM(model, source)),
-      props.sourceId,
-    ),
-    staleTime: Infinity,
-  });
+  const { isFetching, refetch, isError, error } =
+    useLlmUpdateModels(ModelVendorMistral, access, shallFetchSucceed, source);

   return <>
mistral.vendor.ts (+7 -9)
@@ -3,10 +3,9 @@ import { backendCaps } from '~/modules/backend/state-backend';
 import { MistralIcon } from '~/common/components/icons/MistralIcon';

 import type { IModelVendor } from '../IModelVendor';
-import type { OpenAIAccessSchema } from '../../transports/server/openai/openai.router';
-import type { VChatMessageIn, VChatMessageOut } from '../../transports/chatGenerate';
+import type { OpenAIAccessSchema } from '../../server/openai/openai.router';

-import { LLMOptionsOpenAI, openAICallChatGenerate, SourceSetupOpenAI } from '../openai/openai.vendor';
+import { LLMOptionsOpenAI, ModelVendorOpenAI, SourceSetupOpenAI } from '../openai/openai.vendor';
 import { OpenAILLMOptions } from '../openai/OpenAILLMOptions';

 import { MistralSourceSetup } from './MistralSourceSetup';
@@ -48,10 +47,9 @@ export const ModelVendorMistral: IModelVendor<SourceSetupMistral, OpenAIAccessSc
     heliKey: '',
     moderationCheck: false,
   }),
-  callChatGenerate(llm, messages: VChatMessageIn[], maxTokens?: number): Promise<VChatMessageOut> {
-    return openAICallChatGenerate(this.getTransportAccess(llm._source.setup), llm.options, messages, null, null, maxTokens);
-  },
-  callChatGenerateWF() {
-    throw new Error('Mistral does not support "Functions" yet');
-  },
+
+  // OpenAI transport ('mistral' dialect in 'access')
+  rpcUpdateModelsQuery: ModelVendorOpenAI.rpcUpdateModelsQuery,
+  rpcChatGenerateOrThrow: ModelVendorOpenAI.rpcChatGenerateOrThrow,
+  streamingChatGenerateOrThrow: ModelVendorOpenAI.streamingChatGenerateOrThrow,
 };
OllamaAdministration.tsx
@@ -12,7 +12,7 @@ import { Link } from '~/common/components/Link';
 import { apiQuery } from '~/common/util/trpc.client';
 import { settingsGap } from '~/common/app.theme';

-import type { OllamaAccessSchema } from '../../transports/server/ollama/ollama.router';
+import type { OllamaAccessSchema } from '../../server/ollama/ollama.router';


 export function OllamaAdministration(props: { access: OllamaAccessSchema, onClose: () => void }) {
@@ -68,7 +68,7 @@ export function OllamaAdministration(props: { access: OllamaAccessSchema, onClos
         >
           {pullable.map(p =>
             <Option key={p.id} value={p.id}>
-              {p.isNew === true && <Chip size='sm' variant='outlined'>NEW</Chip>} {p.label}{sortByPulls && ` (${p.pulls.toLocaleString()})`}
+              {p.isNew === true && <Chip size='sm' variant='solid'>NEW</Chip>} {p.label}{sortByPulls && ` (${p.pulls.toLocaleString()})`}
             </Option>,
           )}
         </Select>
@@ -118,7 +118,7 @@ export function OllamaAdministration(props: { access: OllamaAccessSchema, onClos
           {pullModelDescription}
         </Typography>

-        <Box sx={{ display: 'flex', flexWrap: 1, gap: 1 }}>
+        <Box sx={{ display: 'flex', flexWrap: 1, gap: 1, alignItems: 'start' }}>
           <Button
             variant='outlined'
             color={deleteStatus === 'error' ? 'danger' : deleteStatus === 'success' ? 'success' : 'primary'}
OllamaSourceSetup.tsx (+6 -11)
@@ -6,13 +6,14 @@ import { FormTextField } from '~/common/components/forms/FormTextField';
 import { InlineError } from '~/common/components/InlineError';
 import { Link } from '~/common/components/Link';
 import { SetupFormRefetchButton } from '~/common/components/forms/SetupFormRefetchButton';
-import { apiQuery } from '~/common/util/trpc.client';
 import { asValidURL } from '~/common/util/urlUtils';

-import { DModelSourceId, useModelsStore, useSourceSetup } from '../../store-llms';
+import { DModelSourceId } from '../../store-llms';
+import { useLlmUpdateModels } from '../useLlmUpdateModels';
+import { useSourceSetup } from '../useSourceSetup';

 import { ModelVendorOllama } from './ollama.vendor';
 import { OllamaAdministration } from './OllamaAdministration';
-import { modelDescriptionToDLLM } from '../openai/OpenAISourceSetup';


 export function OllamaSourceSetup(props: { sourceId: DModelSourceId }) {
@@ -32,14 +33,8 @@ export function OllamaSourceSetup(props: { sourceId: DModelSourceId }) {
   const shallFetchSucceed = !hostError;

   // fetch models
-  const { isFetching, refetch, isError, error } = apiQuery.llmOllama.listModels.useQuery({ access }, {
-    enabled: false, // !sourceHasLLMs && shallFetchSucceed,
-    onSuccess: models => source && useModelsStore.getState().setLLMs(
-      models.models.map(model => modelDescriptionToDLLM(model, source)),
-      props.sourceId,
-    ),
-    staleTime: Infinity,
-  });
+  const { isFetching, refetch, isError, error } =
+    useLlmUpdateModels(ModelVendorOllama, access, false /* !sourceHasLLMs && shallFetchSucceed */, source);

   return <>
ollama.vendor.ts (+40 -34)
@@ -1,13 +1,14 @@
 import { backendCaps } from '~/modules/backend/state-backend';

 import { OllamaIcon } from '~/common/components/icons/OllamaIcon';
-import { apiAsync } from '~/common/util/trpc.client';
+import { apiAsync, apiQuery } from '~/common/util/trpc.client';

 import type { IModelVendor } from '../IModelVendor';
-import type { VChatMessageIn, VChatMessageOrFunctionCallOut, VChatMessageOut } from '../../transports/chatGenerate';
-import type { OllamaAccessSchema } from '../../transports/server/ollama/ollama.router';
+import type { OllamaAccessSchema } from '../../server/ollama/ollama.router';
+import type { VChatMessageOut } from '../../llm.client';
+import { unifiedStreamingClient } from '../unifiedStreamingClient';

-import { LLMOptionsOpenAI } from '../openai/openai.vendor';
+import type { LLMOptionsOpenAI } from '../openai/openai.vendor';
 import { OpenAILLMOptions } from '../openai/OpenAILLMOptions';

 import { OllamaSourceSetup } from './OllamaSourceSetup';
@@ -36,36 +37,41 @@ export const ModelVendorOllama: IModelVendor<SourceSetupOllama, OllamaAccessSche
     dialect: 'ollama',
     ollamaHost: partialSetup?.ollamaHost || '',
   }),
-  callChatGenerate(llm, messages: VChatMessageIn[], maxTokens?: number): Promise<VChatMessageOut> {
-    return ollamaCallChatGenerate(this.getTransportAccess(llm._source.setup), llm.options, messages, maxTokens);
-  },
-  callChatGenerateWF(): Promise<VChatMessageOrFunctionCallOut> {
-    throw new Error('Ollama does not support "Functions" yet');
-  },
-};
+
+  // List Models
+  rpcUpdateModelsQuery: (access, enabled, onSuccess) => {
+    return apiQuery.llmOllama.listModels.useQuery({ access }, {
+      enabled: enabled,
+      onSuccess: onSuccess,
+      refetchOnWindowFocus: false,
+      staleTime: Infinity,
+    });
+  },
+
+  // Chat Generate (non-streaming) with Functions
+  rpcChatGenerateOrThrow: async (access, llmOptions, messages, functions, forceFunctionName, maxTokens) => {
+    if (functions?.length || forceFunctionName)
+      throw new Error('Ollama does not support functions');
+
+    const { llmRef, llmTemperature = 0.5, llmResponseTokens } = llmOptions;
+    try {
+      return await apiAsync.llmOllama.chatGenerate.mutate({
+        access,
+        model: {
+          id: llmRef!,
+          temperature: llmTemperature,
+          maxTokens: maxTokens || llmResponseTokens || 1024,
+        },
+        history: messages,
+      }) as VChatMessageOut;
+    } catch (error: any) {
+      const errorMessage = error?.message || error?.toString() || 'Ollama Chat Generate Error';
+      console.error(`ollama.rpcChatGenerateOrThrow: ${errorMessage}`);
+      throw new Error(errorMessage);
+    }
+  },
+
+  // Chat Generate (streaming) with Functions
+  streamingChatGenerateOrThrow: unifiedStreamingClient,
+
+};
-
-
-/**
- * This function either returns the LLM message, or throws a descriptive error string
- */
-async function ollamaCallChatGenerate<TOut = VChatMessageOut>(
-  access: OllamaAccessSchema, llmOptions: Partial<LLMOptionsOpenAI>, messages: VChatMessageIn[],
-  maxTokens?: number,
-): Promise<TOut> {
-  const { llmRef, llmTemperature = 0.5, llmResponseTokens } = llmOptions;
-  try {
-    return await apiAsync.llmOllama.chatGenerate.mutate({
-      access,
-      model: {
-        id: llmRef!,
-        temperature: llmTemperature,
-        maxTokens: maxTokens || llmResponseTokens || 1024,
-      },
-      history: messages,
-    }) as TOut;
-  } catch (error: any) {
-    const errorMessage = error?.message || error?.toString() || 'Ollama Chat Generate Error';
-    console.error(`ollamaCallChatGenerate: ${errorMessage}`);
-    throw new Error(errorMessage);
-  }
-}
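
Taken together, the vendor refactor converges on one transport surface per vendor. A sketch of that surface as it can be inferred from these diffs (parameter types are simplified; the authoritative IModelVendor declaration lives in IModelVendor.ts, which this changeset does not show):

```typescript
// Inferred transport surface (a sketch, not the real declaration).
interface ModelVendorTransportSketch<TAccess, TLLMOptions> {
  // builds the access object (dialect, host, keys) from the stored setup
  getTransportAccess: (partialSetup?: unknown) => TAccess;

  // react-query hook that lists models; callers read { isFetching, refetch, isError, error }
  rpcUpdateModelsQuery: (
    access: TAccess,
    enabled: boolean,
    onSuccess: (data: { models: unknown[] }) => void,
  ) => unknown;

  // one-shot chat turn; vendors without function-calling throw (see Ollama above)
  rpcChatGenerateOrThrow: (
    access: TAccess,
    llmOptions: TLLMOptions,
    messages: { role: string, content: string }[],
    functions: unknown[] | null,
    forceFunctionName: string | null,
    maxTokens?: number,
  ) => Promise<unknown>;

  // incremental streaming; every vendor here points this at unifiedStreamingClient
  streamingChatGenerateOrThrow: (...args: unknown[]) => Promise<void>;
}
```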
OobaboogaSourceSetup.tsx (+5 -11)
@@ -6,10 +6,10 @@ import { FormTextField } from '~/common/components/forms/FormTextField';
 import { InlineError } from '~/common/components/InlineError';
 import { Link } from '~/common/components/Link';
 import { SetupFormRefetchButton } from '~/common/components/forms/SetupFormRefetchButton';
-import { apiQuery } from '~/common/util/trpc.client';

-import { DModelSourceId, useModelsStore, useSourceSetup } from '../../store-llms';
-import { modelDescriptionToDLLM } from '../openai/OpenAISourceSetup';
+import { DModelSourceId } from '../../store-llms';
+import { useLlmUpdateModels } from '../useLlmUpdateModels';
+import { useSourceSetup } from '../useSourceSetup';

 import { ModelVendorOoobabooga } from './oobabooga.vendor';

@@ -24,14 +24,8 @@ export function OobaboogaSourceSetup(props: { sourceId: DModelSourceId }) {
   const { oaiHost } = access;

   // fetch models
-  const { isFetching, refetch, isError, error } = apiQuery.llmOpenAI.listModels.useQuery({ access }, {
-    enabled: false, // !hasModels && !!asValidURL(normSetup.oaiHost),
-    onSuccess: models => source && useModelsStore.getState().setLLMs(
-      models.models.map(model => modelDescriptionToDLLM(model, source)),
-      props.sourceId,
-    ),
-    staleTime: Infinity,
-  });
+  const { isFetching, refetch, isError, error } =
+    useLlmUpdateModels(ModelVendorOoobabooga, access, false /* !hasModels && !!asValidURL(normSetup.oaiHost) */, source);

   return <>
oobabooga.vendor.ts (+7 -9)
@@ -1,10 +1,9 @@
 import { OobaboogaIcon } from '~/common/components/icons/OobaboogaIcon';

 import type { IModelVendor } from '../IModelVendor';
-import type { OpenAIAccessSchema } from '../../transports/server/openai/openai.router';
-import type { VChatFunctionIn, VChatMessageIn, VChatMessageOrFunctionCallOut, VChatMessageOut } from '../../transports/chatGenerate';
+import type { OpenAIAccessSchema } from '../../server/openai/openai.router';

-import { LLMOptionsOpenAI, openAICallChatGenerate } from '../openai/openai.vendor';
+import { LLMOptionsOpenAI, ModelVendorOpenAI } from '../openai/openai.vendor';
 import { OpenAILLMOptions } from '../openai/OpenAILLMOptions';

 import { OobaboogaSourceSetup } from './OobaboogaSourceSetup';
@@ -38,10 +37,9 @@ export const ModelVendorOoobabooga: IModelVendor<SourceSetupOobabooga, OpenAIAcc
     heliKey: '',
     moderationCheck: false,
   }),
-  callChatGenerate(llm, messages: VChatMessageIn[], maxTokens?: number): Promise<VChatMessageOut> {
-    return openAICallChatGenerate(this.getTransportAccess(llm._source.setup), llm.options, messages, null, null, maxTokens);
-  },
-  callChatGenerateWF(llm, messages: VChatMessageIn[], functions: VChatFunctionIn[] | null, forceFunctionName: string | null, maxTokens?: number): Promise<VChatMessageOrFunctionCallOut> {
-    return openAICallChatGenerate(this.getTransportAccess(llm._source.setup), llm.options, messages, functions, forceFunctionName, maxTokens);
-  },
+
+  // OpenAI transport (oobabooga dialect in 'access')
+  rpcUpdateModelsQuery: ModelVendorOpenAI.rpcUpdateModelsQuery,
+  rpcChatGenerateOrThrow: ModelVendorOpenAI.rpcChatGenerateOrThrow,
+  streamingChatGenerateOrThrow: ModelVendorOpenAI.streamingChatGenerateOrThrow,
 };
OpenAISourceSetup.tsx (+6 -40)
@@ -9,13 +9,13 @@ import { FormTextField } from '~/common/components/forms/FormTextField';
 import { InlineError } from '~/common/components/InlineError';
 import { Link } from '~/common/components/Link';
 import { SetupFormRefetchButton } from '~/common/components/forms/SetupFormRefetchButton';
-import { apiQuery } from '~/common/util/trpc.client';
 import { useToggleableBoolean } from '~/common/util/useToggleableBoolean';

-import type { ModelDescriptionSchema } from '../../transports/server/server.schemas';
-import { DLLM, DModelSource, DModelSourceId, useModelsStore, useSourceSetup } from '../../store-llms';
+import { DModelSourceId } from '../../store-llms';
+import { useLlmUpdateModels } from '../useLlmUpdateModels';
+import { useSourceSetup } from '../useSourceSetup';

-import { isValidOpenAIApiKey, LLMOptionsOpenAI, ModelVendorOpenAI } from './openai.vendor';
+import { isValidOpenAIApiKey, ModelVendorOpenAI } from './openai.vendor';


 // avoid repeating it all over
@@ -40,15 +40,8 @@ export function OpenAISourceSetup(props: { sourceId: DModelSourceId }) {
   const shallFetchSucceed = oaiKey ? keyValid : !needsUserKey;

   // fetch models
-  const { isFetching, refetch, isError, error } = apiQuery.llmOpenAI.listModels.useQuery({ access }, {
-    enabled: !sourceHasLLMs && shallFetchSucceed,
-    onSuccess: models => source && useModelsStore.getState().setLLMs(
-      models.models.map(model => modelDescriptionToDLLM(model, source)),
-      props.sourceId,
-    ),
-    staleTime: Infinity,
-  });
+  const { isFetching, refetch, isError, error } =
+    useLlmUpdateModels(ModelVendorOpenAI, access, !sourceHasLLMs && shallFetchSucceed, source);

   return <>
@@ -110,30 +103,3 @@ export function OpenAISourceSetup(props: { sourceId: DModelSourceId }) {

   </>;
 }


-export function modelDescriptionToDLLM<TSourceSetup>(model: ModelDescriptionSchema, source: DModelSource<TSourceSetup>): DLLM<TSourceSetup, LLMOptionsOpenAI> {
-  const maxOutputTokens = model.maxCompletionTokens || Math.round((model.contextWindow || 4096) / 2);
-  const llmResponseTokens = Math.round(maxOutputTokens / (model.maxCompletionTokens ? 2 : 4));
-  return {
-    id: `${source.id}-${model.id}`,
-
-    label: model.label,
-    created: model.created || 0,
-    updated: model.updated || 0,
-    description: model.description,
-    tags: [], // ['stream', 'chat'],
-    contextTokens: model.contextWindow,
-    maxOutputTokens: maxOutputTokens,
-    hidden: !!model.hidden,
-
-    sId: source.id,
-    _source: source,
-
-    options: {
-      llmRef: model.id,
-      llmTemperature: 0.5,
-      llmResponseTokens: llmResponseTokens,
-    },
-  };
-}
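
The removed modelDescriptionToDLLM helper reappears, renamed, in the new useLlmUpdateModels.ts below. Its token defaults are worth spelling out; a worked example, assuming a model that reports a 4096-token context window and no explicit maxCompletionTokens:

```typescript
// Worked example of the token-default heuristic (input values assumed):
const model = { contextWindow: 4096, maxCompletionTokens: undefined };

// no maxCompletionTokens: default the output budget to half the context window
const maxOutputTokens = model.maxCompletionTokens || Math.round((model.contextWindow || 4096) / 2);
// maxOutputTokens === 2048

// default response size: a quarter of that budget when the cap was guessed,
// but half when the model declared maxCompletionTokens explicitly
const llmResponseTokens = Math.round(maxOutputTokens / (model.maxCompletionTokens ? 2 : 4));
// llmResponseTokens === 512
```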
openai.vendor.ts (+38 -38)
@@ -1,11 +1,12 @@
 import { backendCaps } from '~/modules/backend/state-backend';

 import { OpenAIIcon } from '~/common/components/icons/OpenAIIcon';
-import { apiAsync } from '~/common/util/trpc.client';
+import { apiAsync, apiQuery } from '~/common/util/trpc.client';

 import type { IModelVendor } from '../IModelVendor';
-import type { OpenAIAccessSchema } from '../../transports/server/openai/openai.router';
-import type { VChatFunctionIn, VChatMessageIn, VChatMessageOrFunctionCallOut, VChatMessageOut } from '../../transports/chatGenerate';
+import type { OpenAIAccessSchema } from '../../server/openai/openai.router';
+import type { VChatMessageOrFunctionCallOut } from '../../llm.client';
+import { unifiedStreamingClient } from '../unifiedStreamingClient';

 import { OpenAILLMOptions } from './OpenAILLMOptions';
 import { OpenAISourceSetup } from './OpenAISourceSetup';
@@ -51,41 +52,40 @@ export const ModelVendorOpenAI: IModelVendor<SourceSetupOpenAI, OpenAIAccessSche
     moderationCheck: false,
     ...partialSetup,
   }),
-  callChatGenerate(llm, messages: VChatMessageIn[], maxTokens?: number): Promise<VChatMessageOut> {
-    const access = this.getTransportAccess(llm._source.setup);
-    return openAICallChatGenerate(access, llm.options, messages, null, null, maxTokens);
-  },
-  callChatGenerateWF(llm, messages: VChatMessageIn[], functions: VChatFunctionIn[] | null, forceFunctionName: string | null, maxTokens?: number): Promise<VChatMessageOrFunctionCallOut> {
-    const access = this.getTransportAccess(llm._source.setup);
-    return openAICallChatGenerate(access, llm.options, messages, functions, forceFunctionName, maxTokens);
-  },
-};
+
+  // List Models
+  rpcUpdateModelsQuery: (access, enabled, onSuccess) => {
+    return apiQuery.llmOpenAI.listModels.useQuery({ access }, {
+      enabled: enabled,
+      onSuccess: onSuccess,
+      refetchOnWindowFocus: false,
+      staleTime: Infinity,
+    });
+  },
+
+  // Chat Generate (non-streaming) with Functions
+  rpcChatGenerateOrThrow: async (access, llmOptions, messages, functions, forceFunctionName, maxTokens) => {
+    const { llmRef, llmTemperature = 0.5, llmResponseTokens } = llmOptions;
+    try {
+      return await apiAsync.llmOpenAI.chatGenerateWithFunctions.mutate({
+        access,
+        model: {
+          id: llmRef!,
+          temperature: llmTemperature,
+          maxTokens: maxTokens || llmResponseTokens || 1024,
+        },
+        functions: functions ?? undefined,
+        forceFunctionName: forceFunctionName ?? undefined,
+        history: messages,
+      }) as VChatMessageOrFunctionCallOut;
+    } catch (error: any) {
+      const errorMessage = error?.message || error?.toString() || 'OpenAI Chat Generate Error';
+      console.error(`openai.rpcChatGenerateOrThrow: ${errorMessage}`);
+      throw new Error(errorMessage);
+    }
+  },
+
+  // Chat Generate (streaming) with Functions
+  streamingChatGenerateOrThrow: unifiedStreamingClient,
+
+};
-
-
-/**
- * This function either returns the LLM message, or function calls, or throws a descriptive error string
- */
-export async function openAICallChatGenerate<TOut = VChatMessageOut | VChatMessageOrFunctionCallOut>(
-  access: OpenAIAccessSchema, llmOptions: Partial<LLMOptionsOpenAI>, messages: VChatMessageIn[],
-  functions: VChatFunctionIn[] | null, forceFunctionName: string | null,
-  maxTokens?: number,
-): Promise<TOut> {
-  const { llmRef, llmTemperature = 0.5, llmResponseTokens } = llmOptions;
-  try {
-    return await apiAsync.llmOpenAI.chatGenerateWithFunctions.mutate({
-      access,
-      model: {
-        id: llmRef!,
-        temperature: llmTemperature,
-        maxTokens: maxTokens || llmResponseTokens || 1024,
-      },
-      functions: functions ?? undefined,
-      forceFunctionName: forceFunctionName ?? undefined,
-      history: messages,
-    }) as TOut;
-  } catch (error: any) {
-    const errorMessage = error?.message || error?.toString() || 'OpenAI Chat Generate Error';
-    console.error(`openAICallChatGenerate: ${errorMessage}`);
-    throw new Error(errorMessage);
-  }
-}
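
A sketch of exercising the new non-streaming entry point with one offered function. The function schema, model options, and prompt below are illustrative, not taken from the codebase; only the entry-point signature comes from this diff:

```typescript
// Sketch only: one chat turn that offers the model a single callable function.
import { ModelVendorOpenAI } from './openai/openai.vendor';

async function weatherCallSketch() {
  // access would normally come from vendor.getTransportAccess(source.setup)
  const access = ModelVendorOpenAI.getTransportAccess({});

  // 'out' is a VChatMessageOrFunctionCallOut: assistant text or a function_call
  const out = await ModelVendorOpenAI.rpcChatGenerateOrThrow(
    access,
    { llmRef: 'gpt-3.5-turbo', llmTemperature: 0.5, llmResponseTokens: 1024 }, // assumed options
    [{ role: 'user', content: 'Weather in Lisbon?' }],
    [{
      name: 'get_weather',
      description: 'Get the current weather for a city',
      parameters: { type: 'object', properties: { city: { type: 'string' } }, required: ['city'] },
    }],
    null,       // or 'get_weather' to force that function
    undefined,  // fall back to llmResponseTokens
  );
  return out;
}
```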
OpenRouterSourceSetup.tsx
@@ -6,11 +6,11 @@ import { FormInputKey } from '~/common/components/forms/FormInputKey';
 import { InlineError } from '~/common/components/InlineError';
 import { Link } from '~/common/components/Link';
 import { SetupFormRefetchButton } from '~/common/components/forms/SetupFormRefetchButton';
-import { apiQuery } from '~/common/util/trpc.client';
 import { getCallbackUrl } from '~/common/app.routes';

-import { DModelSourceId, useModelsStore, useSourceSetup } from '../../store-llms';
-import { modelDescriptionToDLLM } from '../openai/OpenAISourceSetup';
+import { DModelSourceId } from '../../store-llms';
+import { useLlmUpdateModels } from '../useLlmUpdateModels';
+import { useSourceSetup } from '../useSourceSetup';

 import { isValidOpenRouterKey, ModelVendorOpenRouter } from './openrouter.vendor';

@@ -30,14 +30,8 @@ export function OpenRouterSourceSetup(props: { sourceId: DModelSourceId }) {
   const shallFetchSucceed = oaiKey ? keyValid : !needsUserKey;

   // fetch models
-  const { isFetching, refetch, isError, error } = apiQuery.llmOpenAI.listModels.useQuery({ access }, {
-    enabled: !sourceHasLLMs && shallFetchSucceed,
-    onSuccess: models => source && useModelsStore.getState().setLLMs(
-      models.models.map(model => modelDescriptionToDLLM(model, source)),
-      props.sourceId,
-    ),
-    staleTime: Infinity,
-  });
+  const { isFetching, refetch, isError, error } =
+    useLlmUpdateModels(ModelVendorOpenRouter, access, !sourceHasLLMs && shallFetchSucceed, source);


   const handleOpenRouterLogin = () => {
openrouter.vendor.ts
@@ -3,10 +3,9 @@ import { backendCaps } from '~/modules/backend/state-backend';
 import { OpenRouterIcon } from '~/common/components/icons/OpenRouterIcon';

 import type { IModelVendor } from '../IModelVendor';
-import type { OpenAIAccessSchema } from '../../transports/server/openai/openai.router';
-import type { VChatFunctionIn, VChatMessageIn, VChatMessageOrFunctionCallOut, VChatMessageOut } from '../../transports/chatGenerate';
+import type { OpenAIAccessSchema } from '../../server/openai/openai.router';

-import { LLMOptionsOpenAI, openAICallChatGenerate } from '../openai/openai.vendor';
+import { LLMOptionsOpenAI, ModelVendorOpenAI } from '../openai/openai.vendor';
 import { OpenAILLMOptions } from '../openai/OpenAILLMOptions';

 import { OpenRouterSourceSetup } from './OpenRouterSourceSetup';
@@ -59,10 +58,9 @@ export const ModelVendorOpenRouter: IModelVendor<SourceSetupOpenRouter, OpenAIAc
     heliKey: '',
     moderationCheck: false,
   }),
-  callChatGenerate(llm, messages: VChatMessageIn[], maxTokens?: number): Promise<VChatMessageOut> {
-    return openAICallChatGenerate(this.getTransportAccess(llm._source.setup), llm.options, messages, null, null, maxTokens);
-  },
-  callChatGenerateWF(llm, messages: VChatMessageIn[], functions: VChatFunctionIn[] | null, forceFunctionName: string | null, maxTokens?: number): Promise<VChatMessageOrFunctionCallOut> {
-    return openAICallChatGenerate(this.getTransportAccess(llm._source.setup), llm.options, messages, functions, forceFunctionName, maxTokens);
-  },
+
+  // OpenAI transport ('openrouter' dialect in 'access')
+  rpcUpdateModelsQuery: ModelVendorOpenAI.rpcUpdateModelsQuery,
+  rpcChatGenerateOrThrow: ModelVendorOpenAI.rpcChatGenerateOrThrow,
+  streamingChatGenerateOrThrow: ModelVendorOpenAI.streamingChatGenerateOrThrow,
 };
unifiedStreamingClient.ts (+13 -27)
@@ -1,11 +1,10 @@
 import { apiAsync } from '~/common/util/trpc.client';

-import type { DLLM, DLLMId } from '../store-llms';
-import { findVendorForLlmOrThrow } from '../vendors/vendors.registry';
+import type { ChatStreamingFirstOutputPacketSchema, ChatStreamingInputSchema } from '../server/llm.server.streaming';
+import type { DLLMId } from '../store-llms';
+import type { VChatFunctionIn, VChatMessageIn } from '../llm.client';

-import type { ChatStreamFirstPacketSchema, ChatStreamInputSchema } from './server/openai/openai.streaming';
-import type { OpenAIWire } from './server/openai/openai.wiretypes';
-import type { VChatMessageIn } from './chatGenerate';
+import type { OpenAIWire } from '../server/openai/openai.wiretypes';


 /**
@@ -15,27 +14,14 @@ import type { VChatMessageIn } from './chatGenerate';
  * Vendor-specific implementation is on our server backend (API) code. This function tries to be
  * as generic as possible.
  *
- * @param llmId LLM to use
- * @param messages the history of messages to send to the API endpoint
- * @param abortSignal used to initiate a client-side abort of the fetch request to the API endpoint
- * @param onUpdate callback when a piece of a message (text, model name, typing..) is received
+ * NOTE: onUpdate is callback when a piece of a message (text, model name, typing..) is received
  */
-export async function streamChat(
-  llmId: DLLMId,
-  messages: VChatMessageIn[],
-  abortSignal: AbortSignal,
-  onUpdate: (update: Partial<{ text: string, typing: boolean, originLLM: string }>, done: boolean) => void,
-): Promise<void> {
-  const { llm, vendor } = findVendorForLlmOrThrow(llmId);
-  const access = vendor.getTransportAccess(llm._source.setup) as ChatStreamInputSchema['access'];
-  return await vendorStreamChat(access, llm, messages, abortSignal, onUpdate);
-}
-
-
-async function vendorStreamChat<TSourceSetup = unknown, TLLMOptions = unknown>(
-  access: ChatStreamInputSchema['access'],
-  llm: DLLM<TSourceSetup, TLLMOptions>,
-  messages: VChatMessageIn[],
+export async function unifiedStreamingClient<TSourceSetup = unknown, TLLMOptions = unknown>(
+  access: ChatStreamingInputSchema['access'],
+  llmId: DLLMId,
+  llmOptions: TLLMOptions,
+  messages: VChatMessageIn[],
+  functions: VChatFunctionIn[] | null, forceFunctionName: string | null,
   abortSignal: AbortSignal,
   onUpdate: (update: Partial<{ text: string, typing: boolean, originLLM: string }>, done: boolean) => void,
 ) {
@@ -79,12 +65,12 @@ async function vendorStreamChat<TSourceSetup = unknown, TLLMOptions = unknown>(
   }

   // model params (llm)
-  const { llmRef, llmTemperature, llmResponseTokens } = (llm.options as any) || {};
+  const { llmRef, llmTemperature, llmResponseTokens } = (llmOptions as any) || {};
   if (!llmRef || llmTemperature === undefined || llmResponseTokens === undefined)
-    throw new Error(`Error in configuration for model ${llm.id}: ${JSON.stringify(llm.options)}`);
+    throw new Error(`Error in configuration for model ${llmId}: ${JSON.stringify(llmOptions)}`);

   // prepare the input, similarly to the tRPC openAI.chatGenerate
-  const input: ChatStreamInputSchema = {
+  const input: ChatStreamingInputSchema = {
     access,
     model: {
       id: llmRef,
@@ -131,7 +117,7 @@ async function vendorStreamChat<TSourceSetup = unknown, TLLMOptions = unknown>(
       incrementalText = incrementalText.substring(endOfJson + 1);
       parsedFirstPacket = true;
       try {
-        const parsed: ChatStreamFirstPacketSchema = JSON.parse(json);
+        const parsed: ChatStreamingFirstOutputPacketSchema = JSON.parse(json);
         onUpdate({ originLLM: parsed.model }, false);
       } catch (e) {
         // error parsing JSON, ignore
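
A sketch of how a caller might drive the unified streaming client; everything except the signature (which comes from the diff above) is illustrative:

```typescript
// Hypothetical driver: accumulate streamed text and stop on demand.
const controller = new AbortController();
let text = '';

await unifiedStreamingClient(
  access,            // from vendor.getTransportAccess(source.setup)
  llmId,             // e.g. a DLLMId of the form `${sourceId}-${modelId}`
  llmOptions,        // must carry llmRef, llmTemperature, llmResponseTokens
  [{ role: 'user', content: 'Hello!' }],
  null, null,        // no function calling on this turn
  controller.signal, // controller.abort() cancels the fetch mid-stream
  (update, done) => {
    if (update.originLLM) console.log('model:', update.originLLM);
    if (update.text) text += update.text;
    if (done) console.log('final:', text);
  },
);
```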
useLlmUpdateModels.ts (+47, new file)
@@ -0,0 +1,47 @@
+import type { IModelVendor } from './IModelVendor';
+import type { ModelDescriptionSchema } from '../server/llm.server.types';
+import { DLLM, DModelSource, useModelsStore } from '../store-llms';
+
+
+/**
+ * Hook that fetches the list of models from the vendor and updates the store,
+ * while returning the fetch state.
+ */
+export function useLlmUpdateModels<TSourceSetup, TAccess, TLLMOptions>(vendor: IModelVendor<TSourceSetup, TAccess, TLLMOptions>, access: TAccess, enabled: boolean, source: DModelSource<TSourceSetup>) {
+  return vendor.rpcUpdateModelsQuery(access, enabled, data => source && updateModelsFn(data, source));
+}
+
+
+function updateModelsFn<TSourceSetup>(data: { models: ModelDescriptionSchema[] }, source: DModelSource<TSourceSetup>) {
+  useModelsStore.getState().setLLMs(
+    data.models.map(model => modelDescriptionToDLLMOpenAIOptions(model, source)),
+    source.id,
+  );
+}
+
+function modelDescriptionToDLLMOpenAIOptions<TSourceSetup, TLLMOptions>(model: ModelDescriptionSchema, source: DModelSource<TSourceSetup>): DLLM<TSourceSetup, TLLMOptions> {
+  const maxOutputTokens = model.maxCompletionTokens || Math.round((model.contextWindow || 4096) / 2);
+  const llmResponseTokens = Math.round(maxOutputTokens / (model.maxCompletionTokens ? 2 : 4));
+  return {
+    id: `${source.id}-${model.id}`,
+
+    label: model.label,
+    created: model.created || 0,
+    updated: model.updated || 0,
+    description: model.description,
+    tags: [], // ['stream', 'chat'],
+    contextTokens: model.contextWindow,
+    maxOutputTokens: maxOutputTokens,
+    hidden: !!model.hidden,
+
+    sId: source.id,
+    _source: source,
+
+    options: {
+      llmRef: model.id,
+      // @ts-ignore FIXME: large assumption that this is LLMOptionsOpenAI object
+      llmTemperature: 0.5,
+      llmResponseTokens: llmResponseTokens,
+    },
+  };
+}
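
Usage, reconstructed from the SourceSetup components above (the component name is illustrative; the wiring matches the OpenAI setup diff):

```typescript
// How a source-setup component consumes the hook (sketch):
function ExampleSourceSetup(props: { sourceId: DModelSourceId }) {
  // read source + derived state via the companion hook (next file)
  const { source, sourceHasLLMs, access } = useSourceSetup(props.sourceId, ModelVendorOpenAI);

  // auto-fetch models when none are loaded yet; results land in useModelsStore
  const { isFetching, refetch, isError, error } =
    useLlmUpdateModels(ModelVendorOpenAI, access, !sourceHasLLMs, source);

  // ...render, typically with a SetupFormRefetchButton bound to refetch/isFetching/isError...
  return null; // placeholder
}
```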
useSourceSetup.ts (+35, new file)
@@ -0,0 +1,35 @@
+import { shallow } from 'zustand/shallow';
+
+import type { IModelVendor } from './IModelVendor';
+import { DModelSource, DModelSourceId, useModelsStore } from '../store-llms';
+
+
+/**
+ * Source-specific read/write - great time saver
+ */
+export function useSourceSetup<TSourceSetup, TAccess, TLLMOptions>(sourceId: DModelSourceId, vendor: IModelVendor<TSourceSetup, TAccess, TLLMOptions>) {
+
+  // invalidates only when the setup changes
+  const { updateSourceSetup, ...rest } = useModelsStore(state => {
+
+    // find the source (or null)
+    const source: DModelSource<TSourceSetup> | null = state.sources.find(source => source.id === sourceId) as DModelSource<TSourceSetup> ?? null;
+
+    // (safe) source-derived properties
+    const sourceSetupValid = (source?.setup && vendor?.validateSetup) ? vendor.validateSetup(source.setup as TSourceSetup) : false;
+    const sourceLLMs = source ? state.llms.filter(llm => llm._source === source) : [];
+    const access = vendor.getTransportAccess(source?.setup);
+
+    return {
+      source,
+      access,
+      sourceHasLLMs: !!sourceLLMs.length,
+      sourceSetupValid,
+      updateSourceSetup: state.updateSourceSetup,
+    };
+  }, shallow);
+
+  // convenience function for this source
+  const updateSetup = (partialSetup: Partial<TSourceSetup>) => updateSourceSetup<TSourceSetup>(sourceId, partialSetup);
+  return { ...rest, updateSetup };
+}
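
One design note: the selector above builds a fresh object on every store update, so the hook passes zustand's shallow equality to suppress redundant re-renders. The same pattern in isolation (the store and field names are generic placeholders, not from the codebase):

```typescript
import { create } from 'zustand';
import { shallow } from 'zustand/shallow';

const useCounterStore = create(() => ({ a: 1, b: 2, c: 3 }));

function useAB() {
  // the selector returns a new object identity on each run; without `shallow`,
  // every store write (even one that only touches c) would re-render the consumer
  return useCounterStore(state => ({ a: state.a, b: state.b }), shallow);
}
```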
vendors.registry.ts (+23 -8)
@@ -1,5 +1,6 @@
 import { ModelVendorAnthropic } from './anthropic/anthropic.vendor';
 import { ModelVendorAzure } from './azure/azure.vendor';
+import { ModelVendorGemini } from './gemini/gemini.vendor';
 import { ModelVendorLocalAI } from './localai/localai.vendor';
 import { ModelVendorMistral } from './mistral/mistral.vendor';
 import { ModelVendorOllama } from './ollama/ollama.vendor';
@@ -7,20 +8,32 @@ import { ModelVendorOoobabooga } from './oobabooga/oobabooga.vendor';
 import { ModelVendorOpenAI } from './openai/openai.vendor';
 import { ModelVendorOpenRouter } from './openrouter/openrouter.vendor';

-import type { IModelVendor, ModelVendorId, ModelVendorRegistryType } from './IModelVendor';
+import type { IModelVendor } from './IModelVendor';
 import { DLLMId, DModelSource, DModelSourceId, findLLMOrThrow } from '../store-llms';

+export type ModelVendorId =
+  | 'anthropic'
+  | 'azure'
+  | 'googleai'
+  | 'localai'
+  | 'mistral'
+  | 'ollama'
+  | 'oobabooga'
+  | 'openai'
+  | 'openrouter';
+
 /** Global: Vendor Instances Registry **/
-const MODEL_VENDOR_REGISTRY: ModelVendorRegistryType = {
+const MODEL_VENDOR_REGISTRY: Record<ModelVendorId, IModelVendor> = {
   anthropic: ModelVendorAnthropic,
   azure: ModelVendorAzure,
+  googleai: ModelVendorGemini,
   localai: ModelVendorLocalAI,
   mistral: ModelVendorMistral,
   ollama: ModelVendorOllama,
   oobabooga: ModelVendorOoobabooga,
   openai: ModelVendorOpenAI,
   openrouter: ModelVendorOpenRouter,
-};
+} as Record<string, IModelVendor>;

 const MODEL_VENDOR_DEFAULT: ModelVendorId = 'openai';

@@ -31,13 +44,15 @@ export function findAllVendors(): IModelVendor[] {
   return modelVendors;
 }

-export function findVendorById(vendorId?: ModelVendorId): IModelVendor | null {
-  return vendorId ? (MODEL_VENDOR_REGISTRY[vendorId] ?? null) : null;
+export function findVendorById<TSourceSetup = unknown, TAccess = unknown, TLLMOptions = unknown>(
+  vendorId?: ModelVendorId,
+): IModelVendor<TSourceSetup, TAccess, TLLMOptions> | null {
+  return vendorId ? (MODEL_VENDOR_REGISTRY[vendorId] as IModelVendor<TSourceSetup, TAccess, TLLMOptions>) ?? null : null;
 }

-export function findVendorForLlmOrThrow(llmId: DLLMId) {
-  const llm = findLLMOrThrow(llmId);
-  const vendor = findVendorById(llm?._source.vId);
+export function findVendorForLlmOrThrow<TSourceSetup = unknown, TAccess = unknown, TLLMOptions = unknown>(llmId: DLLMId) {
+  const llm = findLLMOrThrow<TSourceSetup, TLLMOptions>(llmId);
+  const vendor = findVendorById<TSourceSetup, TAccess, TLLMOptions>(llm?._source.vId);
   if (!vendor) throw new Error(`callChat: Vendor not found for LLM ${llmId}`);
   return { llm, vendor };
 }
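
With the typed lookups in place, resolving an LLM to its vendor becomes a one-liner. A sketch (the id value is illustrative; the id format follows the `${source.id}-${model.id}` convention above):

```typescript
// Sketch: from an LLM id to its vendor's transport entry points.
const { llm, vendor } = findVendorForLlmOrThrow('openai-gpt-4');

const access = vendor.getTransportAccess(llm._source.setup);
// vendor.rpcUpdateModelsQuery(access, enabled, onSuccess)   -> model listing (react-query)
// vendor.rpcChatGenerateOrThrow(access, llm.options, msgs, fns, forceFn, maxTokens)
// vendor.streamingChatGenerateOrThrow(...)                  -> incremental updates
```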
@@ -3,9 +3,10 @@ import { createTRPCRouter } from './trpc.server';
 import { backendRouter } from '~/modules/backend/backend.router';
 import { elevenlabsRouter } from '~/modules/elevenlabs/elevenlabs.router';
 import { googleSearchRouter } from '~/modules/google/search.router';
-import { llmAnthropicRouter } from '~/modules/llms/transports/server/anthropic/anthropic.router';
-import { llmOllamaRouter } from '~/modules/llms/transports/server/ollama/ollama.router';
-import { llmOpenAIRouter } from '~/modules/llms/transports/server/openai/openai.router';
+import { llmAnthropicRouter } from '~/modules/llms/server/anthropic/anthropic.router';
+import { llmGeminiRouter } from '~/modules/llms/server/gemini/gemini.router';
+import { llmOllamaRouter } from '~/modules/llms/server/ollama/ollama.router';
+import { llmOpenAIRouter } from '~/modules/llms/server/openai/openai.router';
 import { prodiaRouter } from '~/modules/prodia/prodia.router';
 import { ytPersonaRouter } from '../../apps/personas/ytpersona.router';

@@ -17,6 +18,7 @@ export const appRouterEdge = createTRPCRouter({
   elevenlabs: elevenlabsRouter,
   googleSearch: googleSearchRouter,
   llmAnthropic: llmAnthropicRouter,
+  llmGemini: llmGeminiRouter,
   llmOllama: llmOllamaRouter,
   llmOpenAI: llmOpenAIRouter,
   prodia: prodiaRouter,
+8 -2
@@ -5,8 +5,8 @@ export const env = createEnv({
   server: {

     // Backend Postgres, for optional storage via Prisma
-    POSTGRES_PRISMA_URL: z.string().url().optional(),
-    POSTGRES_URL_NON_POOLING: z.string().url().optional(),
+    POSTGRES_PRISMA_URL: z.string().optional(),
+    POSTGRES_URL_NON_POOLING: z.string().optional(),

     // LLM: OpenAI
     OPENAI_API_KEY: z.string().optional(),
@@ -21,6 +21,9 @@ export const env = createEnv({
     ANTHROPIC_API_KEY: z.string().optional(),
     ANTHROPIC_API_HOST: z.string().url().optional(),

+    // LLM: Google AI's Gemini
+    GEMINI_API_KEY: z.string().optional(),
+
     // LLM: Mistral
     MISTRAL_API_KEY: z.string().optional(),

@@ -62,6 +65,9 @@ export const env = createEnv({
     throw new Error('Invalid environment variable');
   },

+  // matches user expectations - see https://github.com/enricoros/big-AGI/issues/279
+  emptyStringAsUndefined: true,
+
   // with Next.JS >= 13.4.4 we'd only need to destructure client variables
   experimental__runtimeEnv: {},
 });
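
Two deployment-facing effects follow from this file: a new GEMINI_API_KEY server variable, and (via emptyStringAsUndefined) blank variables now behaving as unset instead of failing validation. A sample fragment, with placeholder values:

```bash
# .env.local (placeholders; set only what you use)
OPENAI_API_KEY=sk-your-key-here
GEMINI_API_KEY=your-google-ai-key

# a blank assignment now reads as "unset" rather than as an invalid URL:
ANTHROPIC_API_HOST=
```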