Mirror of https://github.com/enricoros/big-AGI.git, synced 2026-05-10 21:50:14 -07:00
Compare commits
88 Commits
SHA1: 0fc83cf6f5, 2949feccd5, d6f1c2da81, fabb433fde, b57445eb14, 5f8f4aba78, d693cdaeba, 39fbcfd97b, 7694bc3d52, 7f21b2ac3d, fdb66da1a7, 6b62a6733b, 5d62056807, efff7126af, 45046c70ed, 7b5b852793, 9952b757b8, b08ecc9012, bc5a38fa89, bee49a4b1c, 0ece1ce58c, fd897b55b2, dd41a402d0, 3f9defd18c, 49c77f5a10, 6b2bfa6060, 8e3f247bfb, 201e3a7252, 044ed4df79, 0df7297cca, 453a3e5751, 34c1c425b9, e0a010189f, 7a07f10ed1, 33cb2b84b2, 3adec85e1f, 18cfe5e296, 566ba366b4, 7ed653b315, cb333c33d7, 22ba37074b, 84d7b7644a, 71445dafc8, 66a5ad7f00, 09f80adfaa, 9febd97065, 5219f9928d, aec9f4665f, db48465204, c2c858730a, 402bde9a81, ba1c0ba0d9, 084d77cd78, 30c17a9b73, 2442463da3, 84a3e8cfdb, 6ae440d252, c0c724afc1, a265112ce1, 75605ed408, ad38ff4157, 08c60e53b1, d0dcb2ac02, fbeb604b26, c4f3b1df77, 5a1f9caaac, 2fc70d5e95, 43adadef78, 96f6e7628b, 32ad82bcee, 3d72aec369, d244ee2cca, cc8a235ae3, ae348812de, 6053636f66, f2e2aee672, 11cbb2bbf0, 30bd19d6ce, d0b5c02062, 771192e406, 13f502bd76, 11055b12ca, d0ea96eec0, 02eafc03f1, 33d07a0313, 763b852148, d5b0617fd7, e3ce83674c
@@ -65,7 +65,11 @@ I need the following from you:

 ### GitHub release

-Now paste the former release (or 1.5.0 which was accurate and great), including the new contributors and
+```markdown
+Please create the 1.2.3 Release Notes for GitHub. The following were the Release Notes for 1.1.0. Use a truthful and honest tone, understanding that people's time and attention span is short. Today is 2023-12-20.
+```
+
+Now paste-attachment the former release notes (or 1.5.0 which was accurate and great), including the new contributors and
 some stats (# of commits, etc.), and roll it for the new release.

 ### Discord announcement
@@ -13,7 +13,7 @@ on:
 push:
   branches:
     - main
-    - main-stable # Trigger on pushes to the main-stable branch
+    #- main-stable # Disabled as the v* tag is used for stable releases
   tags:
     - 'v*' # Trigger on version tags (e.g., v1.7.0)
@@ -1,8 +1,8 @@
 # BIG-AGI 🧠✨

-Welcome to big-AGI 👋, the GPT application for professionals that need form, function,
-simplicity, and speed. Powered by the latest models from 7 vendors, including
-open-source, `big-AGI` offers best-in-class Voice and Chat with AI Personas,
+Welcome to big-AGI 👋, the GPT application for professionals that need function, form,
+simplicity, and speed. Powered by the latest models from 8 vendors and
+open-source model servers, `big-AGI` offers best-in-class Voice and Chat with AI Personas,
 visualizations, coding, drawing, calling, and quite more -- all in a polished UX.

 Pros use big-AGI. 🚀 Developers love big-AGI. 🤖

@@ -11,7 +11,7 @@ Pros use big-AGI. 🚀 Developers love big-AGI. 🤖

 Or fork & run on Vercel

-[](https://vercel.com/new/clone?repository-url=https%3A%2F%2Fgithub.com%2Fenricoros%2Fbig-agi&env=OPENAI_API_KEY,OPENAI_API_HOST&envDescription=OpenAI%20KEY%20for%20your%20deployment.%20Set%20HOST%20only%20if%20non-default.)
+[](https://vercel.com/new/clone?repository-url=https%3A%2F%2Fgithub.com%2Fenricoros%2Fbig-agi&env=OPENAI_API_KEY&envDescription=Backend%20API%20keys%2C%20optional%20and%20may%20be%20overridden%20by%20the%20UI.&envLink=https%3A%2F%2Fgithub.com%2Fenricoros%2Fbig-AGI%2Fblob%2Fmain%2Fdocs%2Fenvironment-variables.md&project-name=big-agi)

 ## 👉 [roadmap](https://github.com/users/enricoros/projects/4/views/2)

@@ -21,7 +21,19 @@ shows the current developments and future ideas.

 - Got a suggestion? [_Add your roadmap ideas_](https://github.com/enricoros/big-agi/issues/new?&template=roadmap-request.md)
 - Want to contribute? [_Pick up a task!_](https://github.com/users/enricoros/projects/4/views/4) - _easy_ to _pro_

-### What's New in 1.7.0 · Dec 10, 2023 · Attachment Theory 🌟
+### What's New in 1.8.0 · Dec 20, 2023 · To The Moon And Back · 🚀🌕🔙
+
+- **Google Gemini Support**: Use the newest Google models. [#275](https://github.com/enricoros/big-agi/issues/275)
+- **Mistral Platform**: Mixtral and future models support. [#273](https://github.com/enricoros/big-agi/issues/273)
+- **Diagram Instructions**. Thanks to @joriskalz! [#280](https://github.com/enricoros/big-agi/pull/280)
+- Ollama Chats: Enhanced chatting experience. [#270](https://github.com/enricoros/big-agi/issues/270)
+- Mac Shortcuts Fix: Improved UX on Mac
+- **Single-Tab Mode**: Data integrity with single window. [#268](https://github.com/enricoros/big-agi/issues/268)
+- **Updated Models**: Latest Ollama (v0.1.17) and OpenRouter models
+- Official Downloads: Easy access to the latest big-AGI on [big-AGI.com](https://big-agi.com)
+- For developers: [troubleshot networking](https://github.com/enricoros/big-AGI/issues/276#issuecomment-1858591483), fixed Vercel deployment, cleaned up the LLMs/Streaming framework
+
+### What's New in 1.7.0 · Dec 11, 2023

 - **Attachments System Overhaul**: Drag, paste, link, snap, text, images, PDFs and more. [#251](https://github.com/enricoros/big-agi/issues/251)
 - **Desktop Webcam Capture**: Image capture now available as Labs feature. [#253](https://github.com/enricoros/big-agi/issues/253)

@@ -145,7 +157,7 @@ Please refer to the [Cloudflare deployment documentation](docs/deploy-cloudflare

 Create your GitHub fork, create a Vercel project over that fork, and deploy it. Or press the button below for convenience.

-[](https://vercel.com/new/clone?repository-url=https%3A%2F%2Fgithub.com%2Fenricoros%2Fbig-agi&env=OPENAI_API_KEY,OPENAI_API_HOST&envDescription=OpenAI%20KEY%20for%20your%20deployment.%20Set%20HOST%20only%20if%20non-default.)
+[](https://vercel.com/new/clone?repository-url=https%3A%2F%2Fgithub.com%2Fenricoros%2Fbig-agi&env=OPENAI_API_KEY&envDescription=Backend%20API%20keys%2C%20optional%20and%20may%20be%20overridden%20by%20the%20UI.&envLink=https%3A%2F%2Fgithub.com%2Fenricoros%2Fbig-AGI%2Fblob%2Fmain%2Fdocs%2Fenvironment-variables.md&project-name=big-agi)

 ## Integrations:
@@ -1,2 +1,2 @@
 export const runtime = 'edge';
-export { openaiStreamingRelayHandler as POST } from '~/modules/llms/transports/server/openai/openai.streaming';
+export { llmStreamingRelayHandler as POST } from '~/modules/llms/server/llm.server.streaming';

+1 -1
@@ -6,7 +6,7 @@ version: '3.9'

 services:
   big-agi:
-    image: ghcr.io/enricoros/big-agi:main
+    image: ghcr.io/enricoros/big-agi:latest
     ports:
       - "3000:3000"
     env_file:
+15 -3
@@ -5,12 +5,24 @@ by release.

 - For the live roadmap, please see [the GitHub project](https://github.com/users/enricoros/projects/4/views/2)

-### 1.8.0 - Dec 2023
+### 1.9.0 - Dec 2023

 - work in progress: [big-AGI open roadmap](https://github.com/users/enricoros/projects/4/views/2), [help here](https://github.com/users/enricoros/projects/4/views/4)
-- milestone: [1.8.0](https://github.com/enricoros/big-agi/milestone/8)
+- milestone: [1.9.0](https://github.com/enricoros/big-agi/milestone/9)

-### What's New in 1.7.0 · Dec 10, 2023 · Attachment Theory 🌟
+### What's New in 1.8.0 · Dec 20, 2023 · To The Moon And Back · 🚀🌕🔙
+
+- **Google Gemini Support**: Use the newest Google models. [#275](https://github.com/enricoros/big-agi/issues/275)
+- **Mistral Platform**: Mixtral and future models support. [#273](https://github.com/enricoros/big-agi/issues/273)
+- **Diagram Instructions**. Thanks to @joriskalz! [#280](https://github.com/enricoros/big-agi/pull/280)
+- Ollama Chats: Enhanced chatting experience. [#270](https://github.com/enricoros/big-agi/issues/270)
+- Mac Shortcuts Fix: Improved UX on Mac
+- **Single-Tab Mode**: Data integrity with single window. [#268](https://github.com/enricoros/big-agi/issues/268)
+- **Updated Models**: Latest Ollama (v0.1.17) and OpenRouter models
+- Official Downloads: Easy access to the latest big-AGI on [big-AGI.com](https://big-agi.com)
+- For developers: [troubleshot networking](https://github.com/enricoros/big-AGI/issues/276#issuecomment-1858591483), fixed Vercel deployment, cleaned up the LLMs/Streaming framework
+
+### What's New in 1.7.0 · Dec 11, 2023 · Attachment Theory

 - **Attachments System Overhaul**: Drag, paste, link, snap, text, images, PDFs and more. [#251](https://github.com/enricoros/big-agi/issues/251)
 - **Desktop Webcam Capture**: Image capture now available as Labs feature. [#253](https://github.com/enricoros/big-agi/issues/253)
@@ -30,5 +30,5 @@ For instance with [Use luna-ai-llama2 with docker compose](https://localai.io/ba

 > NOTE: LocalAI does not list details about the models. Every model is assumed to be
 > capable of chatting, and with a context window of 4096 tokens.
-> Please update the [src/modules/llms/transports/server/openai/models.data.ts](../src/modules/llms/transports/server/openai/models.data.ts)
+> Please update the [src/modules/llms/transports/server/openai/models.data.ts](../src/modules/llms/server/openai/models.data.ts)
 > file with the mapping information between LocalAI model IDs and names/descriptions/tokens, etc.
+33 -16
@@ -5,31 +5,46 @@ This guide helps you connect [Ollama](https://ollama.ai) [models](https://ollama
 experience. The integration brings the popular big-AGI features to Ollama, including: voice chats,
 editing tools, models switching, personas, and more.

+_Last updated Dec 16, 2023_
+
 ## Quick Integration Guide

-1. **Ensure Ollama API Server is Running**: Before starting, make sure your Ollama API server is up and running.
-2. **Add Ollama as a Model Source**: In `big-AGI`, navigate to the **Models** section, select **Add a model source**, and choose **Ollama**.
-3. **Enter Ollama Host URL**: Provide the Ollama Host URL where the API server is accessible (e.g., `http://localhost:11434`).
-4. **Refresh Model List**: Once connected, refresh the list of available models to include the Ollama models.
-5. **Start Using AI Personas**: Select an Ollama model and begin interacting with AI personas tailored to your needs.
+1. **Ensure Ollama API Server is Running**: Follow the official instructions to get Ollama up and running on your machine
+   - For detailed instructions on setting up the Ollama API server, please refer to the
+     [Ollama download page](https://ollama.ai/download) and [instructions for linux](https://github.com/jmorganca/ollama/blob/main/docs/linux.md).
+2. **Add Ollama as a Model Source**: In `big-AGI`, navigate to the **Models** section, select **Add a model source**, and choose **Ollama**
+3. **Enter Ollama Host URL**: Provide the Ollama Host URL where the API server is accessible (e.g., `http://localhost:11434`)
+4. **Refresh Model List**: Once connected, refresh the list of available models to include the Ollama models
+   > Optional: use the Ollama Admin interface to see which models are available and 'Pull' them in your local machine. Note
+   > that this operation will likely timeout due to Edge Functions timeout on the big-AGI server while pulling, and
+   > you'll have to press the 'Pull' button again, until a green message appears.
+5. **Chat with Ollama models**: select an Ollama model and begin chatting with AI personas

-### Ollama: installation and Setup
+**Visual Configuration Guide**:

-For detailed instructions on setting up the Ollama API server, please refer to the
-[Ollama download page](https://ollama.ai/download) and [instructions for linux](https://github.com/jmorganca/ollama/blob/main/docs/linux.md).
+* After adding the `Ollama` model vendor, entering the IP address of an Ollama server, and refreshing models:<br/>
+  <img src="pixels/config-ollama-1-models.png" alt="config-local-ollama-1-models.png" width="320">

-### Visual Guide
+* The `Ollama` admin panel, with the `Pull` button highlighted, after pulling the "Yi" model:<br/>
+  <img src="pixels/config-ollama-2-admin-pull.png" alt="config-local-ollama-2-admin-pull.png" width="320">

-* After adding the `Ollama` model vendor, entering the IP address of an Ollama server, and refreshing models:
-  <img src="pixels/config-ollama-1-models.png" alt="config-local-ollama-1-models.png" style="max-width: 320px;">
+* You can now switch model/persona dynamically and text/voice chat with the models:<br/>
+  <img src="pixels/config-ollama-3-chat.png" alt="config-local-ollama-3-chat.png" width="320">

-* The `Ollama` admin panel, with the `Pull` button highlighted, after pulling the "Yi" model:
-  <img src="pixels/config-ollama-2-admin-pull.png" alt="config-local-ollama-2-admin-pull.png" style="max-width: 320px;">
-  <br/>
+### ⚠️ Network Troubleshooting

-* You can now switch model/persona dynamically and text/voice chat with the models:
-  <img src="pixels/config-ollama-3-chat.png" alt="config-local-ollama-3-chat.png" style="max-width: 320px;">
+If you get errors about the server having trouble connecting with Ollama, please see
+[this message](https://github.com/enricoros/big-AGI/issues/276#issuecomment-1858591483) on Issue #276.
+
+And in brief, make sure the Ollama endpoint is accessible from the servers where you run big-AGI (which could
+be localhost or cloud servers).
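Step 3 of the guide above asks for the Ollama Host URL in the UI; when big-AGI itself runs server-side (e.g. in a container), the same endpoint can be pre-set through the `OLLAMA_API_HOST` environment variable documented elsewhere in this changeset. A minimal sketch, assuming Ollama on its default port on the same host (values are illustrative, not from the diff):

```
# .env sketch - hypothetical values; point to wherever your Ollama server listens
OLLAMA_API_HOST=http://localhost:11434
```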
 <br/>

 ### Advanced: Model parameters
@@ -68,6 +83,8 @@ Then, edit the nginx configuration file `/etc/nginx/sites-enabled/default` and a

 Reach out to our community if you need help with this.

+<br/>
+
 ### Community and Support

 Join our community to share your experiences, get help, and discuss best practices:
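The hunk above continues a section about fronting Ollama with nginx via `/etc/nginx/sites-enabled/default`; the exact directives are not part of this diff. A minimal reverse-proxy sketch, assuming Ollama on its default port 11434 and a hypothetical server name, could look like:

```nginx
# Sketch only - server_name and upstream port are assumptions, not from the diff
server {
    listen 80;
    server_name ollama.example.com;

    location / {
        proxy_pass http://127.0.0.1:11434;  # default Ollama port
        proxy_set_header Host $host;
    }
}
```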
@@ -78,4 +95,4 @@ Join our community to share your experiences, get help, and discuss best practic

 ---

 `big-AGI` is committed to providing a powerful, intuitive, and privacy-respecting AI experience.
-We are excited for you to explore the possibilities with Ollama models. Happy creating!
+We are excited for you to explore the possibilities with Ollama models. Happy creating!
+37 -20
@@ -21,33 +21,23 @@ Docker ensures faster development cycles, easier collaboration, and seamless env
 ```
 4. Browse to [http://localhost:3000](http://localhost:3000)

-## Documentation
+<br/>

-The big-AGI repository includes a Dockerfile and a GitHub Actions workflow for building and publishing a
-Docker image of the application.
+## Run Official Containers 📦

-### Dockerfile
+`big-AGI` is pre-built from source code and published as a Docker image on the GitHub Container Registry (ghcr).
+The build process is transparent, and happens via GitHub Actions, as described in the
+[`.github/workflows/docker-image.yml`](../.github/workflows/docker-image.yml) file.

-The [`Dockerfile`](../Dockerfile) describes how to create a Docker image. It establishes a Node.js environment,
-installs dependencies, and creates a production-ready version of the application as a local container.
+### Official Images: [ghcr.io/enricoros/big-agi](https://github.com/enricoros/big-agi/pkgs/container/big-agi)

-### Official container images
+#### Run using *docker* 🚀

-The [`.github/workflows/docker-image.yml`](../.github/workflows/docker-image.yml) file automates the
-building and publishing of the Docker images to the GitHub Container Registry (ghcr) when changes are
-pushed to the `main` branch.
-
-Official pre-built containers: [ghcr.io/enricoros/big-agi](https://github.com/enricoros/big-agi/pkgs/container/big-agi)
-
-Run official pre-built containers:
 ```bash
-docker run -d -p 3000:3000 ghcr.io/enricoros/big-agi
+docker run -d -p 3000:3000 ghcr.io/enricoros/big-agi:latest
 ```

-### Run official containers
+#### Run using *docker-compose* 🚀

-In addition, the repository also includes a `docker-compose.yaml` file, configured to run the pre-built
-'ghcr image'. This file is used to define the `big-agi` service, the ports to expose, and the command to run.
-
 If you have Docker Compose installed, you can run the Docker container with `docker-compose up`
 to pull the Docker image (if it hasn't been pulled already) and start a Docker container. If you want to
@@ -57,4 +47,31 @@ update the image to the latest version, you can run `docker-compose pull` before
 docker-compose up -d
 ```

-Leverage Docker's capabilities for a reliable and efficient big-AGI deployment.
+### Make Local Services Visible to Docker 🌐
+
+To make local services running on your host machine accessible to a Docker container, such as a
+[Browseless](./config-browse.md) service or a local API, you can follow this simplified guide:
+
+| Operating System | Steps to Make Local Services Visible to Docker |
+|:-----------------|:-----------------------------------------------|
+| Windows and macOS | Use the special DNS name `host.docker.internal` to refer to the host machine from within the Docker container. No additional network configuration is required. Access local services using `host.docker.internal:<PORT>`. |
+| Linux | Two options: *A*. Use <ins>--network="host"</ins> (`docker run --network="host" -d big-agi`) when running the Docker container to merge the container within the host network stack; however, this reduces container isolation. Alternatively: *B*. Connect to local services <ins>using the host's IP address</ins> directly, as host.docker.internal is not available by default on Linux. |
+
+<br/>
+
+### More Information
+
+The [`Dockerfile`](../Dockerfile) describes how to create a Docker image. It establishes a Node.js environment,
+installs dependencies, and creates a production-ready version of the application as a local container.
+
+The [`docker-compose.yaml`](../docker-compose.yaml) file is configured to run the
+official image (big-agi:latest). This file is used to define the `big-agi` service, to expose
+port 3000 on the host, and launch big-AGI within the container (startup command).
+
+The [`.github/workflows/docker-image.yml`](../.github/workflows/docker-image.yml) file is used
+to build the Official Docker images and publish them to the GitHub Container Registry (ghcr).
+The build process is transparent and happens via GitHub Actions.
+
+<br/>
+
+Leverage Docker's capabilities for a reliable and efficient big-AGI deployment!
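The "Make Local Services Visible to Docker" table above can be condensed into a tiny helper. This is a sketch only, under the assumption that on Linux the container reaches the host via the default `docker0` bridge gateway (`172.17.0.1`) when `host.docker.internal` is not available:

```shell
#!/bin/sh
# Pick the address a container should use to reach services running on the host.
case "$(uname -s)" in
  Linux) HOST_ALIAS="172.17.0.1" ;;            # default docker0 gateway (assumption; check `ip addr show docker0`)
  *)     HOST_ALIAS="host.docker.internal" ;;  # Windows / macOS Docker Desktop
esac
echo "Reach host services at: ${HOST_ALIAS}:<PORT>"
```

Option A from the table (`--network="host"`) sidesteps this entirely, at the cost of container isolation.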
@@ -12,7 +12,7 @@ version: '3.9'

 services:
   big-agi:
-    image: ghcr.io/enricoros/big-agi:main
+    image: ghcr.io/enricoros/big-agi:latest
     ports:
       - "3000:3000"
     env_file:
@@ -24,6 +24,8 @@ AZURE_OPENAI_API_ENDPOINT=
 AZURE_OPENAI_API_KEY=
 ANTHROPIC_API_KEY=
 ANTHROPIC_API_HOST=
+GEMINI_API_KEY=
+MISTRAL_API_KEY=
 OLLAMA_API_HOST=
 OPENROUTER_API_KEY=

@@ -45,7 +47,7 @@ PUPPETEER_WSS_ENDPOINT=

 # Backend Analytics
 BACKEND_ANALYTICS=

-# Backend HTTP Basic Authentication
+# Backend HTTP Basic Authentication (see `deploy-authentication.md` for turning on authentication)
 HTTP_BASIC_AUTH_USERNAME=
 HTTP_BASIC_AUTH_PASSWORD=
 ```
@@ -79,6 +81,8 @@ requiring the user to enter an API key
 | `AZURE_OPENAI_API_KEY` | Azure OpenAI API key, see [config-azure-openai.md](config-azure-openai.md) | Optional, but if set `AZURE_OPENAI_API_ENDPOINT` must also be set |
 | `ANTHROPIC_API_KEY` | The API key for Anthropic | Optional |
 | `ANTHROPIC_API_HOST` | Changes the backend host for the Anthropic vendor, to enable platforms such as [config-aws-bedrock.md](config-aws-bedrock.md) | Optional |
+| `GEMINI_API_KEY` | The API key for Google AI's Gemini | Optional |
+| `MISTRAL_API_KEY` | The API key for Mistral | Optional |
 | `OLLAMA_API_HOST` | Changes the backend host for the Ollama vendor. See [config-ollama.md](config-ollama.md) | |
 | `OPENROUTER_API_KEY` | The API key for OpenRouter | Optional |

@@ -113,10 +117,7 @@ Enable the app to Talk, Draw, and Google things up.
 | `PUPPETEER_WSS_ENDPOINT` | Puppeteer WebSocket endpoint - used for browsing, etc. |
 | **Backend** | |
 | `BACKEND_ANALYTICS` | Semicolon-separated list of analytics flags (see backend.analytics.ts). Flags: `domain` logs the responding domain. |
-| `HTTP_BASIC_AUTH_USERNAME` | Username for HTTP Basic Authentication. See the [Authentication](deploy-authentication.md) guide. |
+| `HTTP_BASIC_AUTH_USERNAME` | See the [Authentication](deploy-authentication.md) guide. Username for HTTP Basic Authentication. |
 | `HTTP_BASIC_AUTH_PASSWORD` | Password for HTTP Basic Authentication. |

 ---
Binary file not shown. (After: 79 KiB)

Generated file (+494 -256): diff suppressed because it is too large.
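The `HTTP_BASIC_AUTH_USERNAME` / `HTTP_BASIC_AUTH_PASSWORD` variables described earlier gate a deployment behind HTTP Basic Authentication, where the client sends a base64-encoded `user:password` pair. A sketch with hypothetical credentials (substitute the values configured on the server):

```shell
# Hypothetical credentials - not from this changeset
AUTH_HEADER="Authorization: Basic $(printf 'myuser:mypass' | base64)"
echo "${AUTH_HEADER}"
# it can then be passed to a client, e.g.:  curl -H "${AUTH_HEADER}" http://localhost:3000/
```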
+17 -17
@@ -1,6 +1,6 @@
 {
   "name": "big-agi",
-  "version": "1.7.0",
+  "version": "1.8.0",
   "private": true,
   "scripts": {
     "dev": "next dev",
@@ -18,13 +18,13 @@
     "@emotion/react": "^11.11.1",
     "@emotion/server": "^11.11.0",
     "@emotion/styled": "^11.11.0",
-    "@mui/icons-material": "^5.14.18",
-    "@mui/joy": "^5.0.0-beta.15",
-    "@next/bundle-analyzer": "^14.0.3",
-    "@prisma/client": "^5.6.0",
+    "@mui/icons-material": "^5.15.0",
+    "@mui/joy": "^5.0.0-beta.18",
+    "@next/bundle-analyzer": "^14.0.4",
+    "@prisma/client": "^5.7.0",
     "@sanity/diff-match-patch": "^3.1.1",
     "@t3-oss/env-nextjs": "^0.7.1",
-    "@tanstack/react-query": "^4.36.1",
+    "@tanstack/react-query": "~4.36.1",
     "@trpc/client": "^10.44.1",
     "@trpc/next": "^10.44.1",
     "@trpc/react-query": "^10.44.1",
@@ -33,8 +33,8 @@
     "browser-fs-access": "^0.35.0",
     "eventsource-parser": "^1.1.1",
     "idb-keyval": "^6.2.1",
-    "next": "^14.0.3",
-    "pdfjs-dist": "4.0.189",
+    "next": "^14.0.4",
+    "pdfjs-dist": "4.0.269",
     "plantuml-encoder": "^1.4.0",
     "prismjs": "^1.29.0",
     "react": "^18.2.0",
@@ -47,23 +47,23 @@
     "tesseract.js": "^5.0.3",
     "uuid": "^9.0.1",
     "zod": "^3.22.4",
-    "zustand": "~4.3.9"
+    "zustand": "^4.4.7"
   },
   "devDependencies": {
     "@cloudflare/puppeteer": "^0.0.5",
-    "@types/node": "^20.10.0",
+    "@types/node": "^20.10.4",
     "@types/plantuml-encoder": "^1.4.2",
     "@types/prismjs": "^1.26.3",
-    "@types/react": "^18.2.38",
+    "@types/react": "^18.2.45",
     "@types/react-dom": "^18.2.17",
-    "@types/react-katex": "^3.0.3",
+    "@types/react-katex": "^3.0.4",
     "@types/react-timeago": "^4.1.6",
     "@types/uuid": "^9.0.7",
-    "eslint": "^8.54.0",
-    "eslint-config-next": "^14.0.3",
-    "prettier": "^3.1.0",
-    "prisma": "^5.6.0",
-    "typescript": "^5.3.2"
+    "eslint": "^8.55.0",
+    "eslint-config-next": "^14.0.4",
+    "prettier": "^3.1.1",
+    "prisma": "^5.7.0",
+    "typescript": "^5.3.3"
   },
   "engines": {
     "node": "^20.0.0 || ^18.0.0"
+10 -7
@@ -11,6 +11,7 @@ import '~/common/styles/CodePrism.css';
 import '~/common/styles/GithubMarkdown.css';

 import { ProviderBackend } from '~/common/state/ProviderBackend';
+import { ProviderSingleTab } from '~/common/state/ProviderSingleTab';
 import { ProviderSnacks } from '~/common/state/ProviderSnacks';
 import { ProviderTRPCQueryClient } from '~/common/state/ProviderTRPCQueryClient';
 import { ProviderTheming } from '~/common/state/ProviderTheming';
@@ -25,13 +26,15 @@ const MyApp = ({ Component, emotionCache, pageProps }: MyAppProps) =>
       </Head>

       <ProviderTheming emotionCache={emotionCache}>
-        <ProviderTRPCQueryClient>
-          <ProviderSnacks>
-            <ProviderBackend>
-              <Component {...pageProps} />
-            </ProviderBackend>
-          </ProviderSnacks>
-        </ProviderTRPCQueryClient>
+        <ProviderSingleTab>
+          <ProviderTRPCQueryClient>
+            <ProviderSnacks>
+              <ProviderBackend>
+                <Component {...pageProps} />
+              </ProviderBackend>
+            </ProviderSnacks>
+          </ProviderTRPCQueryClient>
+        </ProviderSingleTab>
       </ProviderTheming>

       <VercelAnalytics debug={false} />
@@ -0,0 +1,98 @@
import * as React from 'react';
import { useRouter } from 'next/router';

import { Box, Typography } from '@mui/joy';

import { useModelsStore } from '~/modules/llms/store-llms';

import { AppLayout } from '~/common/layout/AppLayout';
import { InlineError } from '~/common/components/InlineError';
import { apiQuery } from '~/common/util/trpc.client';
import { navigateToIndex } from '~/common/app.routes';
import { openLayoutModelsSetup } from '~/common/layout/store-applayout';


function CallbackOpenRouterPage(props: { openRouterCode: string | undefined }) {

  // external state
  const { data, isError, error, isLoading } = apiQuery.backend.exchangeOpenRouterKey.useQuery({ code: props.openRouterCode || '' }, {
    enabled: !!props.openRouterCode,
    refetchOnWindowFocus: false,
    staleTime: Infinity,
  });

  // derived state
  const isErrorInput = !props.openRouterCode;
  const openRouterKey = data?.key ?? undefined;
  const isSuccess = !!openRouterKey;

  // Success: save the key and redirect to the chat app
  React.useEffect(() => {
    if (!isSuccess)
      return;

    // 1. Save the key as the client key
    useModelsStore.getState().setOpenRoutersKey(openRouterKey);

    // 2. Navigate to the chat app
    navigateToIndex(true).then(() => openLayoutModelsSetup());

  }, [isSuccess, openRouterKey]);

  return (
    <Box sx={{
      flexGrow: 1,
      backgroundColor: 'background.level1',
      overflowY: 'auto',
      display: 'flex', justifyContent: 'center',
      p: { xs: 3, md: 6 },
    }}>

      <Box sx={{
        // my: 'auto',
        display: 'flex', flexDirection: 'column', alignItems: 'center',
        gap: 4,
      }}>

        <Typography level='title-lg'>
          Welcome Back
        </Typography>

        {isLoading && <Typography level='body-sm'>Loading...</Typography>}

        {isErrorInput && <InlineError error='There was an issue retrieving the code from OpenRouter.' />}

        {isError && <InlineError error={error} />}

        {data && (
          <Typography level='body-md'>
            Success! You can now close this window.
          </Typography>
        )}

      </Box>

    </Box>
  );
}


/**
 * This page will be invoked by OpenRouter as a Callback
 *
 * Docs: https://openrouter.ai/docs#oauth
 * Example URL: https://localhost:3000/link/callback_openrouter?code=SomeCode
 */
export default function Page() {

  // get the 'code=...' from the URL
  const { query } = useRouter();
  const { code: openRouterCode } = query;

  return (
    <AppLayout suspendAutoModelsSetup>
      <CallbackOpenRouterPage openRouterCode={openRouterCode as (string | undefined)} />
    </AppLayout>
  );
}
Vendored file (-21): diff suppressed because one or more lines are too long.
File diff suppressed because one or more lines are too long.
@@ -15,8 +15,7 @@ import { useChatLLMDropdown } from '../chat/components/applayout/useLLMDropdown'

 import { EXPERIMENTAL_speakTextStream } from '~/modules/elevenlabs/elevenlabs.client';
 import { SystemPurposeId, SystemPurposes } from '../../data';
-import { VChatMessageIn } from '~/modules/llms/transports/chatGenerate';
-import { streamChat } from '~/modules/llms/transports/streamChat';
+import { llmStreamingChatGenerate, VChatMessageIn } from '~/modules/llms/llm.client';
 import { useElevenLabsVoiceDropdown } from '~/modules/elevenlabs/useElevenLabsVoiceDropdown';

 import { Link } from '~/common/components/Link';

@@ -216,7 +215,7 @@ export function CallUI(props: {
     responseAbortController.current = new AbortController();
     let finalText = '';
     let error: any | null = null;
-    streamChat(chatLLMId, callPrompt, responseAbortController.current.signal, (updatedMessage: Partial<DMessage>) => {
+    llmStreamingChatGenerate(chatLLMId, callPrompt, null, null, responseAbortController.current.signal, (updatedMessage: Partial<DMessage>) => {
       const text = updatedMessage.text?.trim();
       if (text) {
         finalText = text;
@@ -3,7 +3,7 @@ import * as React from 'react';
 import { Chip, ColorPaletteProp, VariantProp } from '@mui/joy';
 import { SxProps } from '@mui/joy/styles/types';

-import { VChatMessageIn } from '~/modules/llms/transports/chatGenerate';
+import type { VChatMessageIn } from '~/modules/llms/llm.client';


 export function CallMessage(props: {
@@ -23,14 +23,26 @@ function AppBarLLMDropdown(props: {
   const llmItems: DropdownItems = {};
   let prevSourceId: DModelSourceId | null = null;
   for (const llm of props.llms) {
-    if (!llm.hidden || llm.id === props.chatLlmId) {
-      if (!prevSourceId || llm.sId !== prevSourceId) {
-        if (prevSourceId)
-          llmItems[`sep-${llm.id}`] = { type: 'separator', title: llm.sId };
-        prevSourceId = llm.sId;
-      }
-      llmItems[llm.id] = { title: llm.label };
-    }
+
+    // filter-out hidden models
+    if (!(!llm.hidden || llm.id === props.chatLlmId))
+      continue;
+
+    // add separators when changing sources
+    if (!prevSourceId || llm.sId !== prevSourceId) {
+      if (prevSourceId)
+        llmItems[`sep-${llm.id}`] = {
+          type: 'separator',
+          title: llm.sId,
+        };
+      prevSourceId = llm.sId;
+    }
+
+    // add the model item
+    llmItems[llm.id] = {
+      title: llm.label,
+      // icon: llm.id.startsWith('some vendor') ? <VendorIcon /> : undefined,
+    };
   }

   const handleChatLLMChange = (_event: any, value: DLLMId | null) => value && props.setChatLlmId(value);
@@ -331,7 +331,8 @@ export function Composer(props: {
|
||||
|
||||
const handleOverlayDragOver = React.useCallback((e: React.DragEvent) => {
|
||||
eatDragEvent(e);
|
||||
// e.dataTransfer.dropEffect = 'copy';
|
||||
// this makes sure we don't "transfer" (or move) the attachment, but we tell the sender we'll copy it
|
||||
e.dataTransfer.dropEffect = 'copy';
|
||||
}, [eatDragEvent]);
|
||||
|
||||
const handleOverlayDrop = React.useCallback(async (event: React.DragEvent) => {
|
||||
|
||||
@@ -254,7 +254,7 @@ export async function attachmentPerformConversion(attachment: Readonly<Attachmen
|
||||
case 'rich-text-table':
|
||||
let mdTable: string;
|
||||
try {
|
||||
mdTable = htmlTableToMarkdown(input.altData!);
|
||||
mdTable = htmlTableToMarkdown(input.altData!, false);
|
||||
} catch (error) {
|
||||
// fallback to text/plain
|
||||
mdTable = inputDataToString(input.data);
|
||||
|
||||
@@ -167,6 +167,8 @@ function explainErrorInMessage(text: string, isAssistant: boolean, modelId?: str
|
||||
make sure the usage is under <Link noLinkStyle href='https://platform.openai.com/account/billing/limits' target='_blank'>the limits</Link>.
|
||||
</>;
|
||||
}
|
||||
// else
|
||||
// errorMessage = <>{text || 'Unknown error'}</>;
|
||||
|
||||
return { errorMessage, isAssistantError };
|
||||
}
|
||||
|
||||
@@ -2,8 +2,8 @@ import { DLLMId } from '~/modules/llms/store-llms';
|
||||
import { SystemPurposeId } from '../../../data';
|
||||
import { autoSuggestions } from '~/modules/aifn/autosuggestions/autoSuggestions';
|
||||
import { autoTitle } from '~/modules/aifn/autotitle/autoTitle';
|
||||
import { llmStreamingChatGenerate } from '~/modules/llms/llm.client';
|
||||
import { speakText } from '~/modules/elevenlabs/elevenlabs.client';
|
||||
import { streamChat } from '~/modules/llms/transports/streamChat';
|
||||
|
||||
import { DMessage, useChatStore } from '~/common/state/store-chats';
|
||||
|
||||
@@ -63,7 +63,7 @@ async function streamAssistantMessage(
|
||||
const messages = history.map(({ role, text }) => ({ role, content: text }));
|
||||
|
||||
try {
|
||||
await streamChat(llmId, messages, abortSignal,
|
||||
await llmStreamingChatGenerate(llmId, messages, null, null, abortSignal,
|
||||
(updatedMessage: Partial<DMessage>) => {
|
||||
// update the message in the store (and thus schedule a re-render)
|
||||
editMessage(updatedMessage);
|
||||
|
||||
@@ -78,14 +78,14 @@ export function AppNews() {
|
||||
|
||||
{!!news && <Container disableGutters maxWidth='sm'>
|
||||
{news?.map((ni, idx) => {
|
||||
const firstCard = idx === 0;
|
||||
// const firstCard = idx === 0;
|
||||
const hasCardAfter = news.length < NewsItems.length;
|
||||
const showExpander = hasCardAfter && (idx === news.length - 1);
|
||||
const addPadding = false; //!firstCard; // || showExpander;
|
||||
return <Card key={'news-' + idx} sx={{ mb: 2, minHeight: 32 }}>
|
||||
<CardContent sx={{ position: 'relative', pr: addPadding ? 4 : 0 }}>
|
||||
<Box sx={{ display: 'flex', alignItems: 'center', justifyContent: 'space-between', gap: 1 }}>
|
||||
<GoodTooltip title={ni.versionName || null} placement='top-start'>
|
||||
<Box sx={{ display: 'flex', alignItems: 'center', justifyContent: 'space-between', gap: 0 }}>
|
||||
<GoodTooltip title={ni.versionName ? `${ni.versionName} ${ni.versionMoji || ''}` : null} placement='top-start'>
|
||||
<Typography level='title-sm' component='div' sx={{ flexGrow: 1 }}>
|
||||
{ni.text ? ni.text : ni.versionName ? `${ni.versionCode} · ${ni.versionName}` : `Version ${ni.versionCode}:`}
|
||||
</Typography>
|
||||
|
||||
@@ -10,10 +10,10 @@ import { platformAwareKeystrokes } from '~/common/components/KeyStroke';
|
||||
|
||||
|
||||
// update this variable every time you want to broadcast a new version to clients
|
||||
export const incrementalVersion: number = 8;
|
||||
export const incrementalVersion: number = 9;
|
||||
|
||||
const B = (props: { href?: string, children: React.ReactNode }) => {
|
||||
const boldText = <Typography color={!!props.href ? 'primary' : 'warning'} sx={{ fontWeight: 600 }}>{props.children}</Typography>;
|
||||
const boldText = <Typography color={!!props.href ? 'primary' : 'neutral'} sx={{ fontWeight: 600 }}>{props.children}</Typography>;
|
||||
return props.href ?
|
||||
<Link href={props.href + clientUtmSource()} target='_blank' sx={{ /*textDecoration: 'underline'*/ }}>{boldText} <LaunchIcon sx={{ ml: 1 }} /></Link> :
|
||||
boldText;
|
||||
@@ -27,11 +27,12 @@ const RIssues = `${OpenRepo}/issues`;
|
||||
export const newsCallout =
|
||||
<Card>
|
||||
<CardContent sx={{ gap: 2 }}>
|
||||
<Typography level='h4'>
|
||||
<Typography level='title-lg'>
|
||||
Open Roadmap
|
||||
</Typography>
|
||||
<Typography>
|
||||
The roadmap is officially out. For the first time you get a look at what's brewing, up and coming, and get a chance to pick up cool features!
|
||||
<Typography level='body-md'>
|
||||
Take a peek at our roadmap to see what's in the pipeline.
|
||||
Discover upcoming features and let us know what excites you the most!
|
||||
</Typography>
|
||||
<Grid container spacing={1}>
|
||||
<Grid xs={12} sm={7}>
|
||||
@@ -39,7 +40,7 @@ export const newsCallout =
|
||||
fullWidth variant='soft' color='primary' endDecorator={<LaunchIcon />}
|
||||
component={Link} href={OpenProject} noLinkStyle target='_blank'
|
||||
>
|
||||
Explore the Roadmap
|
||||
Explore
|
||||
</Button>
|
||||
</Grid>
|
||||
<Grid xs={12} sm={5} sx={{ display: 'flex', flexAlign: 'center', justifyContent: 'center' }}>
|
||||
@@ -66,10 +67,28 @@ export const NewsItems: NewsItem[] = [
|
||||
// phone calls
|
||||
],
|
||||
},*/
|
||||
{
|
||||
versionCode: '1.8.0',
|
||||
versionName: 'To The Moon And Back',
|
||||
versionMoji: '🚀🌕🔙❤️',
|
||||
versionDate: new Date('2023-12-20T09:30:00Z'),
|
||||
items: [
|
||||
{ text: <><B href={RIssues + '/275'}>Google Gemini</B> models support</> },
|
||||
{ text: <><B href={RIssues + '/273'}>Mistral Platform</B> support</> },
|
||||
{ text: <><B href={RIssues + '/270'}>Ollama chats</B> perfection</> },
|
||||
{ text: <>Custom <B href={RIssues + '/280'}>diagrams instructions</B> (@joriskalz)</> },
|
||||
{ text: <><B>Single-Tab</B> mode, enhances data integrity and prevents DB corruption</> },
|
||||
{ text: <>Updated Ollama (v0.1.17) and OpenRouter models</> },
|
||||
{ text: <>More: fixed ⌘ shortcuts on Mac</> },
|
||||
{ text: <><Link href='https://big-agi.com'>Website</Link>: official downloads</> },
|
||||
{ text: <>Easier Vercel deployment, documented <Link href='https://github.com/enricoros/big-AGI/issues/276#issuecomment-1858591483'>network troubleshooting</Link></>, dev: true },
|
||||
],
|
||||
},
|
||||
{
|
||||
versionCode: '1.7.0',
|
||||
versionName: 'Attachment Theory',
|
||||
versionDate: new Date('2023-12-10T12:00:00Z'), // new Date().toISOString()
|
||||
// versionDate: new Date('2023-12-11T06:00:00Z'), // 1.7.3
|
||||
versionDate: new Date('2023-12-10T12:00:00Z'), // 1.7.0
|
||||
items: [
|
||||
{ text: <>Redesigned <B href={RIssues + '/251'}>attachments system</B>: drag, paste, link, snap, images, text, pdfs</> },
|
||||
{ text: <>Desktop <B href={RIssues + '/253'}>webcam access</B> for direct image capture (Labs option)</> },
|
||||
@@ -158,6 +177,7 @@ export const NewsItems: NewsItem[] = [
|
||||
interface NewsItem {
|
||||
versionCode: string;
|
||||
versionName?: string;
|
||||
versionMoji?: string;
|
||||
versionDate?: Date;
|
||||
text?: string | React.JSX.Element;
|
||||
items?: {
|
||||
|
||||
@@ -1,14 +1,13 @@
|
||||
import * as React from 'react';
|
||||
import { shallow } from 'zustand/shallow';
|
||||
import { useRouter } from 'next/router';
|
||||
|
||||
import { navigateToNews } from '~/common/app.routes';
|
||||
import { useAppStateStore } from '~/common/state/store-appstate';
|
||||
|
||||
import { incrementalVersion } from './news.data';
|
||||
|
||||
|
||||
export function useShowNewsOnUpdate() {
|
||||
const { push: routerPush } = useRouter();
|
||||
const { usageCount, lastSeenNewsVersion } = useAppStateStore(state => ({
|
||||
usageCount: state.usageCount,
|
||||
lastSeenNewsVersion: state.lastSeenNewsVersion,
|
||||
@@ -17,9 +16,9 @@ export function useShowNewsOnUpdate() {
|
||||
const isNewsOutdated = (lastSeenNewsVersion || 0) < incrementalVersion;
|
||||
if (isNewsOutdated && usageCount > 2) {
|
||||
// Disable for now
|
||||
void routerPush('/news');
|
||||
void navigateToNews();
|
||||
}
|
||||
}, [lastSeenNewsVersion, routerPush, usageCount]);
|
||||
}, [lastSeenNewsVersion, usageCount]);
|
||||
}
|
||||
|
||||
export function useMarkNewsAsSeen() {
|
||||
|
||||
@@ -1,7 +1,7 @@
|
||||
import * as React from 'react';
|
||||
|
||||
import { DLLMId, useModelsStore } from '~/modules/llms/store-llms';
|
||||
import { callChatGenerate, VChatMessageIn } from '~/modules/llms/transports/chatGenerate';
|
||||
import { llmChatGenerateOrThrow, VChatMessageIn } from '~/modules/llms/llm.client';
|
||||
|
||||
|
||||
export interface LLMChainStep {
|
||||
@@ -80,7 +80,7 @@ export function useLLMChain(steps: LLMChainStep[], llmId: DLLMId | undefined, ch
|
||||
_chainAbortController.signal.addEventListener('abort', globalToStepListener);
|
||||
|
||||
// LLM call
|
||||
callChatGenerate(llmId, llmChatInput, chain.overrideResponseTokens)
|
||||
llmChatGenerateOrThrow(llmId, llmChatInput, null, null, chain.overrideResponseTokens)
|
||||
.then(({ content }) => {
|
||||
stepDone = true;
|
||||
if (!stepAbortController.signal.aborted)
|
||||
|
||||
@@ -7,21 +7,37 @@
|
||||
import Router from 'next/router';
|
||||
|
||||
import type { DConversationId } from '~/common/state/store-chats';
|
||||
import { isBrowser } from './util/pwaUtils';
|
||||
|
||||
|
||||
export const ROUTE_INDEX = '/';
|
||||
export const ROUTE_APP_CHAT = '/';
|
||||
export const ROUTE_APP_LINK_CHAT = '/link/chat/:linkId';
|
||||
export const ROUTE_APP_NEWS = '/news';
|
||||
const ROUTE_CALLBACK_OPENROUTER = '/link/callback_openrouter';
|
||||
|
||||
export const getIndexLink = () => ROUTE_INDEX;
|
||||
|
||||
// Get Paths
|
||||
|
||||
export const getCallbackUrl = (source: 'openrouter') => {
|
||||
const callbackUrl = new URL(window.location.href);
|
||||
switch (source) {
|
||||
case 'openrouter':
|
||||
callbackUrl.pathname = ROUTE_CALLBACK_OPENROUTER;
|
||||
break;
|
||||
default:
|
||||
throw new Error(`Unknown source: ${source}`);
|
||||
}
|
||||
return callbackUrl.toString();
|
||||
};
|
||||
|
||||
export const getChatLinkRelativePath = (chatLinkId: string) => ROUTE_APP_LINK_CHAT.replace(':linkId', chatLinkId);
|
||||
|
||||
const navigateFn = (path: string) => (replace?: boolean): Promise<boolean> =>
|
||||
Router[replace ? 'replace' : 'push'](path);
|
||||
|
||||
/// Simple Navigation
|
||||
|
||||
export const navigateToIndex = navigateFn(ROUTE_INDEX);
|
||||
|
||||
export const navigateToChat = async (conversationId?: DConversationId) => {
|
||||
if (conversationId) {
|
||||
await Router.push(
|
||||
@@ -41,6 +57,15 @@ export const navigateToNews = navigateFn(ROUTE_APP_NEWS);
|
||||
|
||||
export const navigateBack = Router.back;
|
||||
|
||||
export const reloadPage = () => isBrowser && window.location.reload();
|
||||
|
||||
function navigateFn(path: string) {
|
||||
return (replace?: boolean): Promise<boolean> => Router[replace ? 'replace' : 'push'](path);
|
||||
}
|
||||
|
||||
|
||||
/// Launch Apps
|
||||
|
||||
export interface AppCallQueryParams {
|
||||
conversationId: string;
|
||||
personaId: string;
|
||||
|
||||
@@ -46,6 +46,7 @@ export const appTheme = extendTheme({
|
||||
text: {
|
||||
icon: 'var(--joy-palette-neutral-700)', // <IconButton color='neutral' /> icon color
|
||||
secondary: 'var(--joy-palette-neutral-800)', // increase contrast a bit
|
||||
// tertiary: 'var(--joy-palette-neutral-700)', // increase contrast a bit
|
||||
},
|
||||
// popup [white] > surface [50] > level1 [100] > level2 [200] > level3 [300] > body [white -> 400]
|
||||
background: {
|
||||
|
||||
@@ -23,7 +23,7 @@ export function GoodModal(props: {
|
||||
const showBottomClose = !!props.onClose && props.hideBottomClose !== true;
|
||||
return (
|
||||
<Modal open={props.open} onClose={props.onClose}>
|
||||
<ModalOverflow>
|
||||
<ModalOverflow sx={{p:1}}>
|
||||
<ModalDialog
|
||||
sx={{
|
||||
minWidth: { xs: 360, sm: 500, md: 600, lg: 700 },
|
||||
|
||||
@@ -0,0 +1,10 @@
|
||||
import * as React from 'react';
|
||||
|
||||
import { SvgIcon } from '@mui/joy';
|
||||
import { SxProps } from '@mui/joy/styles/types';
|
||||
|
||||
export function MistralIcon(props: { sx?: SxProps }) {
|
||||
return <SvgIcon viewBox='0 0 24 24' width='24' height='24' strokeWidth={0} stroke='none' fill='currentColor' strokeLinecap='butt' strokeLinejoin='miter' {...props}>
|
||||
<path d='m 2,2 v 4 4 V 14 v 4 4 h 4 v -4 -4 h 4 v 4 h 4 v -4 h 4 v 4 4 h 4 v -4 -4 -4 -4 V 2 h -4 v 4 h -4 v 4 h -4 v -4 H 6 V 2 Z' />
|
||||
</SvgIcon>;
|
||||
}
|
||||
@@ -21,8 +21,13 @@ export const useGlobalShortcut = (shortcutKey: string | false, useCtrl: boolean,
|
||||
if (!shortcutKey) return;
|
||||
const lcShortcut = shortcutKey.toLowerCase();
|
||||
const handleKeyDown = (event: KeyboardEvent) => {
|
||||
if ((useCtrl === event.ctrlKey) && (useShift === event.shiftKey) && (useAlt === event.altKey)
|
||||
&& event.key.toLowerCase() === lcShortcut) {
|
||||
const isCtrlOrCmd = (event.ctrlKey && !event.metaKey) || (event.metaKey && !event.ctrlKey);
|
||||
if (
|
||||
(useCtrl === isCtrlOrCmd) &&
|
||||
(useShift === event.shiftKey) &&
|
||||
(useAlt === event.altKey) &&
|
||||
event.key.toLowerCase() === lcShortcut
|
||||
) {
|
||||
event.preventDefault();
|
||||
event.stopPropagation();
|
||||
callback();
|
||||
@@ -46,9 +51,10 @@ export const useGlobalShortcuts = (shortcuts: GlobalShortcutItem[]) => {
|
||||
React.useEffect(() => {
|
||||
const handleKeyDown = (event: KeyboardEvent) => {
|
||||
for (const [key, useCtrl, useShift, useAlt, action] of shortcuts) {
|
||||
const isCtrlOrCmd = (event.ctrlKey && !event.metaKey) || (event.metaKey && !event.ctrlKey);
|
||||
if (
|
||||
key &&
|
||||
(useCtrl === event.ctrlKey) &&
|
||||
(useCtrl === isCtrlOrCmd) &&
|
||||
(useShift === event.shiftKey) &&
|
||||
(useAlt === event.altKey) &&
|
||||
event.key.toLowerCase() === key.toLowerCase()
|
||||
|
||||
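The shortcut hunks above replace the plain `event.ctrlKey` comparison with an `isCtrlOrCmd` expression, which is why ⌘ shortcuts now work on Mac. A standalone sketch of that predicate (the function wrapper is hypothetical; the diff inlines the expression): exactly one of Ctrl (Windows/Linux) or Cmd/meta (macOS) must be held, so a shortcut declared with `useCtrl: true` matches either platform's primary modifier without also firing when both keys are down.

```typescript
// Sketch of the modifier check introduced in the diff; the named helper is illustrative.
function isCtrlOrCmd(e: { ctrlKey: boolean; metaKey: boolean }): boolean {
  // true only when exactly one of the two "primary" modifiers is held
  return (e.ctrlKey && !e.metaKey) || (e.metaKey && !e.ctrlKey);
}

console.log(isCtrlOrCmd({ ctrlKey: true, metaKey: false }));  // true: Ctrl on Windows/Linux
console.log(isCtrlOrCmd({ ctrlKey: false, metaKey: true }));  // true: ⌘ on macOS
console.log(isCtrlOrCmd({ ctrlKey: true, metaKey: true }));   // false: both held at once
```

Requiring *exactly one* modifier also prevents a Ctrl+⌘ chord from accidentally triggering a plain Ctrl shortcut.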
@@ -0,0 +1,95 @@
import * as React from 'react';

/**
 * The AloneDetector class checks if the current client is the only one present for a given app. It uses
 * BroadcastChannel to talk to other clients. If no other clients reply within a short time, it assumes it's
 * the only one and tells the caller.
 */
class AloneDetector {
private readonly clientId: string;
private readonly broadcastChannel: BroadcastChannel;

private aloneCallback: ((isAlone: boolean) => void) | null;
private aloneTimerId: number | undefined;

constructor(channelName: string, onAlone: (isAlone: boolean) => void) {

this.clientId = Math.random().toString(36).substring(2, 10);
this.aloneCallback = onAlone;

this.broadcastChannel = new BroadcastChannel(channelName);
this.broadcastChannel.onmessage = this.handleIncomingMessage;

}

public onUnmount(): void {
// close channel
this.broadcastChannel.onmessage = null;
this.broadcastChannel.close();

// clear timeout
if (this.aloneTimerId)
clearTimeout(this.aloneTimerId);

this.aloneTimerId = undefined;
this.aloneCallback = null;
}

public checkIfAlone(): void {

// triggers other clients
this.broadcastChannel.postMessage({ type: 'CHECK', sender: this.clientId });

// if no response within 500ms, assume this client is alone
this.aloneTimerId = window.setTimeout(() => {
this.aloneTimerId = undefined;
this.aloneCallback?.(true);
}, 500);

}

private handleIncomingMessage = (event: MessageEvent): void => {

// ignore self messages
if (event.data.sender === this.clientId) return;

switch (event.data.type) {

case 'CHECK':
this.broadcastChannel.postMessage({ type: 'ALIVE', sender: this.clientId });
break;

case 'ALIVE':
// received an ALIVE message, tell the client they're not alone
if (this.aloneTimerId) {
clearTimeout(this.aloneTimerId);
this.aloneTimerId = undefined;
}
this.aloneCallback?.(false);
this.aloneCallback = null;
break;

}
};
}


/**
 * React hook that checks whether the current tab is the only one open for a specific channel.
 *
 * @param {string} channelName - The name of the BroadcastChannel to communicate on.
 * @returns {boolean | null} - True if the current tab is alone, false if not, or null before the check completes.
 */
export function useSingleTabEnforcer(channelName: string): boolean | null {
const [isAlone, setIsAlone] = React.useState<boolean | null>(null);

React.useEffect(() => {
const tabManager = new AloneDetector(channelName, setIsAlone);
tabManager.checkIfAlone();
return () => {
tabManager.onUnmount();
};
}, [channelName]);

return isAlone;
}
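The new `useSingleTabEnforcer` file above implements a CHECK/ALIVE handshake over a `BroadcastChannel`: a new tab broadcasts `CHECK`, any existing tab answers `ALIVE`, and a 500 ms timeout decides "alone". A rough sketch of that protocol, simulated with a synchronous in-memory bus (the `Bus`/`Client` names are illustrative; the bus stands in for both `BroadcastChannel` and the timeout so it runs outside a browser):

```typescript
// Minimal simulation of the CHECK/ALIVE single-tab handshake described above.
type Msg = { type: 'CHECK' | 'ALIVE'; sender: string };

class Bus {
  private clients: Client[] = [];
  register(c: Client) { this.clients.push(c); }
  post(msg: Msg) {
    // BroadcastChannel semantics: every client except the sender receives the message
    for (const c of this.clients) if (c.id !== msg.sender) c.receive(msg);
  }
}

class Client {
  alone: boolean | null = null;
  constructor(readonly id: string, private bus: Bus) { bus.register(this); }
  checkIfAlone() {
    this.alone = true; // assume alone; the real code waits 500ms before deciding this
    this.bus.post({ type: 'CHECK', sender: this.id });
  }
  receive(msg: Msg) {
    if (msg.type === 'CHECK') this.bus.post({ type: 'ALIVE', sender: this.id });
    else if (msg.type === 'ALIVE') this.alone = false; // someone replied: not alone
  }
}

const bus = new Bus();
const firstTab = new Client('tab-1', bus);
firstTab.checkIfAlone();
console.log(firstTab.alone);  // true: nobody answered the CHECK

const secondTab = new Client('tab-2', bus);
secondTab.checkIfAlone();
console.log(secondTab.alone); // false: tab-1 replied ALIVE
```

The synchronous bus removes the race the real code handles with the timer: in the browser, "alone" is only concluded after no `ALIVE` arrives within the window.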
@@ -9,6 +9,7 @@ export type DropdownItems = Record<string, {
title: string,
symbol?: string,
type?: 'separator'
icon?: React.ReactNode,
}>;


@@ -71,20 +72,25 @@ export function AppBarDropdown<TValue extends string>(props: {
{!!props.prependOption && Object.keys(props.items).length >= 1 && <Divider />}

<Box sx={{ overflowY: 'auto' }}>
{Object.keys(props.items).map((key: string, idx: number) => <React.Fragment key={'key-' + idx}>
{props.items[key].type === 'separator'
? <ListDivider />
: <Option value={key} sx={{ whiteSpace: 'nowrap' }}>
{props.showSymbols && <ListItemDecorator sx={{ fontSize: 'xl' }}>{props.items[key]?.symbol + ' '}</ListItemDecorator>}
{props.items[key].title}
{Object.keys(props.items).map((key: string, idx: number) => {
const item = props.items[key];

if (item.type === 'separator')
return <ListDivider key={'key-' + idx} />;

return (
<Option key={'key-' + idx} value={key} sx={{ whiteSpace: 'nowrap' }}>
{props.showSymbols && <ListItemDecorator sx={{ fontSize: 'xl' }}>{item?.symbol + ' '}</ListItemDecorator>}
{props.showSymbols && !!item.icon && <ListItemDecorator>{item?.icon}</ListItemDecorator>}
{item.title}
{/*{key === props.value && (*/}
{/*  <IconButton variant='soft' onClick={() => alert('aa')} sx={{ ml: 'auto' }}>*/}
{/*    <SettingsIcon color='success' />*/}
{/*  </IconButton>*/}
{/*)}*/}
</Option>
}
</React.Fragment>)}
);
})}
</Box>

{!!props.appendOption && Object.keys(props.items).length >= 1 && <ListDivider />}

@@ -3,7 +3,7 @@ import { shallow } from 'zustand/shallow';

import { Box, Container } from '@mui/joy';

import { ModelsModal } from '../../apps/models-modal/ModelsModal';
import { ModelsModal } from '~/modules/llms/models-modal/ModelsModal';
import { SettingsModal } from '../../apps/settings-modal/SettingsModal';
import { ShortcutsModal } from '../../apps/settings-modal/ShortcutsModal';


@@ -0,0 +1,42 @@
import * as React from 'react';

import { Button, Sheet, Typography } from '@mui/joy';

import { Brand } from '../app.config';
import { reloadPage } from '../app.routes';
import { useSingleTabEnforcer } from '../components/useSingleTabEnforcer';


export const ProviderSingleTab = (props: { children: React.ReactNode }) => {

// state
const isSingleTab = useSingleTabEnforcer('big-agi-tabs');

// pass-through until we know for sure that other tabs are open
if (isSingleTab === null || isSingleTab)
return props.children;


return (
<Sheet
variant='solid'
invertedColors
sx={{
flexGrow: 1,
display: 'flex', flexDirection: { xs: 'column', md: 'row' }, justifyContent: 'center', alignItems: 'center', gap: 2,
p: 3,
}}
>

<Typography>
It looks like {Brand.Title.Base} is already running in another tab or window.
To continue here, please close the other instance first.
</Typography>

<Button onClick={reloadPage}>
Reload
</Button>

</Sheet>
);
};
@@ -2,11 +2,13 @@
 * @fileoverview Utility functions for Markdown.
 */

import { isBrowser } from '~/common/util/pwaUtils';

/**
 * Quick and dirty conversion of HTML tables to Markdown tables.
 * Big plus: doesn't require any dependencies.
 */
export function htmlTableToMarkdown(html: string): string {
export function htmlTableToMarkdown(html: string, includeInvisible: boolean): string {
const parser = new DOMParser();
const doc = parser.parseFromString(html, 'text/html');
const table = doc.querySelector('table');
@@ -16,20 +18,53 @@ export function htmlTableToMarkdown(html: string): string {
const headerCells = table.querySelectorAll('thead th');
if (headerCells.length > 0) {
const headerRow = '| ' + Array.from(headerCells)
.map(cell => cell.textContent?.trim() || '')
.join(' | ') + '| ';
.map(cell => getTextWithSpaces(cell, includeInvisible).trim())
.join(' | ') + ' |';
markdownRows.push(headerRow);
markdownRows.push('|:' + Array(headerCells.length).fill('-').join('|:') + '|');
markdownRows.push('|:' + Array(headerCells.length).fill('---').join('|:') + '|');
}

const bodyRows = table.querySelectorAll('tbody tr');
for (const row of Array.from(bodyRows)) {
const rowCells = row.querySelectorAll('td');
const markdownRow = '| ' + Array.from(rowCells)
.map(cell => cell.textContent?.trim() || '')
.map(cell => getTextWithSpaces(cell, includeInvisible).trim())
.join(' | ') + ' |';
markdownRows.push(markdownRow);
}

return markdownRows.join('\n');
}

// Helper function to get text with spaces, ignoring hidden elements
function getTextWithSpaces(node: Node, includeInvisible: boolean): string {
let text = '';
node.childNodes.forEach(child => {
if (child.nodeType === Node.TEXT_NODE)
text += child.textContent;
else if (child.nodeType === Node.ELEMENT_NODE)
if (includeInvisible || isVisible(child as Element))
text += ' ' + getTextWithSpaces(child, includeInvisible) + ' ';
});
return text;
}

// Helper function to determine if an element is visible
function isVisible(element: Element): boolean {
if (!isBrowser) return true;

// if the cell is hidden, don't include it
const style = window.getComputedStyle(element);
if (style.display === 'none' || style.visibility === 'hidden')
return false;

// Check for common classes used to hide content or indicate tooltip/popover content.
// You may need to add more classes here based on your actual HTML/CSS.
const ignoredClasses = ['hidden', 'group-hover', 'tooltip', 'pointer-events-none', 'opacity-0'];
for (const ignoredClass of ignoredClasses)
if (element.classList.contains(ignoredClass))
return false;

// Otherwise, the element is considered visible
return true;
}
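The `htmlTableToMarkdown` hunk above fixes two output bugs: the header row previously ended with `'| '` instead of `' |'`, and the alignment row used single dashes, which stricter Markdown parsers reject as a separator. A small sketch of the output shape the fixed code produces (the `tableToMarkdown` helper here is illustrative and works from plain arrays instead of a DOM):

```typescript
// Illustrative helper showing the Markdown table layout emitted after the fix.
function tableToMarkdown(header: string[], rows: string[][]): string {
  const lines: string[] = [];
  // header row, now correctly terminated with ' |'
  lines.push('| ' + header.join(' | ') + ' |');
  // alignment row, now using '---' per column: |:---|:---|
  lines.push('|:' + Array(header.length).fill('---').join('|:') + '|');
  for (const row of rows)
    lines.push('| ' + row.join(' | ') + ' |');
  return lines.join('\n');
}

console.log(tableToMarkdown(['Name', 'Role'], [['Ada', 'Engineer']]));
// | Name | Role |
// |:---|:---|
// | Ada | Engineer |
```

The `|:---|` form also pins left alignment, which renders consistently across CommonMark/GFM-style table extensions.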
@@ -14,7 +14,7 @@ export async function pdfToText(pdfBuffer: ArrayBuffer): Promise<string> {
|
||||
const { getDocument, GlobalWorkerOptions } = await import('pdfjs-dist');
|
||||
|
||||
// Set the worker script path
|
||||
GlobalWorkerOptions.workerSrc = '/workers/pdf.worker.min.js';
|
||||
GlobalWorkerOptions.workerSrc = '/workers/pdf.worker.min.mjs';
|
||||
|
||||
const pdf = await getDocument(pdfBuffer).promise;
|
||||
const textPages: string[] = []; // Initialize an array to hold text from all pages
|
||||
|
||||
@@ -1,4 +1,4 @@
|
||||
import { callChatGenerateWithFunctions, VChatFunctionIn } from '~/modules/llms/transports/chatGenerate';
|
||||
import { llmChatGenerateOrThrow, VChatFunctionIn } from '~/modules/llms/llm.client';
|
||||
import { useModelsStore } from '~/modules/llms/store-llms';
|
||||
|
||||
import { useChatStore } from '~/common/state/store-chats';
|
||||
@@ -71,7 +71,7 @@ export function autoSuggestions(conversationId: string, assistantMessageId: stri
|
||||
|
||||
// Follow-up: Question
|
||||
if (suggestQuestions) {
|
||||
// callChatGenerateWithFunctions(funcLLMId, [
|
||||
// llmChatGenerateOrThrow(funcLLMId, [
|
||||
// { role: 'system', content: systemMessage.text },
|
||||
// { role: 'user', content: userMessage.text },
|
||||
// { role: 'assistant', content: assistantMessageText },
|
||||
@@ -83,15 +83,18 @@ export function autoSuggestions(conversationId: string, assistantMessageId: stri
|
||||
|
||||
// Follow-up: Auto-Diagrams
|
||||
if (suggestDiagrams) {
|
||||
void callChatGenerateWithFunctions(funcLLMId, [
|
||||
void llmChatGenerateOrThrow(funcLLMId, [
|
||||
{ role: 'system', content: systemMessage.text },
|
||||
{ role: 'user', content: userMessage.text },
|
||||
{ role: 'assistant', content: assistantMessageText },
|
||||
], [suggestPlantUMLFn], 'draw_plantuml_diagram',
|
||||
).then(chatResponse => {
|
||||
|
||||
if (!('function_arguments' in chatResponse))
|
||||
return;
|
||||
|
||||
// parse the output PlantUML string, if any
|
||||
const functionArguments = chatResponse?.function_arguments ?? null;
|
||||
const functionArguments = chatResponse.function_arguments ?? null;
|
||||
if (functionArguments) {
|
||||
const { code, type }: { code: string, type: string } = functionArguments as any;
|
||||
if (code && type) {
|
||||
@@ -105,6 +108,8 @@ export function autoSuggestions(conversationId: string, assistantMessageId: stri
|
||||
editMessage(conversationId, assistantMessageId, { text: assistantMessageText }, false);
|
||||
}
|
||||
}
|
||||
}).catch(err => {
|
||||
console.error('autoSuggestions::diagram:', err);
|
||||
});
|
||||
}
|
||||
|
||||
|
||||
@@ -1,4 +1,4 @@
|
||||
import { callChatGenerate } from '~/modules/llms/transports/chatGenerate';
|
||||
import { llmChatGenerateOrThrow } from '~/modules/llms/llm.client';
|
||||
import { useModelsStore } from '~/modules/llms/store-llms';
|
||||
|
||||
import { useChatStore } from '~/common/state/store-chats';
|
||||
@@ -27,7 +27,7 @@ export function autoTitle(conversationId: string) {
|
||||
});
|
||||
|
||||
// LLM
|
||||
void callChatGenerate(fastLLMId, [
|
||||
void llmChatGenerateOrThrow(fastLLMId, [
|
||||
{ role: 'system', content: `You are an AI conversation titles assistant who specializes in creating expressive yet few-words chat titles.` },
|
||||
{
|
||||
role: 'user', content:
|
||||
@@ -39,7 +39,7 @@ export function autoTitle(conversationId: string) {
|
||||
historyLines.join('\n') +
|
||||
'```\n',
|
||||
},
|
||||
]).then(chatResponse => {
|
||||
], null, null).then(chatResponse => {
|
||||
|
||||
const title = chatResponse?.content
|
||||
?.trim()
|
||||
|
||||
@@ -1,6 +1,6 @@
|
||||
import * as React from 'react';
|
||||
|
||||
import { Box, Button, ButtonGroup, CircularProgress, Divider, Grid, IconButton } from '@mui/joy';
|
||||
import { Box, Button, ButtonGroup, CircularProgress, Divider, Grid, IconButton, Input, FormControl, FormLabel } from '@mui/joy';
|
||||
import AccountTreeIcon from '@mui/icons-material/AccountTree';
|
||||
import ExpandLessIcon from '@mui/icons-material/ExpandLess';
|
||||
import ExpandMoreIcon from '@mui/icons-material/ExpandMore';
|
||||
@@ -8,8 +8,9 @@ import ReplayIcon from '@mui/icons-material/Replay';
|
||||
import StopOutlinedIcon from '@mui/icons-material/StopOutlined';
|
||||
import TelegramIcon from '@mui/icons-material/Telegram';
|
||||
|
||||
import { llmStreamingChatGenerate } from '~/modules/llms/llm.client';
|
||||
|
||||
import { ChatMessage } from '../../../apps/chat/components/message/ChatMessage';
|
||||
import { streamChat } from '~/modules/llms/transports/streamChat';
|
||||
|
||||
import { GoodModal } from '~/common/components/GoodModal';
|
||||
import { InlineError } from '~/common/components/InlineError';
|
||||
@@ -48,6 +49,7 @@ export function DiagramsModal(props: { config: DiagramConfig, onClose: () => voi
|
||||
const [message, setMessage] = React.useState<DMessage | null>(null);
|
||||
const [diagramType, diagramComponent] = useFormRadio<DiagramType>('auto', diagramTypes, 'Visualize');
|
||||
const [diagramLanguage, languageComponent] = useFormRadio<DiagramLanguage>('plantuml', diagramLanguages, 'Style');
|
||||
const [customInstruction, setCustomInstruction] = React.useState<string>('');
|
||||
const [errorMessage, setErrorMessage] = React.useState<string | null>(null);
|
||||
const [abortController, setAbortController] = React.useState<AbortController | null>(null);
|
||||
|
||||
@@ -81,10 +83,10 @@ export function DiagramsModal(props: { config: DiagramConfig, onClose: () => voi
const stepAbortController = new AbortController();
setAbortController(stepAbortController);

- const diagramPrompt = bigDiagramPrompt(diagramType, diagramLanguage, systemMessage.text, subject);
+ const diagramPrompt = bigDiagramPrompt(diagramType, diagramLanguage, systemMessage.text, subject, customInstruction);

try {
- await streamChat(diagramLlm.id, diagramPrompt, stepAbortController.signal,
+ await llmStreamingChatGenerate(diagramLlm.id, diagramPrompt, null, null, stepAbortController.signal,
(update: Partial<{ text: string, typing: boolean, originLLM: string }>) => {
assistantMessage = { ...assistantMessage, ...update };
setMessage(assistantMessage);
@@ -103,7 +105,7 @@ export function DiagramsModal(props: { config: DiagramConfig, onClose: () => voi
setAbortController(null);
}

- }, [abortController, conversationId, diagramLanguage, diagramLlm, diagramType, subject]);
+ }, [abortController, conversationId, diagramLanguage, diagramLlm, diagramType, subject, customInstruction]);


// [Effect] Auto-abort on unmount
@@ -149,6 +151,12 @@ export function DiagramsModal(props: { config: DiagramConfig, onClose: () => voi
<Grid xs={12} xl={6}>
{llmComponent}
</Grid>
+ <Grid xs={12} md={6}>
+ <FormControl>
+ <FormLabel>Custom Instruction</FormLabel>
+ <Input title="Custom Instruction" placeholder='e.g. visualize as state' value={customInstruction} onChange={(e) => setCustomInstruction(e.target.value)} />
+ </FormControl>
+ </Grid>
</Grid>
)}


@@ -1,6 +1,5 @@
- import type { VChatMessageIn } from '~/modules/llms/transports/chatGenerate';

import type { FormRadioOption } from '~/common/components/forms/FormRadioControl';
+ import type { VChatMessageIn } from '~/modules/llms/llm.client';


export type DiagramType = 'auto' | 'mind';
@@ -60,12 +59,15 @@ function plantumlDiagramPrompt(diagramType: DiagramType): { sys: string, usr: st
}
}

- export function bigDiagramPrompt(diagramType: DiagramType, diagramLanguage: DiagramLanguage, chatSystemPrompt: string, subject: string): VChatMessageIn[] {
+ export function bigDiagramPrompt(diagramType: DiagramType, diagramLanguage: DiagramLanguage, chatSystemPrompt: string, subject: string, customInstruction: string): VChatMessageIn[] {
const { sys, usr } = diagramLanguage === 'mermaid' ? mermaidDiagramPrompt(diagramType) : plantumlDiagramPrompt(diagramType);
+ if (customInstruction) {
+ customInstruction = 'Also consider the following instructions: ' + customInstruction;
+ }
return [
{ role: 'system', content: sys },
{ role: 'system', content: chatSystemPrompt },
{ role: 'assistant', content: subject },
- { role: 'user', content: usr },
+ { role: 'user', content: `${usr} ${customInstruction}` },
];
}
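The `customInstruction` handling in the hunk above can be exercised in isolation. The sketch below re-implements just the message assembly with simplified, hypothetical types (the real code uses `VChatMessageIn` and the `sys`/`usr` prompt pair from `mermaidDiagramPrompt`/`plantumlDiagramPrompt`):

```typescript
// Simplified stand-in for VChatMessageIn; illustrative only.
interface ChatMessage { role: 'assistant' | 'system' | 'user'; content: string }

// Mirrors bigDiagramPrompt's handling of the new optional customInstruction argument.
function buildDiagramMessages(sys: string, chatSystemPrompt: string, subject: string, usr: string, customInstruction: string): ChatMessage[] {
  if (customInstruction)
    customInstruction = 'Also consider the following instructions: ' + customInstruction;
  return [
    { role: 'system', content: sys },
    { role: 'system', content: chatSystemPrompt },
    { role: 'assistant', content: subject },
    // the instruction rides along on the final user turn
    { role: 'user', content: `${usr} ${customInstruction}` },
  ];
}

const msgs = buildDiagramMessages('diagram rules', 'chat system', 'the subject', 'Draw it.', 'visualize as state');
console.log(msgs[3].content);
```

Note that when `customInstruction` is empty the template still leaves a trailing space on the user turn, matching the diff's behavior.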
@@ -1,4 +1,4 @@
- import { callChatGenerate } from '~/modules/llms/transports/chatGenerate';
+ import { llmChatGenerateOrThrow } from '~/modules/llms/llm.client';
import { useModelsStore } from '~/modules/llms/store-llms';


@@ -14,10 +14,10 @@ export async function imaginePromptFromText(messageText: string): Promise<string
const { fastLLMId } = useModelsStore.getState();
if (!fastLLMId) return null;
try {
- const chatResponse = await callChatGenerate(fastLLMId, [
+ const chatResponse = await llmChatGenerateOrThrow(fastLLMId, [
{ role: 'system', content: simpleImagineSystemPrompt },
{ role: 'user', content: 'Write a prompt, based on the following input.\n\n```\n' + messageText.slice(0, 1000) + '\n```\n' },
- ]);
+ ], null, null);
return chatResponse.content?.trim() ?? null;
} catch (error: any) {
console.error('imaginePromptFromText: fetch request error:', error);
@@ -5,7 +5,7 @@
import { DLLMId } from '~/modules/llms/store-llms';
import { callApiSearchGoogle } from '~/modules/google/search.client';
import { callBrowseFetchPage } from '~/modules/browse/browse.client';
- import { callChatGenerate, VChatMessageIn } from '~/modules/llms/transports/chatGenerate';
+ import { llmChatGenerateOrThrow, VChatMessageIn } from '~/modules/llms/llm.client';


// prompt to implement the ReAct paradigm: https://arxiv.org/abs/2210.03629
@@ -128,7 +128,7 @@ export class Agent {
S.messages.push({ role: 'user', content: prompt });
let content: string;
try {
- content = (await callChatGenerate(llmId, S.messages, 500)).content;
+ content = (await llmChatGenerateOrThrow(llmId, S.messages, null, null, 500)).content;
} catch (error: any) {
content = `Error in callChat: ${error}`;
}
@@ -1,5 +1,5 @@
import { DLLMId, findLLMOrThrow } from '~/modules/llms/store-llms';
- import { callChatGenerate } from '~/modules/llms/transports/chatGenerate';
+ import { llmChatGenerateOrThrow } from '~/modules/llms/llm.client';


// prompt to be tried when doing recursive summarization.
@@ -80,10 +80,10 @@ async function cleanUpContent(chunk: string, llmId: DLLMId, _ignored_was_targetW
const autoResponseTokensSize = Math.floor(contextTokens * outputTokenShare);

try {
- const chatResponse = await callChatGenerate(llmId, [
+ const chatResponse = await llmChatGenerateOrThrow(llmId, [
{ role: 'system', content: cleanupPrompt },
{ role: 'user', content: chunk },
- ], autoResponseTokensSize);
+ ], null, null, autoResponseTokensSize);
return chatResponse?.content ?? '';
} catch (error: any) {
return '';
@@ -1,8 +1,7 @@
import * as React from 'react';

import type { DLLMId } from '~/modules/llms/store-llms';
- import type { VChatMessageIn } from '~/modules/llms/transports/chatGenerate';
- import { streamChat } from '~/modules/llms/transports/streamChat';
+ import { llmStreamingChatGenerate, VChatMessageIn } from '~/modules/llms/llm.client';


export function useStreamChatText() {
@@ -25,7 +24,7 @@ export function useStreamChatText() {

try {
let lastText = '';
- await streamChat(llmId, prompt, abortControllerRef.current.signal, (update) => {
+ await llmStreamingChatGenerate(llmId, prompt, null, null, abortControllerRef.current.signal, (update) => {
if (update.text) {
lastText = update.text;
setPartialText(lastText);
@@ -1,5 +1,10 @@
import { z } from 'zod';

+ import type { BackendCapabilities } from '~/modules/backend/state-backend';

import { createTRPCRouter, publicProcedure } from '~/server/api/trpc.server';
import { env } from '~/server/env.mjs';
+ import { fetchJsonOrTRPCError } from '~/server/api/trpc.serverutils';

import { analyticsListCapabilities } from './backend.analytics';

@@ -23,11 +28,26 @@ export const backendRouter = createTRPCRouter({
hasImagingProdia: !!env.PRODIA_API_KEY,
hasLlmAnthropic: !!env.ANTHROPIC_API_KEY,
hasLlmAzureOpenAI: !!env.AZURE_OPENAI_API_KEY && !!env.AZURE_OPENAI_API_ENDPOINT,
+ hasLlmGemini: !!env.GEMINI_API_KEY,
+ hasLlmMistral: !!env.MISTRAL_API_KEY,
hasLlmOllama: !!env.OLLAMA_API_HOST,
hasLlmOpenAI: !!env.OPENAI_API_KEY || !!env.OPENAI_API_HOST,
hasLlmOpenRouter: !!env.OPENROUTER_API_KEY,
hasVoiceElevenLabs: !!env.ELEVENLABS_API_KEY,
- };
+ } satisfies BackendCapabilities;
}),

+ // The following are used for various OAuth integrations

+ /* Exchange the OpenRouter 'code' (from PKCE) for an OpenRouter API Key */
+ exchangeOpenRouterKey: publicProcedure
+ .input(z.object({ code: z.string() }))
+ .query(async ({ input }) => {
+ // Documented here: https://openrouter.ai/docs#oauth
+ return await fetchJsonOrTRPCError<{ key: string }, { code: string }>('https://openrouter.ai/api/v1/auth/keys', 'POST', {}, {
+ code: input.code,
+ }, 'Backend.exchangeOpenRouterKey');
+ }),

});
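The change from a plain `};` to `} satisfies BackendCapabilities;` in the hunk above uses TypeScript's `satisfies` operator (available since TS 4.9): the object literal is checked against the interface without widening its inferred type, so a missing or misspelled capability flag fails at compile time. A minimal sketch, with a hypothetical two-field interface standing in for the real one:

```typescript
// Hypothetical, trimmed-down capabilities shape; the real interface lives in state-backend.
interface Capabilities {
  hasLlmGemini: boolean;
  hasLlmMistral: boolean;
}

// `satisfies` validates the literal against Capabilities while keeping the narrow
// inferred type; adding an unknown key here would be a compile-time error.
const caps = {
  hasLlmGemini: true,
  hasLlmMistral: false,
} satisfies Capabilities;

console.log(caps.hasLlmGemini, caps.hasLlmMistral);
```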
@@ -9,6 +9,8 @@ export interface BackendCapabilities {
hasImagingProdia: boolean;
hasLlmAnthropic: boolean;
hasLlmAzureOpenAI: boolean;
+ hasLlmGemini: boolean;
+ hasLlmMistral: boolean;
hasLlmOllama: boolean;
hasLlmOpenAI: boolean;
hasLlmOpenRouter: boolean;
@@ -30,6 +32,8 @@ const useBackendStore = create<BackendStore>()(
hasImagingProdia: false,
hasLlmAnthropic: false,
hasLlmAzureOpenAI: false,
+ hasLlmGemini: false,
+ hasLlmMistral: false,
hasLlmOllama: false,
hasLlmOpenAI: false,
hasLlmOpenRouter: false,

@@ -1,4 +1,4 @@
- import create from 'zustand';
+ import { create } from 'zustand';
import { persist } from 'zustand/middleware';

import { CapabilityBrowsing } from '~/common/components/useCapabilities';

@@ -0,0 +1,74 @@
import type { DLLMId } from './store-llms';
import type { OpenAIWire } from './server/openai/openai.wiretypes';
import { findVendorForLlmOrThrow } from './vendors/vendors.registry';


// LLM Client Types
// NOTE: Model List types in '../server/llm.server.types';

export interface VChatMessageIn {
role: 'assistant' | 'system' | 'user'; // | 'function';
content: string;
//name?: string; // when role: 'function'
}

export type VChatFunctionIn = OpenAIWire.ChatCompletion.RequestFunctionDef;

export interface VChatMessageOut {
role: 'assistant' | 'system' | 'user';
content: string;
finish_reason: 'stop' | 'length' | null;
}

export interface VChatMessageOrFunctionCallOut extends VChatMessageOut {
function_name: string;
function_arguments: object | null;
}


// LLM Client Functions

export async function llmChatGenerateOrThrow<TSourceSetup = unknown, TAccess = unknown, TLLMOptions = unknown>(
llmId: DLLMId,
messages: VChatMessageIn[],
functions: VChatFunctionIn[] | null, forceFunctionName: string | null,
maxTokens?: number,
): Promise<VChatMessageOut | VChatMessageOrFunctionCallOut> {

// id to DLLM and vendor
const { llm, vendor } = findVendorForLlmOrThrow<TSourceSetup, TAccess, TLLMOptions>(llmId);

// FIXME: relax the forced cast
const options = llm.options as TLLMOptions;

// get the access
const partialSourceSetup = llm._source.setup;
const access = vendor.getTransportAccess(partialSourceSetup);

// execute via the vendor
return await vendor.rpcChatGenerateOrThrow(access, options, messages, functions, forceFunctionName, maxTokens);
}


export async function llmStreamingChatGenerate<TSourceSetup = unknown, TAccess = unknown, TLLMOptions = unknown>(
llmId: DLLMId,
messages: VChatMessageIn[],
functions: VChatFunctionIn[] | null,
forceFunctionName: string | null,
abortSignal: AbortSignal,
onUpdate: (update: Partial<{ text: string, typing: boolean, originLLM: string }>, done: boolean) => void,
): Promise<void> {

// id to DLLM and vendor
const { llm, vendor } = findVendorForLlmOrThrow<TSourceSetup, TAccess, TLLMOptions>(llmId);

// FIXME: relax the forced cast
const llmOptions = llm.options as TLLMOptions;

// get the access
const partialSourceSetup = llm._source.setup;
const access = vendor.getTransportAccess(partialSourceSetup); // as ChatStreamInputSchema['access'];

// execute via the vendor
return await vendor.streamingChatGenerateOrThrow(access, llmId, llmOptions, messages, functions, forceFunctionName, abortSignal, onUpdate);
}
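Call sites migrating from the old `callChatGenerate(llmId, messages, maxTokens)` to this new client pass `null` for the two inserted arguments when no function-calling is requested, as the other hunks in this diff show. A hypothetical shim illustrating the argument shuffle (the stub below stands in for the real network-backed function):

```typescript
type Msg = { role: 'assistant' | 'system' | 'user'; content: string };
type Out = { role: 'assistant'; content: string; finish_reason: 'stop' | 'length' | null };

// Stand-in for the real llmChatGenerateOrThrow, which dispatches to a vendor transport.
async function llmChatGenerateOrThrow(llmId: string, messages: Msg[], functions: object[] | null, forceFunctionName: string | null, maxTokens?: number): Promise<Out> {
  const last = messages[messages.length - 1];
  return { role: 'assistant', content: `echo: ${last.content}`, finish_reason: 'stop' };
}

// Old-style three-argument call, expressed through the new five-argument API.
async function callChatGenerate(llmId: string, messages: Msg[], maxTokens?: number): Promise<Out> {
  return llmChatGenerateOrThrow(llmId, messages, null, null, maxTokens);
}

callChatGenerate('some-llm', [{ role: 'user', content: 'hi' }], 500).then(r => console.log(r.content));
```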
+4 -4
@@ -7,7 +7,7 @@ import VisibilityIcon from '@mui/icons-material/Visibility';
import VisibilityOffIcon from '@mui/icons-material/VisibilityOff';

import { DLLMId, useModelsStore } from '~/modules/llms/store-llms';
- import { findVendorById } from '~/modules/llms/vendors/vendor.registry';
+ import { findVendorById } from '~/modules/llms/vendors/vendors.registry';

import { FormLabelStart } from '~/common/components/forms/FormLabelStart';
import { GoodModal } from '~/common/components/GoodModal';
@@ -117,9 +117,9 @@ export function LLMOptionsModal(props: { id: DLLMId }) {
<FormLabelStart title='Details' sx={{ minWidth: 80 }} onClick={() => setShowDetails(!showDetails)} />
{showDetails && <Typography level='body-sm' sx={{ display: 'block' }}>
[{llm.id}]: {llm.options.llmRef && `${llm.options.llmRef} · `}
- {llm.contextTokens && `context tokens: ${llm.contextTokens.toLocaleString()} · `}
- {llm.maxOutputTokens && `max output tokens: ${llm.maxOutputTokens.toLocaleString()} · `}
- {llm.created && `created: ${(new Date(llm.created * 1000)).toLocaleString()} · `}
+ {!!llm.contextTokens && `context tokens: ${llm.contextTokens.toLocaleString()} · `}
+ {!!llm.maxOutputTokens && `max output tokens: ${llm.maxOutputTokens.toLocaleString()} · `}
+ {!!llm.created && `created: ${(new Date(llm.created * 1000)).toLocaleString()} · `}
description: {llm.description}
{/*· tags: {llm.tags.join(', ')}*/}
</Typography>}
@@ -7,7 +7,7 @@ import VisibilityOffOutlinedIcon from '@mui/icons-material/VisibilityOffOutlined

import { DLLM, DModelSourceId, useModelsStore } from '~/modules/llms/store-llms';
import { IModelVendor } from '~/modules/llms/vendors/IModelVendor';
- import { findVendorById } from '~/modules/llms/vendors/vendor.registry';
+ import { findVendorById } from '~/modules/llms/vendors/vendors.registry';

import { GoodTooltip } from '~/common/components/GoodTooltip';
import { openLayoutLLMOptions } from '~/common/layout/store-applayout';
@@ -109,8 +109,15 @@ export function ModelsList(props: {
<List variant='soft' size='sm' sx={{
borderRadius: 'sm',
pl: { xs: 0, md: 1 },
overflowY: 'auto',
}}>
- {items}
+ {items.length > 0 ? items : (
+ <ListItem>
+ <Typography level='body-sm'>
+ Please configure the service and update the list of models.
+ </Typography>
+ </ListItem>
+ )}
</List>
);
}
+2 -2
@@ -4,7 +4,7 @@ import { shallow } from 'zustand/shallow';
import { Box, Checkbox, Divider } from '@mui/joy';

import { DModelSource, DModelSourceId, useModelsStore } from '~/modules/llms/store-llms';
- import { createModelSourceForDefaultVendor, findVendorById } from '~/modules/llms/vendors/vendor.registry';
+ import { createModelSourceForDefaultVendor, findVendorById } from '~/modules/llms/vendors/vendors.registry';

import { GoodModal } from '~/common/components/GoodModal';
import { closeLayoutModelsSetup, openLayoutModelsSetup, useLayoutModelsSetup } from '~/common/layout/store-applayout';
@@ -65,7 +65,7 @@ export function ModelsModal(props: { suspendAutoModelsSetup?: boolean }) {
title={<>Configure <b>AI Models</b></>}
startButton={
multiSource ? <Checkbox
- label='all vendors' sx={{ my: 'auto' }}
+ label='All Services' sx={{ my: 'auto' }}
checked={showAllSources} onChange={() => setShowAllSources(all => !all)}
/> : undefined
}
+9 -5
@@ -5,9 +5,9 @@ import { Avatar, Badge, Box, Button, IconButton, ListItemDecorator, MenuItem, Op
import AddIcon from '@mui/icons-material/Add';
import DeleteOutlineIcon from '@mui/icons-material/DeleteOutline';

- import { type DModelSourceId, useModelsStore } from '~/modules/llms/store-llms';
- import { type IModelVendor, type ModelVendorId } from '~/modules/llms/vendors/IModelVendor';
- import { createModelSourceForVendor, findAllVendors, findVendorById } from '~/modules/llms/vendors/vendor.registry';
+ import type { IModelVendor } from '~/modules/llms/vendors/IModelVendor';
+ import { DModelSourceId, useModelsStore } from '~/modules/llms/store-llms';
+ import { createModelSourceForVendor, findAllVendors, findVendorById, ModelVendorId } from '~/modules/llms/vendors/vendors.registry';

import { CloseableMenu } from '~/common/components/CloseableMenu';
import { ConfirmationModal } from '~/common/components/ConfirmationModal';
@@ -29,7 +29,7 @@ function vendorIcon(vendor: IModelVendor | null, greenMark: boolean) {
icon = <vendor.Icon />;
}
return (greenMark && icon)
- ? <Badge color='primary' size='sm' badgeContent=''>{icon}</Badge>
+ ? <Badge color='success' size='sm' badgeContent=''>{icon}</Badge>
: icon;
}

@@ -92,7 +92,11 @@ export function ModelsSourceSelector(props: {
<ListItemDecorator>
{vendorIcon(vendor, !!vendor.hasBackendCap && vendor.hasBackendCap())}
</ListItemDecorator>
- {vendor.name}{/*{sourceCount > 0 && ` (added)`}*/}
+ {vendor.name}
+ {/*{sourceCount > 0 && ` (added)`}*/}
{!!vendor.hasFreeModels && ` 🎁`}
{/*{!!vendor.instanceLimit && ` (${sourceCount}/${vendor.instanceLimit})`}*/}
{vendor.location === 'local' && <span style={{ opacity: 0.5 }}>local</span>}
</MenuItem>
),
};
+2 -2
@@ -1,6 +1,6 @@
- import type { ModelDescriptionSchema } from '../server.schemas';
+ import type { ModelDescriptionSchema } from '../llm.server.types';

- import { LLM_IF_OAI_Chat } from '../../../store-llms';
+ import { LLM_IF_OAI_Chat } from '../../store-llms';

const roundTime = (date: string) => Math.round(new Date(date).getTime() / 1000);

+1 -1
@@ -6,7 +6,7 @@ import { env } from '~/server/env.mjs';
import { fetchJsonOrTRPCError } from '~/server/api/trpc.serverutils';

import { fixupHost, openAIChatGenerateOutputSchema, OpenAIHistorySchema, openAIHistorySchema, OpenAIModelSchema, openAIModelSchema } from '../openai/openai.router';
- import { listModelsOutputSchema } from '../server.schemas';
+ import { listModelsOutputSchema } from '../llm.server.types';

import { AnthropicWire } from './anthropic.wiretypes';
import { hardcodedAnthropicModels } from './anthropic.models';
@@ -0,0 +1,216 @@
import { z } from 'zod';
import { TRPCError } from '@trpc/server';
import { env } from '~/server/env.mjs';

import packageJson from '../../../../../package.json';

import { createTRPCRouter, publicProcedure } from '~/server/api/trpc.server';
import { fetchJsonOrTRPCError } from '~/server/api/trpc.serverutils';

import { LLM_IF_OAI_Chat, LLM_IF_OAI_Vision } from '../../store-llms';
import { listModelsOutputSchema, ModelDescriptionSchema } from '../llm.server.types';

import { fixupHost, openAIChatGenerateOutputSchema, OpenAIHistorySchema, openAIHistorySchema, OpenAIModelSchema, openAIModelSchema } from '../openai/openai.router';

import { GeminiBlockSafetyLevel, geminiBlockSafetyLevelSchema, GeminiContentSchema, GeminiGenerateContentRequest, geminiGeneratedContentResponseSchema, geminiModelsGenerateContentPath, geminiModelsListOutputSchema, geminiModelsListPath } from './gemini.wiretypes';


// Default hosts
const DEFAULT_GEMINI_HOST = 'https://generativelanguage.googleapis.com';


// Mappers

export function geminiAccess(access: GeminiAccessSchema, modelRefId: string | null, apiPath: string): { headers: HeadersInit, url: string } {

const geminiKey = access.geminiKey || env.GEMINI_API_KEY || '';
const geminiHost = fixupHost(DEFAULT_GEMINI_HOST, apiPath);

// update model-dependent paths
if (apiPath.includes('{model=models/*}')) {
if (!modelRefId)
throw new Error(`geminiAccess: modelRefId is required for ${apiPath}`);
apiPath = apiPath.replace('{model=models/*}', modelRefId);
}

return {
headers: {
'Content-Type': 'application/json',
'x-goog-api-client': `big-agi/${packageJson['version'] || '1.0.0'}`,
'x-goog-api-key': geminiKey,
},
url: geminiHost + apiPath,
};
}
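The `{model=models/*}` substitution performed by `geminiAccess` can be checked standalone. This sketch duplicates just the path logic (same default host and template token as the function above, without the key/header handling):

```typescript
const DEFAULT_GEMINI_HOST = 'https://generativelanguage.googleapis.com';

// Same template substitution as geminiAccess: the model ref is spliced into the path.
function resolveGeminiUrl(apiPath: string, modelRefId: string | null): string {
  if (apiPath.includes('{model=models/*}')) {
    if (!modelRefId)
      throw new Error(`modelRefId is required for ${apiPath}`);
    apiPath = apiPath.replace('{model=models/*}', modelRefId);
  }
  return DEFAULT_GEMINI_HOST + apiPath;
}

console.log(resolveGeminiUrl('/v1beta/{model=models/*}:generateContent', 'models/gemini-pro'));
```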

/**
* We specially encode the history to match the Gemini API requirements.
* Gemini does not want 2 consecutive messages from the same role, so we alternate.
* - System messages = [User, Model 'Ok']
* - User and Assistant messages are coalesced into a single message (e.g. [User, User, Assistant, Assistant, User] -> [User[2], Assistant[2], User[1]])
*/
export const geminiGenerateContentTextPayload = (model: OpenAIModelSchema, history: OpenAIHistorySchema, safety: GeminiBlockSafetyLevel, n: number): GeminiGenerateContentRequest => {

// convert the history to a Gemini format
const contents: GeminiContentSchema[] = [];
for (const _historyElement of history) {

const { role: msgRole, content: msgContent } = _historyElement;

// System message - we treat it as per the example in https://ai.google.dev/tutorials/ai-studio_quickstart#chat_example
if (msgRole === 'system') {
contents.push({ role: 'user', parts: [{ text: msgContent }] });
contents.push({ role: 'model', parts: [{ text: 'Ok' }] });
continue;
}

// User or Assistant message
const nextRole: GeminiContentSchema['role'] = msgRole === 'assistant' ? 'model' : 'user';
if (contents.length && contents[contents.length - 1].role === nextRole) {
// coalesce with the previous message
contents[contents.length - 1].parts.push({ text: msgContent });
} else {
// create a new message
contents.push({ role: nextRole, parts: [{ text: msgContent }] });
}
}

return {
contents,
generationConfig: {
...(n >= 2 && { candidateCount: n }),
...(model.maxTokens && { maxOutputTokens: model.maxTokens }),
temperature: model.temperature,
},
safetySettings: safety !== 'HARM_BLOCK_THRESHOLD_UNSPECIFIED' ? [
{ category: 'HARM_CATEGORY_SEXUALLY_EXPLICIT', threshold: safety },
{ category: 'HARM_CATEGORY_HATE_SPEECH', threshold: safety },
{ category: 'HARM_CATEGORY_HARASSMENT', threshold: safety },
{ category: 'HARM_CATEGORY_DANGEROUS_CONTENT', threshold: safety },
] : undefined,
};
};
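The alternation rule documented above (system turns expanded to a [user, model 'Ok'] pair; consecutive same-role turns merged) can be demonstrated with a trimmed re-implementation of just the loop, using simplified message types:

```typescript
// Simplified shapes; the real code uses OpenAIHistorySchema / GeminiContentSchema.
type OaiMsg = { role: 'system' | 'user' | 'assistant'; content: string };
type GeminiContent = { role: 'user' | 'model'; parts: { text: string }[] };

function toGeminiContents(history: OaiMsg[]): GeminiContent[] {
  const contents: GeminiContent[] = [];
  for (const { role, content } of history) {
    // system → [user, model 'Ok'] pair, per the Gemini chat example
    if (role === 'system') {
      contents.push({ role: 'user', parts: [{ text: content }] });
      contents.push({ role: 'model', parts: [{ text: 'Ok' }] });
      continue;
    }
    const nextRole = role === 'assistant' ? 'model' : 'user';
    if (contents.length && contents[contents.length - 1].role === nextRole)
      contents[contents.length - 1].parts.push({ text: content });  // coalesce same-role turns
    else
      contents.push({ role: nextRole, parts: [{ text: content }] });
  }
  return contents;
}

const out = toGeminiContents([
  { role: 'system', content: 'be terse' },
  { role: 'user', content: 'a' },
  { role: 'user', content: 'b' },
  { role: 'assistant', content: 'c' },
]);
console.log(out.map(c => `${c.role}[${c.parts.length}]`).join(' '));
// → user[1] model[1] user[2] model[1]
```

The two consecutive user turns become one `user` content with two parts, so roles always alternate in the output.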


async function geminiGET<TOut extends object>(access: GeminiAccessSchema, modelRefId: string | null, apiPath: string /*, signal?: AbortSignal*/): Promise<TOut> {
const { headers, url } = geminiAccess(access, modelRefId, apiPath);
return await fetchJsonOrTRPCError<TOut>(url, 'GET', headers, undefined, 'Gemini');
}

async function geminiPOST<TOut extends object, TPostBody extends object>(access: GeminiAccessSchema, modelRefId: string | null, body: TPostBody, apiPath: string /*, signal?: AbortSignal*/): Promise<TOut> {
const { headers, url } = geminiAccess(access, modelRefId, apiPath);
return await fetchJsonOrTRPCError<TOut, TPostBody>(url, 'POST', headers, body, 'Gemini');
}


// Input/Output Schemas

export const geminiAccessSchema = z.object({
dialect: z.enum(['gemini']),
geminiKey: z.string(),
minSafetyLevel: geminiBlockSafetyLevelSchema,
});
export type GeminiAccessSchema = z.infer<typeof geminiAccessSchema>;


const accessOnlySchema = z.object({
access: geminiAccessSchema,
});

const chatGenerateInputSchema = z.object({
access: geminiAccessSchema,
model: openAIModelSchema, history: openAIHistorySchema,
// functions: openAIFunctionsSchema.optional(), forceFunctionName: z.string().optional(),
});


/**
* See https://github.com/google/generative-ai-js/tree/main/packages/main/src for
* the official Google implementation.
*/
export const llmGeminiRouter = createTRPCRouter({

/* [Gemini] models.list = /v1beta/models */
listModels: publicProcedure
.input(accessOnlySchema)
.output(listModelsOutputSchema)
.query(async ({ input }) => {

// get the models
const wireModels = await geminiGET(input.access, null, geminiModelsListPath);
const detailedModels = geminiModelsListOutputSchema.parse(wireModels).models;

// NOTE: no need to retrieve info for each of the models (e.g. /v1beta/model/gemini-pro),
// as the List API already returns all the info on all the models

// map to our output schema
return {
models: detailedModels.map((geminiModel) => {
const { description, displayName, inputTokenLimit, name, outputTokenLimit, supportedGenerationMethods } = geminiModel;

const contextWindow = inputTokenLimit + outputTokenLimit;
const hidden = !supportedGenerationMethods.includes('generateContent');

const { version, topK, topP, temperature } = geminiModel;
const descriptionLong = description + ` (Version: ${version}, Defaults: temperature=${temperature}, topP=${topP}, topK=${topK}, interfaces=[${supportedGenerationMethods.join(',')}])`;

// const isGeminiPro = name.includes('gemini-pro');
const isGeminiProVision = name.includes('gemini-pro-vision');

const interfaces: ModelDescriptionSchema['interfaces'] = [];
if (supportedGenerationMethods.includes('generateContent')) {
interfaces.push(LLM_IF_OAI_Chat);
if (isGeminiProVision)
interfaces.push(LLM_IF_OAI_Vision);
}

return {
id: name,
label: displayName,
// created: ...
// updated: ...
description: descriptionLong,
contextWindow: contextWindow,
maxCompletionTokens: outputTokenLimit,
// pricing: isGeminiPro ? { needs per-character and per-image pricing } : undefined,
// rateLimits: isGeminiPro ? { reqPerMinute: 60 } : undefined,
interfaces: supportedGenerationMethods.includes('generateContent') ? [LLM_IF_OAI_Chat] : [],
hidden,
} satisfies ModelDescriptionSchema;
}),
};
}),


/* [Gemini] models.generateContent = /v1/{model=models/*}:generateContent */
chatGenerate: publicProcedure
.input(chatGenerateInputSchema)
.output(openAIChatGenerateOutputSchema)
.mutation(async ({ input: { access, history, model } }) => {

// generate the content
const wireGeneration = await geminiPOST(access, model.id, geminiGenerateContentTextPayload(model, history, access.minSafetyLevel, 1), geminiModelsGenerateContentPath);
const generation = geminiGeneratedContentResponseSchema.parse(wireGeneration);

// only use the first result (and there should be only one)
const singleCandidate = generation.candidates?.[0] ?? null;
if (!singleCandidate || !singleCandidate.content?.parts.length)
throw new TRPCError({
code: 'INTERNAL_SERVER_ERROR',
message: `Gemini chat-generation API issue: ${JSON.stringify(wireGeneration)}`,
});

if (!('text' in singleCandidate.content.parts[0]))
throw new TRPCError({
code: 'INTERNAL_SERVER_ERROR',
message: `Gemini non-text chat-generation API issue: ${JSON.stringify(wireGeneration)}`,
});

return {
role: 'assistant',
content: singleCandidate.content.parts[0].text || '',
finish_reason: singleCandidate.finishReason === 'STOP' ? 'stop' : null,
};
}),

});
@@ -0,0 +1,188 @@
|
||||
import { z } from 'zod';
|
||||
|
||||
// PATHS
|
||||
|
||||
export const geminiModelsListPath = '/v1beta/models?pageSize=1000';
|
||||
export const geminiModelsGenerateContentPath = '/v1beta/{model=models/*}:generateContent';
|
||||
// see alt=sse on https://cloud.google.com/apis/docs/system-parameters#definitions
|
||||
export const geminiModelsStreamGenerateContentPath = '/v1beta/{model=models/*}:streamGenerateContent?alt=sse';
|
||||
|
||||
|
||||
// models.list = /v1beta/models
|
||||
|
||||
export const geminiModelsListOutputSchema = z.object({
|
||||
models: z.array(z.object({
|
||||
name: z.string(),
|
||||
version: z.string(),
|
||||
displayName: z.string(),
|
||||
description: z.string(),
|
||||
inputTokenLimit: z.number().int().min(1),
|
||||
outputTokenLimit: z.number().int().min(1),
|
||||
supportedGenerationMethods: z.array(z.enum([
|
||||
'countMessageTokens',
|
||||
'countTextTokens',
|
||||
'countTokens',
|
||||
'createTunedTextModel',
|
||||
'embedContent',
|
||||
'embedText',
|
||||
'generateAnswer',
|
||||
'generateContent',
|
||||
'generateMessage',
|
||||
'generateText',
|
||||
])),
|
||||
temperature: z.number().optional(),
|
||||
topP: z.number().optional(),
|
||||
topK: z.number().optional(),
|
||||
})),
|
||||
});
|
||||
|
||||
|
||||
// /v1/{model=models/*}:generateContent, /v1beta/{model=models/*}:streamGenerateContent
|
||||
|
||||
// Request
|
||||
|
||||
const geminiContentPartSchema = z.union([
|
||||
|
||||
// TextPart
|
||||
z.object({
|
||||
text: z.string().optional(),
|
||||
}),
|
||||
|
||||
// InlineDataPart
|
||||
z.object({
|
||||
inlineData: z.object({
|
||||
mimeType: z.string(),
|
||||
data: z.string(), // base64-encoded string
|
||||
}),
|
||||
}),
|
||||
|
||||
// A predicted FunctionCall returned from the model
|
||||
z.object({
|
||||
functionCall: z.object({
|
||||
name: z.string(),
|
||||
args: z.record(z.any()), // JSON object format
|
||||
}),
|
||||
}),
|
||||
|
||||
// The result output of a FunctionCall
|
||||
z.object({
|
||||
functionResponse: z.object({
|
||||
name: z.string(),
|
||||
response: z.record(z.any()), // JSON object format
|
||||
}),
|
||||
}),
|
||||
]);
|
||||
|
||||
const geminiToolSchema = z.object({
|
||||
functionDeclarations: z.array(z.object({
|
||||
name: z.string(),
|
||||
description: z.string(),
|
||||
parameters: z.record(z.any()).optional(), // Schema object format
|
||||
})).optional(),
|
||||
});
|
||||
|
||||
const geminiHarmCategorySchema = z.enum([
|
||||
'HARM_CATEGORY_UNSPECIFIED',
|
||||
'HARM_CATEGORY_DEROGATORY',
|
||||
'HARM_CATEGORY_TOXICITY',
|
||||
'HARM_CATEGORY_VIOLENCE',
|
||||
'HARM_CATEGORY_SEXUAL',
|
||||
'HARM_CATEGORY_MEDICAL',
|
||||
'HARM_CATEGORY_DANGEROUS',
|
||||
'HARM_CATEGORY_HARASSMENT',
|
||||
'HARM_CATEGORY_HATE_SPEECH',
|
||||
'HARM_CATEGORY_SEXUALLY_EXPLICIT',
|
||||
'HARM_CATEGORY_DANGEROUS_CONTENT',
|
||||
]);
|
||||
|
||||
export const geminiBlockSafetyLevelSchema = z.enum([
|
||||
'HARM_BLOCK_THRESHOLD_UNSPECIFIED',
|
||||
'BLOCK_LOW_AND_ABOVE',
|
||||
'BLOCK_MEDIUM_AND_ABOVE',
|
||||
'BLOCK_ONLY_HIGH',
|
||||
'BLOCK_NONE',
|
||||
]);
|
||||
|
||||
export type GeminiBlockSafetyLevel = z.infer<typeof geminiBlockSafetyLevelSchema>;
|
||||
|
||||
const geminiSafetySettingSchema = z.object({
|
||||
category: geminiHarmCategorySchema,
|
||||
threshold: geminiBlockSafetyLevelSchema,
|
||||
});
|
||||
|
||||
const geminiGenerationConfigSchema = z.object({
  stopSequences: z.array(z.string()).optional(),
  candidateCount: z.number().int().optional(),
  maxOutputTokens: z.number().int().optional(),
  temperature: z.number().optional(),
  topP: z.number().optional(),
  topK: z.number().int().optional(),
});

const geminiContentSchema = z.object({
  // Must be either 'user' or 'model'. Optional but must be set if there are multiple "Content" objects in the parent array.
  role: z.enum(['user', 'model']).optional(),
  // Ordered Parts that constitute a single message. Parts may have different MIME types.
  parts: z.array(geminiContentPartSchema),
});

export type GeminiContentSchema = z.infer<typeof geminiContentSchema>;

export const geminiGenerateContentRequest = z.object({
  contents: z.array(geminiContentSchema),
  tools: z.array(geminiToolSchema).optional(),
  safetySettings: z.array(geminiSafetySettingSchema).optional(),
  generationConfig: geminiGenerationConfigSchema.optional(),
});

export type GeminiGenerateContentRequest = z.infer<typeof geminiGenerateContentRequest>;


// Response

const geminiHarmProbabilitySchema = z.enum([
  'HARM_PROBABILITY_UNSPECIFIED',
  'NEGLIGIBLE',
  'LOW',
  'MEDIUM',
  'HIGH',
]);

const geminiSafetyRatingSchema = z.object({
  'category': geminiHarmCategorySchema,
  'probability': geminiHarmProbabilitySchema,
  'blocked': z.boolean().optional(),
});

const geminiFinishReasonSchema = z.enum([
  'FINISH_REASON_UNSPECIFIED',
  'STOP',
  'MAX_TOKENS',
  'SAFETY',
  'RECITATION',
  'OTHER',
]);

export const geminiGeneratedContentResponseSchema = z.object({
  // either all requested candidates are returned, or no candidates at all
  // no candidates are returned only if there was something wrong with the prompt (see promptFeedback)
  candidates: z.array(z.object({
    index: z.number(),
    content: geminiContentSchema,
    finishReason: geminiFinishReasonSchema.optional(),
    safetyRatings: z.array(geminiSafetyRatingSchema),
    citationMetadata: z.object({
      startIndex: z.number().optional(),
      endIndex: z.number().optional(),
      uri: z.string().optional(),
      license: z.string().optional(),
    }).optional(),
    tokenCount: z.number().optional(),
    // groundingAttributions: z.array(GroundingAttribution).optional(), // This field is populated for GenerateAnswer calls.
  })).optional(),
  // NOTE: promptFeedback is only sent in the first chunk of a streaming response
  promptFeedback: z.object({
    blockReason: z.enum(['BLOCK_REASON_UNSPECIFIED', 'SAFETY', 'OTHER']).optional(),
    safetyRatings: z.array(geminiSafetyRatingSchema).optional(),
  }).optional(),
});
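As a sketch of the wire shape these schemas validate, here is a minimal hypothetical streaming chunk and the text extraction a stream parser performs on it; the literal values are illustrative, only the field names come from the schema above:

```typescript
// Hypothetical chunk, shaped after geminiGeneratedContentResponseSchema above
const chunk = {
  candidates: [{
    index: 0,
    content: { role: 'model', parts: [{ text: 'Hello' }] },
    safetyRatings: [],
  }],
};

// minimal extraction of the single text part, mirroring what the stream parser does with it
const text = chunk.candidates?.[0]?.content.parts[0]?.text ?? '';
```

A real chunk would also carry `finishReason` and per-category `safetyRatings`; both are optional per the schema.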
+239 -158
@@ -4,12 +4,30 @@ import { createParser as createEventsourceParser, EventSourceParseCallback, Even

import { createEmptyReadableStream, debugGenerateCurlCommand, safeErrorString, SERVER_DEBUG_WIRE, serverFetchOrThrow } from '~/server/wire';

import type { AnthropicWire } from '../anthropic/anthropic.wiretypes';
import type { OpenAIWire } from './openai.wiretypes';
import { anthropicAccess, anthropicAccessSchema, anthropicChatCompletionPayload } from '../anthropic/anthropic.router';
import { ollamaAccess, ollamaAccessSchema, ollamaChatCompletionPayload } from '../ollama/ollama.router';
import { openAIAccess, openAIAccessSchema, openAIChatCompletionPayload, openAIHistorySchema, openAIModelSchema } from './openai.router';
import { wireOllamaGenerationSchema } from '../ollama/ollama.wiretypes';

// Anthropic server imports
import type { AnthropicWire } from './anthropic/anthropic.wiretypes';
import { anthropicAccess, anthropicAccessSchema, anthropicChatCompletionPayload } from './anthropic/anthropic.router';

// Gemini server imports
import { geminiAccess, geminiAccessSchema, geminiGenerateContentTextPayload } from './gemini/gemini.router';
import { geminiGeneratedContentResponseSchema, geminiModelsStreamGenerateContentPath } from './gemini/gemini.wiretypes';

// Ollama server imports
import { wireOllamaChunkedOutputSchema } from './ollama/ollama.wiretypes';
import { OLLAMA_PATH_CHAT, ollamaAccess, ollamaAccessSchema, ollamaChatCompletionPayload } from './ollama/ollama.router';

// OpenAI server imports
import type { OpenAIWire } from './openai/openai.wiretypes';
import { openAIAccess, openAIAccessSchema, openAIChatCompletionPayload, openAIHistorySchema, openAIModelSchema } from './openai/openai.router';


/**
 * Event stream formats
 *  - 'sse' is the default format, and is used by all vendors except Ollama
 *  - 'json-nl' is used by Ollama
 */
type MuxingFormat = 'sse' | 'json-nl';


/**
@@ -20,77 +38,87 @@ import { wireOllamaGenerationSchema } from '../ollama/ollama.wiretypes';
 * The peculiarity of our parser is the injection of a JSON structure at the beginning of the stream, to
 * communicate parameters before the text starts flowing to the client.
 */
export type AIStreamParser = (data: string) => { text: string, close: boolean };

type EventStreamFormat = 'sse' | 'json-nl';
type AIStreamParser = (data: string) => { text: string, close: boolean };


const chatStreamInputSchema = z.object({
  access: z.union([anthropicAccessSchema, ollamaAccessSchema, openAIAccessSchema]),
  model: openAIModelSchema, history: openAIHistorySchema,
const chatStreamingInputSchema = z.object({
  access: z.union([anthropicAccessSchema, geminiAccessSchema, ollamaAccessSchema, openAIAccessSchema]),
  model: openAIModelSchema,
  history: openAIHistorySchema,
});
export type ChatStreamInputSchema = z.infer<typeof chatStreamInputSchema>;
export type ChatStreamingInputSchema = z.infer<typeof chatStreamingInputSchema>;

const chatStreamFirstPacketSchema = z.object({
const chatStreamingFirstOutputPacketSchema = z.object({
  model: z.string(),
});
export type ChatStreamFirstPacketSchema = z.infer<typeof chatStreamFirstPacketSchema>;
export type ChatStreamingFirstOutputPacketSchema = z.infer<typeof chatStreamingFirstOutputPacketSchema>;


export async function openaiStreamingRelayHandler(req: NextRequest): Promise<Response> {
export async function llmStreamingRelayHandler(req: NextRequest): Promise<Response> {

  // inputs - reuse the tRPC schema
  const { access, model, history } = chatStreamInputSchema.parse(await req.json());
  const body = await req.json();
  const { access, model, history } = chatStreamingInputSchema.parse(body);

  // begin event streaming from the OpenAI API
  let headersUrl: { headers: HeadersInit, url: string } = { headers: {}, url: '' };
  // access/dialect dependent setup:
  //  - requestAccess: the headers and URL to use for the upstream API call
  //  - muxingFormat: the format of the event stream (sse or json-nl)
  //  - vendorStreamParser: the parser to use for the event stream
  let upstreamResponse: Response;
  let requestAccess: { headers: HeadersInit, url: string } = { headers: {}, url: '' };
  let muxingFormat: MuxingFormat = 'sse';
  let vendorStreamParser: AIStreamParser;
  let eventStreamFormat: EventStreamFormat = 'sse';
  try {

    // prepare the API request data
    let body: object;
    switch (access.dialect) {
      case 'anthropic':
        headersUrl = anthropicAccess(access, '/v1/complete');
        requestAccess = anthropicAccess(access, '/v1/complete');
        body = anthropicChatCompletionPayload(model, history, true);
        vendorStreamParser = createAnthropicStreamParser();
        vendorStreamParser = createStreamParserAnthropic();
        break;

      case 'gemini':
        requestAccess = geminiAccess(access, model.id, geminiModelsStreamGenerateContentPath);
        body = geminiGenerateContentTextPayload(model, history, access.minSafetyLevel, 1);
        vendorStreamParser = createStreamParserGemini(model.id.replace('models/', ''));
        break;

      case 'ollama':
        headersUrl = ollamaAccess(access, '/api/generate');
        requestAccess = ollamaAccess(access, OLLAMA_PATH_CHAT);
        body = ollamaChatCompletionPayload(model, history, true);
        eventStreamFormat = 'json-nl';
        vendorStreamParser = createOllamaStreamParser();
        muxingFormat = 'json-nl';
        vendorStreamParser = createStreamParserOllama();
        break;

      case 'azure':
      case 'localai':
      case 'mistral':
      case 'oobabooga':
      case 'openai':
      case 'openrouter':
        headersUrl = openAIAccess(access, model.id, '/v1/chat/completions');
        requestAccess = openAIAccess(access, model.id, '/v1/chat/completions');
        body = openAIChatCompletionPayload(model, history, null, null, 1, true);
        vendorStreamParser = createOpenAIStreamParser();
        vendorStreamParser = createStreamParserOpenAI();
        break;
    }

    if (SERVER_DEBUG_WIRE)
      console.log('-> streaming:', debugGenerateCurlCommand('POST', headersUrl.url, headersUrl.headers, body));
      console.log('-> streaming:', debugGenerateCurlCommand('POST', requestAccess.url, requestAccess.headers, body));

    // POST to our API route
    upstreamResponse = await serverFetchOrThrow(headersUrl.url, 'POST', headersUrl.headers, body);
    upstreamResponse = await serverFetchOrThrow(requestAccess.url, 'POST', requestAccess.headers, body);

  } catch (error: any) {
    const fetchOrVendorError = safeErrorString(error) + (error?.cause ? ' · ' + error.cause : '');

    // server-side admins message
    console.error(`/api/llms/stream: fetch issue:`, access.dialect, fetchOrVendorError, headersUrl?.url);
    console.error(`/api/llms/stream: fetch issue:`, access.dialect, fetchOrVendorError, requestAccess?.url);

    // client-side users visible message
    return new NextResponse(`[Issue] ${access.dialect}: ${fetchOrVendorError}`
      + (process.env.NODE_ENV === 'development' ? ` · [URL: ${headersUrl?.url}]` : ''), { status: 500 });
      + (process.env.NODE_ENV === 'development' ? ` · [URL: ${requestAccess?.url}]` : ''), { status: 500 });
  }

  /* The following code is heavily inspired by the Vercel AI SDK, but simplified to our needs and in full control.
@@ -102,8 +130,12 @@ export async function openaiStreamingRelayHandler(req: NextRequest): Promise<Res
   * NOTE: we have not benchmarked to see if there is performance impact by using this approach - we do want to have
   * a 'healthy' level of inventory (i.e., pre-buffering) on the pipe to the client.
   */
  const chatResponseStream = (upstreamResponse.body || createEmptyReadableStream())
    .pipeThrough(createEventStreamTransformer(vendorStreamParser, eventStreamFormat, access.dialect));
  const transformUpstreamToBigAgiClient = createEventStreamTransformer(
    muxingFormat, vendorStreamParser, access.dialect,
  );
  const chatResponseStream =
    (upstreamResponse.body || createEmptyReadableStream())
      .pipeThrough(transformUpstreamToBigAgiClient);

  return new NextResponse(chatResponseStream, {
    status: 200,
@@ -114,105 +146,44 @@ export async function openaiStreamingRelayHandler(req: NextRequest): Promise<Res
}


/// Event Parsers

function createAnthropicStreamParser(): AIStreamParser {
  let hasBegun = false;

  return (data: string) => {

    const json: AnthropicWire.Complete.Response = JSON.parse(data);
    let text = json.completion;

    // hack: prepend the model name to the first packet
    if (!hasBegun) {
      hasBegun = true;
      const firstPacket: ChatStreamFirstPacketSchema = { model: json.model };
      text = JSON.stringify(firstPacket) + text;
    }

    return { text, close: false };
  };
}

function createOllamaStreamParser(): AIStreamParser {
  let hasBegun = false;

  return (data: string) => {

    let wireGeneration: any;
    try {
      wireGeneration = JSON.parse(data);
    } catch (error: any) {
      // log the malformed data to the console, and rethrow to transmit as 'error'
      console.log(`/api/llms/stream: Ollama parsing issue: ${error?.message || error}`, data);
      throw error;
    }
    const generation = wireOllamaGenerationSchema.parse(wireGeneration);
    let text = generation.response;

    // hack: prepend the model name to the first packet
    if (!hasBegun) {
      hasBegun = true;
      const firstPacket: ChatStreamFirstPacketSchema = { model: generation.model };
      text = JSON.stringify(firstPacket) + text;
    }

    return { text, close: generation.done };
  };
}

function createOpenAIStreamParser(): AIStreamParser {
  let hasBegun = false;
  let hasWarned = false;

  return (data: string) => {

    const json: OpenAIWire.ChatCompletion.ResponseStreamingChunk = JSON.parse(data);

    // [OpenAI] an upstream error will be handled gracefully and transmitted as text (throw to transmit as 'error')
    if (json.error)
      return { text: `[OpenAI Issue] ${safeErrorString(json.error)}`, close: true };

    // [OpenAI] if there's a warning, log it once
    if (json.warning && !hasWarned) {
      hasWarned = true;
      console.log('/api/llms/stream: OpenAI upstream warning:', json.warning);
    }

    if (json.choices.length !== 1) {
      // [Azure] we seem to receive 'prompt_annotations' or 'prompt_filter_results' objects - which we will ignore to suppress the error
      if (json.id === '' && json.object === '' && json.model === '')
        return { text: '', close: false };
      throw new Error(`Expected 1 completion, got ${json.choices.length}`);
    }

    const index = json.choices[0].index;
    if (index !== 0 && index !== undefined /* LocalAI hack/workaround until https://github.com/go-skynet/LocalAI/issues/788 */)
      throw new Error(`Expected completion index 0, got ${index}`);
    let text = json.choices[0].delta?.content /*|| json.choices[0]?.text*/ || '';

    // hack: prepend the model name to the first packet
    if (!hasBegun) {
      hasBegun = true;
      const firstPacket: ChatStreamFirstPacketSchema = { model: json.model };
      text = JSON.stringify(firstPacket) + text;
    }

    // [LocalAI] workaround: LocalAI doesn't send the [DONE] event, but similarly to OpenAI, it sends a "finish_reason" delta update
    const close = !!json.choices[0].finish_reason;
    return { text, close };
  };
}


// Event Stream Transformers

/**
 * Creates a parser for a 'JSON\n' non-event stream, to be swapped with an EventSource parser.
 * Ollama is the only vendor that uses this format.
 */
function createDemuxerJsonNewline(onParse: EventSourceParseCallback): EventSourceParser {
  let accumulator: string = '';
  return {
    // feeds a new chunk to the parser - we accumulate in case of partial data, and only execute on full lines
    feed: (chunk: string): void => {
      accumulator += chunk;
      if (accumulator.endsWith('\n')) {
        for (const jsonString of accumulator.split('\n').filter(line => !!line)) {
          const mimicEvent: ParsedEvent = {
            type: 'event',
            id: undefined,
            event: undefined,
            data: jsonString,
          };
          onParse(mimicEvent);
        }
        accumulator = '';
      }
    },

    // resets the parser state - not useful with our driving of the parser
    reset: (): void => {
      console.error('createDemuxerJsonNewline.reset() not implemented');
    },
  };
}

/**
 * Creates a TransformStream that parses events from an EventSource stream using a custom parser.
 * @returns {TransformStream<Uint8Array, string>} TransformStream parsing events.
 */
function createEventStreamTransformer(vendorTextParser: AIStreamParser, inputFormat: EventStreamFormat, dialectLabel: string): TransformStream<Uint8Array, Uint8Array> {
function createEventStreamTransformer(muxingFormat: MuxingFormat, vendorTextParser: AIStreamParser, dialectLabel: string): TransformStream<Uint8Array, Uint8Array> {
  const textDecoder = new TextDecoder();
  const textEncoder = new TextEncoder();
  let eventSourceParser: EventSourceParser;
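The demuxer's accumulate-until-newline behavior can be exercised standalone. This sketch (the `makeJsonNlFeeder` helper is hypothetical, not part of the repo) mirrors the `feed` logic above: partial lines are buffered, and only complete newline-terminated lines are emitted as events:

```typescript
// Minimal re-implementation of the accumulate/split logic used by createDemuxerJsonNewline
function makeJsonNlFeeder(onData: (jsonString: string) => void) {
  let accumulator = '';
  return (chunk: string): void => {
    accumulator += chunk;
    if (accumulator.endsWith('\n')) {
      // emit one event per non-empty line, then reset the buffer
      for (const line of accumulator.split('\n').filter(l => !!l))
        onData(line);
      accumulator = '';
    }
  };
}

const seen: string[] = [];
const feed = makeJsonNlFeeder(d => seen.push(d));
feed('{"response":"He');            // partial line: buffered, nothing emitted
feed('llo"}\n{"done":true}\n');     // completes two lines: both emitted
// seen is now ['{"response":"Hello"}', '{"done":true}']
```

Note the same caveat applies as in the original: a chunk that ends mid-line is held until a later chunk supplies the trailing newline.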
@@ -248,16 +219,17 @@ function createEventStreamTransformer(vendorTextParser: AIStreamParser, inputFor
        if (close)
          controller.terminate();
      } catch (error: any) {
        // console.log(`/api/llms/stream: parse issue: ${error?.message || error}`);
        controller.enqueue(textEncoder.encode(`[Stream Issue] ${dialectLabel}: ${safeErrorString(error) || 'Unknown stream parsing error'}`));
        if (SERVER_DEBUG_WIRE)
          console.log(' - E: parse issue:', event.data, error?.message || error);
        controller.enqueue(textEncoder.encode(` **[Stream Issue] ${dialectLabel}: ${safeErrorString(error) || 'Unknown stream parsing error'}**`));
        controller.terminate();
      }
    };

    if (inputFormat === 'sse')
    if (muxingFormat === 'sse')
      eventSourceParser = createEventsourceParser(onNewEvent);
    else if (inputFormat === 'json-nl')
      eventSourceParser = createJsonNewlineParser(onNewEvent);
    else if (muxingFormat === 'json-nl')
      eventSourceParser = createDemuxerJsonNewline(onNewEvent);
  },

  // stream=true is set because the data is not guaranteed to be final and un-chunked
@@ -267,33 +239,142 @@ function createEventStreamTransformer(vendorTextParser: AIStreamParser, inputFor
  });
}

/**
 * Creates a parser for a 'JSON\n' non-event stream, to be swapped with an EventSource parser.
 * Ollama is the only vendor that uses this format.
 */
function createJsonNewlineParser(onParse: EventSourceParseCallback): EventSourceParser {
  let accumulator: string = '';
  return {
    // feeds a new chunk to the parser - we accumulate in case of partial data, and only execute on full lines
    feed: (chunk: string): void => {
      accumulator += chunk;
      if (accumulator.endsWith('\n')) {
        for (const jsonString of accumulator.split('\n').filter(line => !!line)) {
          const mimicEvent: ParsedEvent = {
            type: 'event',
            id: undefined,
            event: undefined,
            data: jsonString,
          };
          onParse(mimicEvent);
        }
        accumulator = '';
      }
    },

    // resets the parser state - not useful with our driving of the parser
    reset: (): void => {
      console.error('createJsonNewlineParser.reset() not implemented');
    },
/// Stream Parsers

function createStreamParserAnthropic(): AIStreamParser {
  let hasBegun = false;

  return (data: string) => {

    const json: AnthropicWire.Complete.Response = JSON.parse(data);
    let text = json.completion;

    // hack: prepend the model name to the first packet
    if (!hasBegun) {
      hasBegun = true;
      const firstPacket: ChatStreamingFirstOutputPacketSchema = { model: json.model };
      text = JSON.stringify(firstPacket) + text;
    }

    return { text, close: false };
  };
}

function createStreamParserGemini(modelName: string): AIStreamParser {
  let hasBegun = false;

  // this can throw; it's caught upstream
  return (data: string) => {

    // parse the JSON chunk
    const wireGenerationChunk = JSON.parse(data);
    const generationChunk = geminiGeneratedContentResponseSchema.parse(wireGenerationChunk);

    // Prompt Safety Errors: pass through errors from Gemini
    if (generationChunk.promptFeedback?.blockReason) {
      const { blockReason, safetyRatings } = generationChunk.promptFeedback;
      return { text: `[Gemini Prompt Blocked] ${blockReason}: ${JSON.stringify(safetyRatings || 'Unknown Safety Ratings', null, 2)}`, close: true };
    }

    // expect a single completion
    const singleCandidate = generationChunk.candidates?.[0] ?? null;
    if (!singleCandidate || !singleCandidate.content?.parts.length)
      throw new Error(`Gemini: expected 1 completion, got ${generationChunk.candidates?.length}`);

    // expect a single part
    if (singleCandidate.content.parts.length !== 1 || !('text' in singleCandidate.content.parts[0]))
      throw new Error(`Gemini: expected 1 text part, got ${singleCandidate.content.parts.length}`);

    // expect a single text in the part
    let text = singleCandidate.content.parts[0].text || '';

    // hack: prepend the model name to the first packet
    if (!hasBegun) {
      hasBegun = true;
      const firstPacket: ChatStreamingFirstOutputPacketSchema = { model: modelName };
      text = JSON.stringify(firstPacket) + text;
    }

    return { text, close: false };
  };
}

function createStreamParserOllama(): AIStreamParser {
  let hasBegun = false;

  return (data: string) => {

    // parse the JSON chunk
    let wireJsonChunk: any;
    try {
      wireJsonChunk = JSON.parse(data);
    } catch (error: any) {
      // log the malformed data to the console, and rethrow to transmit as 'error'
      console.log(`/api/llms/stream: Ollama parsing issue: ${error?.message || error}`, data);
      throw error;
    }

    // validate chunk
    const chunk = wireOllamaChunkedOutputSchema.parse(wireJsonChunk);

    // pass through errors from Ollama
    if ('error' in chunk)
      throw new Error(chunk.error);

    // process output
    let text = chunk.message?.content || /*chunk.response ||*/ '';

    // hack: prepend the model name to the first packet
    if (!hasBegun && chunk.model) {
      hasBegun = true;
      const firstPacket: ChatStreamingFirstOutputPacketSchema = { model: chunk.model };
      text = JSON.stringify(firstPacket) + text;
    }

    return { text, close: chunk.done };
  };
}

function createStreamParserOpenAI(): AIStreamParser {
  let hasBegun = false;
  let hasWarned = false;

  return (data: string) => {

    const json: OpenAIWire.ChatCompletion.ResponseStreamingChunk = JSON.parse(data);

    // [OpenAI] an upstream error will be handled gracefully and transmitted as text (throw to transmit as 'error')
    if (json.error)
      return { text: `[OpenAI Issue] ${safeErrorString(json.error)}`, close: true };

    // [OpenAI] if there's a warning, log it once
    if (json.warning && !hasWarned) {
      hasWarned = true;
      console.log('/api/llms/stream: OpenAI upstream warning:', json.warning);
    }

    if (json.choices.length !== 1) {
      // [Azure] we seem to receive 'prompt_annotations' or 'prompt_filter_results' objects - which we will ignore to suppress the error
      if (json.id === '' && json.object === '' && json.model === '')
        return { text: '', close: false };
      throw new Error(`Expected 1 completion, got ${json.choices.length}`);
    }

    const index = json.choices[0].index;
    if (index !== 0 && index !== undefined /* LocalAI hack/workaround until https://github.com/go-skynet/LocalAI/issues/788 */)
      throw new Error(`Expected completion index 0, got ${index}`);
    let text = json.choices[0].delta?.content /*|| json.choices[0]?.text*/ || '';

    // hack: prepend the model name to the first packet
    if (!hasBegun) {
      hasBegun = true;
      const firstPacket: ChatStreamingFirstOutputPacketSchema = { model: json.model };
      text = JSON.stringify(firstPacket) + text;
    }

    // [LocalAI] workaround: LocalAI doesn't send the [DONE] event, but similarly to OpenAI, it sends a "finish_reason" delta update
    const close = !!json.choices[0].finish_reason;
    return { text, close };
  };
}
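Every stream parser above prepends a one-off `{"model":"..."}` JSON packet to the first text chunk, so a client consuming the relayed stream has to peel that packet off before rendering. A minimal sketch of that client-side step (the `splitFirstPacket` helper is hypothetical, not part of the repo, and assumes the model id contains no `}` character):

```typescript
// Sketch: split the leading {"model":"..."} first packet from the first streamed chunk.
// Assumption: the model id contains no '}' so the first '}' closes the packet.
function splitFirstPacket(firstChunk: string): { model: string | null, text: string } {
  if (firstChunk.startsWith('{')) {
    const end = firstChunk.indexOf('}');
    if (end >= 0) {
      try {
        const packet = JSON.parse(firstChunk.slice(0, end + 1));
        if (typeof packet.model === 'string')
          return { model: packet.model, text: firstChunk.slice(end + 1) };
      } catch { /* not a first packet: fall through and treat it all as text */ }
    }
  }
  return { model: null, text: firstChunk };
}

const { model, text } = splitFirstPacket('{"model":"gpt-4"}Hello there');
// model === 'gpt-4', text === 'Hello there'
```

Later chunks carry plain text only, so this peeling is needed exactly once per stream.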
+11 -1
@@ -1,11 +1,18 @@
import { z } from 'zod';
import { LLM_IF_OAI_Chat, LLM_IF_OAI_Complete, LLM_IF_OAI_Fn, LLM_IF_OAI_Vision } from '../../store-llms';
import { LLM_IF_OAI_Chat, LLM_IF_OAI_Complete, LLM_IF_OAI_Fn, LLM_IF_OAI_Vision } from '../store-llms';


// Model Description: a superset of LLM model descriptors

const pricingSchema = z.object({
  cpmPrompt: z.number().optional(), // Cost per thousand prompt tokens
  cpmCompletion: z.number().optional(), // Cost per thousand completion tokens
});

// const rateLimitsSchema = z.object({
//   reqPerMinute: z.number().optional(),
// });

const modelDescriptionSchema = z.object({
  id: z.string(),
  label: z.string(),
@@ -15,9 +22,12 @@ const modelDescriptionSchema = z.object({
  contextWindow: z.number(),
  maxCompletionTokens: z.number().optional(),
  pricing: pricingSchema.optional(),
  // rateLimits: rateLimitsSchema.optional(),
  interfaces: z.array(z.enum([LLM_IF_OAI_Chat, LLM_IF_OAI_Fn, LLM_IF_OAI_Complete, LLM_IF_OAI_Vision])),
  hidden: z.boolean().optional(),
});

// this is also used by the Client
export type ModelDescriptionSchema = z.infer<typeof modelDescriptionSchema>;

export const listModelsOutputSchema = z.object({
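The `pricing` fields are costs per thousand tokens ("cpm" as in cost per mille). As an illustrative sketch of how a caller could use them (the `estimateCost` helper and the price values below are hypothetical, not from the repo):

```typescript
// Sketch: estimate a request's cost from the optional cpm pricing fields
// (cpmPrompt/cpmCompletion are costs per 1000 tokens; either may be undefined)
interface Pricing { cpmPrompt?: number; cpmCompletion?: number; }

function estimateCost(pricing: Pricing, promptTokens: number, completionTokens: number): number {
  return (pricing.cpmPrompt ?? 0) * promptTokens / 1000
    + (pricing.cpmCompletion ?? 0) * completionTokens / 1000;
}

const cost = estimateCost({ cpmPrompt: 0.03, cpmCompletion: 0.06 }, 1000, 500);
// cost ≈ 0.06 (0.03 for the prompt + 0.03 for the completion)
```

Missing prices default to zero here, matching the fields being optional in the schema.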
+56 -48
@@ -3,54 +3,62 @@
 * descriptions for the models.
 * (nor does it reliably provide context window sizes) - TODO: open a bug upstream
 *
 * from: https://ollama.ai/library?sort=popular
 * from: https://ollama.ai/library?sort=featured
 */
export const OLLAMA_BASE_MODELS: { [key: string]: { description: string, pulls: number, added?: string } } = {
  'mistral': { description: 'The Mistral 7B model released by Mistral AI', pulls: 56100 },
  'llama2': { description: 'The most popular model for general use.', pulls: 117400 },
  'codellama': { description: 'A large language model that can use text prompts to generate and discuss code.', pulls: 61500 },
  'llama2-uncensored': { description: 'Uncensored Llama 2 model by George Sung and Jarrad Hope.', pulls: 26800 },
  'orca-mini': { description: 'A general-purpose model ranging from 3 billion parameters to 70 billion, suitable for entry-level hardware.', pulls: 23000 },
  'vicuna': { description: 'General use chat model based on Llama and Llama 2 with 2K to 16K context sizes.', pulls: 20600 },
  'wizard-vicuna-uncensored': { description: 'Wizard Vicuna Uncensored is a 7B, 13B, and 30B parameter model based on Llama 2 uncensored by Eric Hartford.', pulls: 12100 },
  'phind-codellama': { description: 'Code generation model based on CodeLlama.', pulls: 9760 },
  'wizardcoder': { description: 'Llama based code generation model focused on Python.', pulls: 9002 },
  'mistral-openorca': { description: 'Mistral OpenOrca is a 7 billion parameter model, fine-tuned on top of the Mistral 7B model using the OpenOrca dataset.', pulls: 8671 },
  'nous-hermes': { description: 'General use models based on Llama and Llama 2 from Nous Research.', pulls: 8478 },
  'zephyr': { description: 'Zephyr beta is a fine-tuned 7B version of mistral that was trained on a mix of publicly available, synthetic datasets.', pulls: 8142 },
  'wizard-math': { description: 'Model focused on math and logic problems', pulls: 7426 },
  'llama2-chinese': { description: 'Llama 2 based model fine tuned to improve Chinese dialogue ability.', pulls: 7035 },
  'stable-beluga': { description: 'Llama 2 based model fine tuned on an Orca-style dataset. Originally called Free Willy.', pulls: 6140 },
  'falcon': { description: 'A large language model built by the Technology Innovation Institute (TII) for use in summarization, text generation, and chat bots.', pulls: 5865 },
  'codeup': { description: 'Great code generation model based on Llama2.', pulls: 5534 },
  'everythinglm': { description: 'Uncensored Llama2 based model with 16k context size.', pulls: 4696 },
  'medllama2': { description: 'Fine-tuned Llama 2 model to answer medical questions based on an open source medical dataset.', pulls: 4275 },
  'wizardlm-uncensored': { description: 'Uncensored version of Wizard LM model.', pulls: 4227 },
  'deepseek-coder': { description: 'DeepSeek Coder is trained from scratch on both 87% code and 13% natural language in English and Chinese. Each of the models are pre-trained on 2 trillion tokens.', pulls: 3663, added: '20231129' },
  'wizard-vicuna': { description: 'Wizard Vicuna is a 13B parameter model based on Llama 2 trained by MelodysDreamj.', pulls: 3343 },
  'orca2': { description: 'Orca 2 is built by Microsoft research, and is a fine-tuned version of Meta\'s Llama 2 models. The model is designed to excel particularly in reasoning.', pulls: 3134, added: '20231129' },
  'open-orca-platypus2': { description: 'Merge of the Open Orca OpenChat model and the Garage-bAInd Platypus 2 model. Designed for chat and code generation.', pulls: 3050 },
  'starcoder': { description: 'StarCoder is a code generation model trained on 80+ programming languages.', pulls: 2981 },
  'dolphin2.2-mistral': { description: 'An instruct-tuned model based on Mistral. Version 2.2 is fine-tuned for improved conversation and empathy.', pulls: 2636 },
  'yarn-mistral': { description: 'An extension of Mistral to support a context of up to 128k tokens.', pulls: 2328 },
  'openchat': { description: 'A family of open-source models trained on a wide variety of data, surpassing ChatGPT on various benchmarks.', pulls: 2281, added: '20231129' },
  'openhermes2.5-mistral': { description: 'OpenHermes 2.5 Mistral 7B is a Mistral 7B fine-tune, a continuation of OpenHermes 2 model, which trained on additional code datasets.', pulls: 2101 },
  'yi': { description: 'A high-performing, bilingual base model.', pulls: 1806 },
  'samantha-mistral': { description: 'A companion assistant trained in philosophy, psychology, and personal relationships. Based on Mistral.', pulls: 1803 },
  'yarn-llama2': { description: 'An extension of Llama 2 that supports a context of up to 128k tokens.', pulls: 1605 },
  'sqlcoder': { description: 'SQLCoder is a code completion model fine-tuned on StarCoder for SQL generation tasks.', pulls: 1584 },
  'openhermes2-mistral': { description: 'OpenHermes 2 Mistral is a 7B model fine-tuned on Mistral with 900,000 entries of primarily GPT-4 generated data from open datasets.', pulls: 1560 },
  'neural-chat': { description: 'A fine-tuned model based on Mistral with good coverage of domain and language.', pulls: 1338, added: '20231129' },
  'wizardlm': { description: 'General use 70 billion parameter model based on Llama 2.', pulls: 1253 },
  'dolphin2.1-mistral': { description: 'An instruct-tuned model based on Mistral and trained on a dataset filtered to remove alignment and bias.', pulls: 1163 },
  'mistrallite': { description: 'MistralLite is a fine-tuned model based on Mistral with enhanced capabilities of processing long contexts.', pulls: 1099 },
  'codebooga': { description: 'A high-performing code instruct model created by merging two existing code models.', pulls: 1042 },
  'goliath': { description: 'A language model created by combining two fine-tuned Llama 2 70B models into one.', pulls: 728, added: '20231129' },
  'xwinlm': { description: 'Conversational model based on Llama 2 that performs competitively on various benchmarks.', pulls: 593 },
  'nexusraven': { description: 'Nexus Raven is a 13B instruction tuned model for function calling tasks.', pulls: 585 },
  'alfred': { description: 'A robust conversational model designed to be used for both chat and instruct use cases.', pulls: 573, added: '20231129' },
  'starling-lm': { description: 'Starling is a large language model trained by reinforcement learning from AI feedback focused on improving chatbot helpfulness.', pulls: 446, added: '20231129' },
  'meditron': { description: 'Open-source medical large language model adapted from Llama 2 to the medical domain.', pulls: 100, added: '20231129' },
  'deepseek-llm': { description: 'An advanced language model crafted with 2 trillion bilingual tokens.', pulls: 11, added: '20231129' },
  'llama2': { description: 'The most popular model for general use.', pulls: 165600 },
  'mistral': { description: 'The 7B model released by Mistral AI, updated to version 0.2', pulls: 92200 },
  'llava': { description: '🌋 A novel end-to-end trained large multimodal model that combines a vision encoder and Vicuna for general-purpose visual and language understanding.', pulls: 3563, added: '20231215' },
  'mixtral': { description: 'A high-quality Mixture of Experts (MoE) model with open weights by Mistral AI.', pulls: 8277, added: '20231215' },
  'starling-lm': { description: 'Starling is a large language model trained by reinforcement learning from AI feedback focused on improving chatbot helpfulness.', pulls: 3657, added: '20231129' },
  'neural-chat': { description: 'A fine-tuned model based on Mistral with good coverage of domain and language.', pulls: 4647, added: '20231129' },
  'codellama': { description: 'A large language model that can use text prompts to generate and discuss code.', pulls: 79800 },
  'dolphin-mixtral': { description: 'An uncensored, fine-tuned model based on the Mixtral mixture of experts model that excels at coding tasks. Created by Eric Hartford.', pulls: 48400, added: '20231215' },
  'llama2-uncensored': { description: 'Uncensored Llama 2 model by George Sung and Jarrad Hope.', pulls: 36600 },
  'orca-mini': { description: 'A general-purpose model ranging from 3 billion parameters to 70 billion, suitable for entry-level hardware.', pulls: 30000 },
  'vicuna': { description: 'General use chat model based on Llama and Llama 2 with 2K to 16K context sizes.', pulls: 22700 },
  'wizard-vicuna-uncensored': { description: 'Wizard Vicuna Uncensored is a 7B, 13B, and 30B parameter model based on Llama 2 uncensored by Eric Hartford.', pulls: 15300 },
  'zephyr': { description: 'Zephyr beta is a fine-tuned 7B version of mistral that was trained on a mix of publicly available, synthetic datasets.', pulls: 11500 },
  'phind-codellama': { description: 'Code generation model based on CodeLlama.', pulls: 11200 },
  'wizardcoder': { description: 'Llama based code generation model focused on Python.', pulls: 10700 },
  'deepseek-coder': { description: 'DeepSeek Coder is trained from scratch on both 87% code and 13% natural language in English and Chinese. Each of the models are pre-trained on 2 trillion tokens.', pulls: 10200 },
|
||||
'mistral-openorca': { description: 'Mistral OpenOrca is a 7 billion parameter model, fine-tuned on top of the Mistral 7B model using the OpenOrca dataset.', pulls: 9842 },
|
||||
'nous-hermes': { description: 'General use models based on Llama and Llama 2 from Nous Research.', pulls: 9071 },
|
||||
'wizard-math': { description: 'Model focused on math and logic problems', pulls: 8328 },
|
||||
'llama2-chinese': { description: 'Llama 2 based model fine tuned to improve Chinese dialogue ability.', pulls: 8111 },
|
||||
'orca2': { description: 'Orca 2 is built by Microsoft research, and are a fine-tuned version of Meta\'s Llama 2 models. The model is designed to excel particularly in reasoning.', pulls: 7492, added: '20231129' },
|
||||
'falcon': { description: 'A large language model built by the Technology Innovation Institute (TII) for use in summarization, text generation, and chat bots.', pulls: 7468 },
|
||||
'stable-beluga': { description: 'Llama 2 based model fine tuned on an Orca-style dataset. Originally called Free Willy.', pulls: 6468 },
|
||||
'codeup': { description: 'Great code generation model based on Llama2.', pulls: 6397 },
|
||||
'everythinglm': { description: 'Uncensored Llama2 based model with 16k context size.', pulls: 5347 },
|
||||
'medllama2': { description: 'Fine-tuned Llama 2 model to answer medical questions based on an open source medical dataset.', pulls: 5034 },
|
||||
'wizardlm-uncensored': { description: 'Uncensored version of Wizard LM model.', pulls: 4874 },
|
||||
'dolphin2.2-mistral': { description: 'An instruct-tuned model based on Mistral. Version 2.2 is fine-tuned for improved conversation and empathy.', pulls: 4686 },
|
||||
'openchat': { description: 'A family of open-source models trained on a wide variety of data, surpassing ChatGPT on various benchmarks. Updated to version 3.5-1210.', pulls: 4496, added: '20231129' },
|
||||
'starcoder': { description: 'StarCoder is a code generation model trained on 80+ programming languages.', pulls: 4331 },
|
||||
'openhermes2.5-mistral': { description: 'OpenHermes 2.5 Mistral 7B is a Mistral 7B fine-tune, a continuation of OpenHermes 2 model, which trained on additional code datasets.', pulls: 3722 },
|
||||
'wizard-vicuna': { description: 'Wizard Vicuna is a 13B parameter model based on Llama 2 trained by MelodysDreamj.', pulls: 3668 },
|
||||
'yi': { description: 'A high-performing, bilingual base model.', pulls: 3335 },
|
||||
'open-orca-platypus2': { description: 'Merge of the Open Orca OpenChat model and the Garage-bAInd Platypus 2 model. Designed for chat and code generation.', pulls: 3219 },
|
||||
'yarn-mistral': { description: 'An extension of Mistral to support a context of up to 128k tokens.', pulls: 3087 },
|
||||
'samantha-mistral': { description: 'A companion assistant trained in philosophy, psychology, and personal relationships. Based on Mistral.', pulls: 2518 },
|
||||
'sqlcoder': { description: 'SQLCoder is a code completion model fined-tuned on StarCoder for SQL generation tasks', pulls: 2338 },
|
||||
'meditron': { description: 'Open-source medical large language model adapted from Llama 2 to the medical domain.', pulls: 2216, added: '20231129' },
|
||||
'yarn-llama2': { description: 'An extension of Llama 2 that supports a context of up to 128k tokens.', pulls: 2201 },
|
||||
'stablelm-zephyr': { description: 'A lightweight chat model allowing accurate, and responsive output without requiring high-end hardware.', pulls: 1983, added: '20231210' },
|
||||
'openhermes2-mistral': { description: 'OpenHermes 2 Mistral is a 7B model fine-tuned on Mistral with 900,000 entries of primarily GPT-4 generated data from open datasets.', pulls: 1790 },
|
||||
'deepseek-llm': { description: 'An advanced language model crafted with 2 trillion bilingual tokens.', pulls: 1732, added: '20231129' },
|
||||
'dolphin2.1-mistral': { description: 'An instruct-tuned model based on Mistral and trained on a dataset filtered to remove alignment and bias.', pulls: 1598 },
|
||||
'mistrallite': { description: 'MistralLite is a fine-tuned model based on Mistral with enhanced capabilities of processing long contexts.', pulls: 1534 },
|
||||
'wizardlm': { description: 'General use 70 billion parameter model based on Llama 2.', pulls: 1454 },
|
||||
'codebooga': { description: 'A high-performing code instruct model created by merging two existing code models.', pulls: 1418 },
|
||||
'phi': { description: 'Phi-2: a 2.7B language model by Microsoft Research that demonstrates outstanding reasoning and language understanding capabilities.', pulls: 1304, added: '20231220' },
|
||||
'bakllava': { description: 'BakLLaVA is a multimodal model consisting of the Mistral 7B base model augmented with the LLaVA architecture.', pulls: 1189, added: '20231215' },
|
||||
'goliath': { description: 'A language model created by combining two fine-tuned Llama 2 70B models into one.', pulls: 1140, added: '20231129' },
|
||||
'nexusraven': { description: 'Nexus Raven is a 13B instruction tuned model for function calling tasks.', pulls: 1060 },
|
||||
'solar': { description: 'A compact, yet powerful 10.7B large language model designed for single-turn conversation.', pulls: 934 },
|
||||
'alfred': { description: 'A robust conversational model designed to be used for both chat and instruct use cases.', pulls: 902, added: '20231129' },
|
||||
'xwinlm': { description: 'Conversational model based on Llama 2 that performs competitively on various benchmarks.', pulls: 868 },
|
||||
};
|
||||
export const OLLAMA_LAST_UPDATE: string = '20231129';
|
||||
// export const OLLAMA_LAST_UPDATE: string = '20231220';
|
||||
export const OLLAMA_PREV_UPDATE: string = '20231210';
|
||||
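The `added` stamps and the two update constants above drive the "new model" badge downstream, where the router computes `isNew: !!model.added && model.added >= OLLAMA_PREV_UPDATE`. Because the stamps are 'YYYYMMDD' strings, plain lexicographic comparison is enough. A minimal sketch (the local `PREV_UPDATE` constant and `isNewModel` helper are illustrative stand-ins, not part of the codebase):

```typescript
// 'YYYYMMDD' date stamps compare correctly with plain string ordering,
// so no Date parsing is needed for the isNew cutoff.
const PREV_UPDATE = '20231210'; // mirrors OLLAMA_PREV_UPDATE above

function isNewModel(added?: string): boolean {
  // models without an 'added' stamp are never flagged as new
  return !!added && added >= PREV_UPDATE;
}

const flags = [
  isNewModel('20231215'), // e.g. llava, mixtral -> new
  isNewModel('20231129'), // e.g. goliath, alfred -> not new
  isNewModel(undefined),  // no stamp -> not new
];
```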
+47 -12
@@ -1,22 +1,26 @@
import { z } from 'zod';
import { TRPCError } from '@trpc/server';

import { createTRPCRouter, publicProcedure } from '~/server/api/trpc.server';
import { env } from '~/server/env.mjs';
import { fetchJsonOrTRPCError, fetchTextOrTRPCError } from '~/server/api/trpc.serverutils';

import { LLM_IF_OAI_Chat } from '../../../store-llms';
import { LLM_IF_OAI_Chat } from '../../store-llms';

import { capitalizeFirstLetter } from '~/common/util/textUtils';

import { fixupHost, openAIChatGenerateOutputSchema, OpenAIHistorySchema, openAIHistorySchema, OpenAIModelSchema, openAIModelSchema } from '../openai/openai.router';
import { listModelsOutputSchema, ModelDescriptionSchema } from '../server.schemas';
import { listModelsOutputSchema, ModelDescriptionSchema } from '../llm.server.types';

import { OLLAMA_BASE_MODELS, OLLAMA_LAST_UPDATE } from './ollama.models';
import { wireOllamaGenerationSchema } from './ollama.wiretypes';
import { OLLAMA_BASE_MODELS, OLLAMA_PREV_UPDATE } from './ollama.models';
import { WireOllamaChatCompletionInput, wireOllamaChunkedOutputSchema } from './ollama.wiretypes';


// Default hosts
const DEFAULT_OLLAMA_HOST = 'http://127.0.0.1:11434';
export const OLLAMA_PATH_CHAT = '/api/chat';
const OLLAMA_PATH_TAGS = '/api/tags';
const OLLAMA_PATH_SHOW = '/api/show';


// Mappers
@@ -34,7 +38,23 @@ export function ollamaAccess(access: OllamaAccessSchema, apiPath: string): { hea

}

export function ollamaChatCompletionPayload(model: OpenAIModelSchema, history: OpenAIHistorySchema, stream: boolean) {

export const ollamaChatCompletionPayload = (model: OpenAIModelSchema, history: OpenAIHistorySchema, stream: boolean): WireOllamaChatCompletionInput => ({
model: model.id,
messages: history,
options: {
...(model.temperature && { temperature: model.temperature }),
},
// n: ...
// functions: ...
// function_call: ...
stream,
});
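The arrow-function payload builder above maps an OpenAI-style (model, history) pair onto Ollama's `/api/chat` request body. A standalone sketch of the same shape, with simplified local types (`DemoModel` and `DemoHistory` are stand-ins for the zod-inferred schemas, not part of the codebase):

```typescript
// Local stand-ins for the zod-inferred OpenAIModelSchema / OpenAIHistorySchema types.
type DemoModel = { id: string; temperature?: number };
type DemoHistory = { role: 'assistant' | 'system' | 'user'; content: string }[];

// Mirrors the shape produced by ollamaChatCompletionPayload.
const demoPayload = (model: DemoModel, history: DemoHistory, stream: boolean) => ({
  model: model.id,
  messages: history,
  options: {
    // spread temperature into options only when it is set (and truthy)
    ...(model.temperature && { temperature: model.temperature }),
  },
  stream,
});

const p = demoPayload({ id: 'llama2', temperature: 0.7 }, [{ role: 'user', content: 'Hi' }], false);
```

Note the `model.temperature &&` guard: a temperature of 0 is falsy, so it is dropped from `options` just like `undefined` — a quirk worth keeping in mind if a zero temperature should ever be expressible.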
/* Unused: switched to the Chat endpoint (above). The implementation is left here for reference.
https://github.com/jmorganca/ollama/blob/main/docs/api.md#generate-a-completion
export function ollamaCompletionPayload(model: OpenAIModelSchema, history: OpenAIHistorySchema, stream: boolean) {

// if the first message is the system prompt, extract it
let systemPrompt: string | undefined = undefined;
@@ -62,7 +82,7 @@ export function ollamaChatCompletionPayload(model: OpenAIModelSchema, history: O
...(systemPrompt && { system: systemPrompt }),
stream,
};
}
}*/

async function ollamaGET<TOut extends object>(access: OllamaAccessSchema, apiPath: string /*, signal?: AbortSignal*/): Promise<TOut> {
const { headers, url } = ollamaAccess(access, apiPath);
@@ -104,6 +124,7 @@ const listPullableOutputSchema = z.object({
label: z.string(),
tag: z.string(),
description: z.string(),
pulls: z.number(),
isNew: z.boolean(),
})),
});
@@ -122,7 +143,8 @@ export const llmOllamaRouter = createTRPCRouter({
label: capitalizeFirstLetter(model_id),
tag: 'latest',
description: model.description,
isNew: !!model.added && model.added >= OLLAMA_LAST_UPDATE,
pulls: model.pulls,
isNew: !!model.added && model.added >= OLLAMA_PREV_UPDATE,
})),
};
}),
@@ -160,6 +182,7 @@ export const llmOllamaRouter = createTRPCRouter({
throw new Error('Ollama delete issue: ' + deleteOutput);
}),

/* Ollama: List the Models available */
listModels: publicProcedure
.input(accessOnlySchema)
@@ -167,7 +190,7 @@ export const llmOllamaRouter = createTRPCRouter({
.query(async ({ input }) => {

// get the models
const wireModels = await ollamaGET(input.access, '/api/tags');
const wireModels = await ollamaGET(input.access, OLLAMA_PATH_TAGS);
const wireOllamaListModelsSchema = z.object({
models: z.array(z.object({
name: z.string(),
@@ -180,7 +203,7 @@ export const llmOllamaRouter = createTRPCRouter({

// retrieve info for each of the models (/api/show, post call, in parallel)
const detailedModels = await Promise.all(models.map(async model => {
const wireModelInfo = await ollamaPOST(input.access, { 'name': model.name }, '/api/show');
const wireModelInfo = await ollamaPOST(input.access, { 'name': model.name }, OLLAMA_PATH_SHOW);
const wireOllamaModelInfoSchema = z.object({
license: z.string().optional(),
modelfile: z.string(),
@@ -221,12 +244,24 @@ export const llmOllamaRouter = createTRPCRouter({
.output(openAIChatGenerateOutputSchema)
.mutation(async ({ input: { access, history, model } }) => {

const wireGeneration = await ollamaPOST(access, ollamaChatCompletionPayload(model, history, false), '/api/generate');
const generation = wireOllamaGenerationSchema.parse(wireGeneration);
const wireGeneration = await ollamaPOST(access, ollamaChatCompletionPayload(model, history, false), OLLAMA_PATH_CHAT);
const generation = wireOllamaChunkedOutputSchema.parse(wireGeneration);

if ('error' in generation)
throw new TRPCError({
code: 'INTERNAL_SERVER_ERROR',
message: `Ollama chat-generation issue: ${generation.error}`,
});

if (!generation.message?.content)
throw new TRPCError({
code: 'INTERNAL_SERVER_ERROR',
message: `Ollama chat-generation API issue: ${JSON.stringify(wireGeneration)}`,
});

return {
role: 'assistant',
content: generation.response,
content: generation.message.content,
finish_reason: generation.done ? 'stop' : null,
};
}),
@@ -0,0 +1,76 @@
import { z } from 'zod';


/**
 * Chat Completion API - Request
 * https://github.com/jmorganca/ollama/blob/main/docs/api.md#generate-a-chat-completion
 */
const wireOllamaChatCompletionInputSchema = z.object({

// required
model: z.string(),
messages: z.array(z.object({
role: z.enum(['assistant', 'system', 'user']),
content: z.string(),
})),

// optional
format: z.enum(['json']).optional(),
options: z.object({
// https://github.com/jmorganca/ollama/blob/main/docs/modelfile.md
// Maximum number of tokens to predict when generating text.
num_predict: z.number().int().optional(),
// Sets the random number seed to use for generation
seed: z.number().int().optional(),
// The temperature of the model
temperature: z.number().positive().optional(),
// Reduces the probability of generating nonsense (Default: 40)
top_k: z.number().positive().optional(),
// Works together with top-k. A higher value (e.g., 0.95) will lead to more diverse text. (Default 0.9)
top_p: z.number().positive().optional(),
}).optional(),
template: z.string().optional(), // overrides what is defined in the Modelfile
stream: z.boolean().optional(), // default: true

// Future Improvements?
// n: z.number().int().optional(), // number of completions to generate
// functions: ...
// function_call: ...
});
export type WireOllamaChatCompletionInput = z.infer<typeof wireOllamaChatCompletionInputSchema>;


/**
 * Chat Completion or Generation APIs - Streaming Response
 */
export const wireOllamaChunkedOutputSchema = z.union([
// Chat Completion Chunk
z.object({
model: z.string(),
// created_at: z.string(), // commented because unused

// [Chat Completion] (exclusive with 'response')
message: z.object({
role: z.enum(['assistant' /*, 'system', 'user' Disabled on purpose, to validate the response */]),
content: z.string(),
}).optional(), // optional on the last message

// [Generation] (non-chat, exclusive with 'message')
//response: z.string().optional(),

done: z.boolean(),

// only on the last message
// context: z.array(z.number()), // non-chat endpoint
// total_duration: z.number(),
// prompt_eval_count: z.number(),
// prompt_eval_duration: z.number(),
// eval_count: z.number(),
// eval_duration: z.number(),

}),
// Possible Error
z.object({
error: z.string(),
}),
]);
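The union above admits either a content chunk or an `{ error }` object, and the router narrows between the two with `'error' in generation`. A zod-free sketch of the same discrimination (the `OllamaChunk` type and `chunkText` helper are illustrative, not part of the codebase):

```typescript
// A plain type mirroring the union that wireOllamaChunkedOutputSchema validates.
type OllamaChunk =
  | { model: string; message?: { role: 'assistant'; content: string }; done: boolean }
  | { error: string };

// Same narrowing the router performs after parsing a chunk.
function chunkText(chunk: OllamaChunk): string {
  if ('error' in chunk)
    throw new Error(`Ollama chat-generation issue: ${chunk.error}`);
  // 'message' is optional on the final chunk, hence the fallback
  return chunk.message?.content ?? '';
}

const text = chunkText({ model: 'llama2', message: { role: 'assistant', content: 'Hello' }, done: false });
```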
@@ -0,0 +1,33 @@
import { z } from 'zod';


// [Mistral] Models List API - Response

export const wireMistralModelsListOutputSchema = z.object({
id: z.string(),
object: z.literal('model'),
created: z.number(),
owned_by: z.string(),
root: z.null().optional(),
parent: z.null().optional(),
// permission: z.array(wireMistralModelsListPermissionsSchema)
});

// export type WireMistralModelsListOutput = z.infer<typeof wireMistralModelsListOutputSchema>;

/*
const wireMistralModelsListPermissionsSchema = z.object({
id: z.string(),
object: z.literal('model_permission'),
created: z.number(),
allow_create_engine: z.boolean(),
allow_sampling: z.boolean(),
allow_logprobs: z.boolean(),
allow_search_indices: z.boolean(),
allow_view: z.boolean(),
allow_fine_tuning: z.boolean(),
organization: z.string(),
group: z.null().optional(),
is_blocking: z.boolean()
});
*/
+126 -39
@@ -1,5 +1,9 @@
import type { ModelDescriptionSchema } from '../server.schemas';
import { LLM_IF_OAI_Chat, LLM_IF_OAI_Complete, LLM_IF_OAI_Fn, LLM_IF_OAI_Vision } from '../../../store-llms';
import { SERVER_DEBUG_WIRE } from '~/server/wire';

import { LLM_IF_OAI_Chat, LLM_IF_OAI_Complete, LLM_IF_OAI_Fn, LLM_IF_OAI_Vision } from '../../store-llms';

import type { ModelDescriptionSchema } from '../llm.server.types';
import { wireMistralModelsListOutputSchema } from './mistral.wiretypes';


// [Azure] / [OpenAI]
@@ -203,6 +207,63 @@ export function localAIModelToModelDescription(modelId: string): ModelDescriptio
}


// [Mistral]

const _knownMistralChatModels: ManualMappings = [
{
idPrefix: 'mistral-medium',
label: 'Mistral Medium',
description: 'Mistral internal prototype model.',
contextWindow: 32768,
interfaces: [LLM_IF_OAI_Chat],
},
{
idPrefix: 'mistral-small',
label: 'Mistral Small',
description: 'Higher reasoning capabilities and more capabilities (English, French, German, Italian, Spanish, and Code)',
contextWindow: 32768,
interfaces: [LLM_IF_OAI_Chat],
},
{
idPrefix: 'mistral-tiny',
label: 'Mistral Tiny',
description: 'Used for large batch processing tasks where cost is a significant factor but reasoning capabilities are not crucial',
contextWindow: 32768,
interfaces: [LLM_IF_OAI_Chat],
},
{
idPrefix: 'mistral-embed',
label: 'Mistral Embed',
description: 'Mistral Medium on Mistral',
// output: 1024 dimensions
maxCompletionTokens: 1024, // HACK - it's 1024 dimensions, but those are not 'completion tokens'
contextWindow: 32768, // actually unknown, assumed from the other models
interfaces: [],
hidden: true,
},
];

export function mistralModelToModelDescription(_model: unknown): ModelDescriptionSchema {
const model = wireMistralModelsListOutputSchema.parse(_model);
return fromManualMapping(_knownMistralChatModels, model.id, model.created, undefined, {
idPrefix: model.id,
label: model.id.replaceAll(/[_-]/g, ' '),
description: 'New Mistral Model',
contextWindow: 32768,
interfaces: [LLM_IF_OAI_Chat], // assume..
hidden: true,
});
}

export function mistralModelsSort(a: ModelDescriptionSchema, b: ModelDescriptionSchema): number {
if (a.hidden && !b.hidden)
return 1;
if (!a.hidden && b.hidden)
return -1;
return a.id.localeCompare(b.id);
}
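`mistralModelsSort` above pushes hidden entries to the end and otherwise orders alphabetically by id. A minimal sketch of the same comparator over plain objects (`DemoDesc` is an illustrative stand-in for `ModelDescriptionSchema`):

```typescript
// Minimal stand-in for ModelDescriptionSchema, just the fields the sort uses.
type DemoDesc = { id: string; hidden?: boolean };

// Same ordering rules as mistralModelsSort: hidden models sink to the end,
// then sort by id.
function demoSort(a: DemoDesc, b: DemoDesc): number {
  if (a.hidden && !b.hidden) return 1;
  if (!a.hidden && b.hidden) return -1;
  return a.id.localeCompare(b.id);
}

const sorted = [
  { id: 'mistral-embed', hidden: true },
  { id: 'mistral-tiny' },
  { id: 'mistral-medium' },
].sort(demoSort).map(m => m.id);
// sorted: ['mistral-medium', 'mistral-tiny', 'mistral-embed']
```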
|
||||
|
||||
// [Oobabooga]
|
||||
const _knownOobaboogaChatModels: ManualMappings = [];
|
||||
|
||||
@@ -236,8 +297,8 @@ export function oobaboogaModelToModelDescription(modelId: string, created: numbe
|
||||
/**
|
||||
* Created to reflect the doc page: https://openrouter.ai/docs
|
||||
*
|
||||
* Update prompt:
|
||||
* "Please update the typescript object below (do not change the definition, just the object), based on the updated upstream documentation:"
|
||||
* Update prompt (last updated 2023-12-12)
|
||||
* "Please update the following typescript object (do not change the definition, just values, and do not miss any rows), based on the information provided thereafter:"
|
||||
*
|
||||
* fields:
|
||||
* - cw: context window size (max tokens, total)
|
||||
@@ -247,19 +308,24 @@ export function oobaboogaModelToModelDescription(modelId: string, created: numbe
|
||||
*/
|
||||
const orModelMap: { [id: string]: { name: string; cw: number; cp?: number; cc?: number; old?: boolean; unfilt?: boolean; } } = {
|
||||
// 'openrouter/auto': { name: 'Auto (best for prompt)', cw: 128000, cp: undefined, cc: undefined, unfilt: undefined },
|
||||
'mistralai/mistral-7b-instruct': { name: 'Mistral 7B Instruct (beta)', cw: 8192, cp: 0, cc: 0, unfilt: true },
|
||||
'huggingfaceh4/zephyr-7b-beta': { name: 'Hugging Face: Zephyr 7B (beta)', cw: 4096, cp: 0, cc: 0, unfilt: true },
|
||||
'openchat/openchat-7b': { name: 'OpenChat 7B (beta)', cw: 8192, cp: 0, cc: 0, unfilt: true },
|
||||
'undi95/toppy-m-7b': { name: 'Toppy M 7B (beta)', cw: 32768, cp: 0, cc: 0, unfilt: true },
|
||||
'gryphe/mythomist-7b': { name: 'MythoMist 7B (beta)', cw: 32768, cp: 0, cc: 0, unfilt: true },
|
||||
'nousresearch/nous-hermes-llama2-13b': { name: 'Nous: Hermes 13B (beta)', cw: 4096, cp: 0.000155, cc: 0.000155, unfilt: true },
|
||||
'meta-llama/codellama-34b-instruct': { name: 'Meta: CodeLlama 34B Instruct (beta)', cw: 8192, cp: 0.00045, cc: 0.00045, unfilt: true },
|
||||
'phind/phind-codellama-34b': { name: 'Phind: CodeLlama 34B v2 (beta)', cw: 4096, cp: 0.00045, cc: 0.00045, unfilt: true },
|
||||
'intel/neural-chat-7b': { name: 'Neural Chat 7B v3.1 (beta)', cw: 32768, cp: 0.005, cc: 0.005, unfilt: true },
|
||||
'haotian-liu/llava-13b': { name: 'Llava 13B (beta)', cw: 2048, cp: 0.005, cc: 0.005, unfilt: true },
|
||||
'meta-llama/llama-2-13b-chat': { name: 'Meta: Llama v2 13B Chat (beta)', cw: 4096, cp: 0.000234533, cc: 0.000234533, unfilt: true },
|
||||
'alpindale/goliath-120b': { name: 'Goliath 120B (beta)', cw: 6144, cp: 0.00703125, cc: 0.00703125, unfilt: true },
|
||||
'lizpreciatior/lzlv-70b-fp16-hf': { name: 'lzlv 70B (beta)', cw: 4096, cp: 0.000562, cc: 0.000762, unfilt: true },
|
||||
'nousresearch/nous-capybara-7b': { name: 'Nous: Capybara 7B', cw: 4096, cp: 0, cc: 0, unfilt: true },
|
||||
'mistralai/mistral-7b-instruct': { name: 'Mistral 7B Instruct', cw: 8192, cp: 0, cc: 0, unfilt: true },
|
||||
'huggingfaceh4/zephyr-7b-beta': { name: 'Hugging Face: Zephyr 7B', cw: 4096, cp: 0, cc: 0, unfilt: true },
|
||||
'openchat/openchat-7b': { name: 'OpenChat 3.5', cw: 8192, cp: 0, cc: 0, unfilt: true },
|
||||
'gryphe/mythomist-7b': { name: 'MythoMist 7B', cw: 32768, cp: 0, cc: 0, unfilt: true },
|
||||
'openrouter/cinematika-7b': { name: 'Cinematika 7B (alpha)', cw: 32768, cp: 0, cc: 0, unfilt: true },
|
||||
'rwkv/rwkv-5-world-3b': { name: 'RWKV v5 World 3B (beta)', cw: 10000, cp: 0, cc: 0, unfilt: true },
|
||||
'recursal/rwkv-5-3b-ai-town': { name: 'RWKV v5 3B AI Town (beta)', cw: 10000, cp: 0, cc: 0, unfilt: true },
|
||||
'jebcarter/psyfighter-13b': { name: 'Psyfighter 13B', cw: 4096, cp: 0.0001, cc: 0.0001, unfilt: true },
|
||||
'koboldai/psyfighter-13b-2': { name: 'Psyfighter v2 13B', cw: 4096, cp: 0.0001, cc: 0.0001, unfilt: true },
|
||||
'nousresearch/nous-hermes-llama2-13b': { name: 'Nous: Hermes 13B', cw: 4096, cp: 0.000075, cc: 0.000075, unfilt: true },
|
||||
'meta-llama/codellama-34b-instruct': { name: 'Meta: CodeLlama 34B Instruct', cw: 8192, cp: 0.0002, cc: 0.0002, unfilt: true },
|
||||
'phind/phind-codellama-34b': { name: 'Phind: CodeLlama 34B v2', cw: 4096, cp: 0.0002, cc: 0.0002, unfilt: true },
|
||||
'intel/neural-chat-7b': { name: 'Neural Chat 7B v3.1', cw: 4096, cp: 0.0025, cc: 0.0025, unfilt: true },
|
||||
'mistralai/mixtral-8x7b-instruct': { name: 'Mistral: Mixtral 8x7B Instruct (beta)', cw: 32768, cp: 0.0003, cc: 0.0003, unfilt: true },
|
||||
'haotian-liu/llava-13b': { name: 'Llava 13B', cw: 2048, cp: 0.0025, cc: 0.0025, unfilt: true },
|
||||
'nousresearch/nous-hermes-2-vision-7b': { name: 'Nous: Hermes 2 Vision 7B (alpha)', cw: 4096, cp: 0.0025, cc: 0.0025, unfilt: true },
|
||||
'meta-llama/llama-2-13b-chat': { name: 'Meta: Llama v2 13B Chat', cw: 4096, cp: 0.000156755, cc: 0.000156755, unfilt: true },
|
||||
'openai/gpt-3.5-turbo': { name: 'OpenAI: GPT-3.5 Turbo', cw: 4095, cp: 0.001, cc: 0.002, unfilt: false },
|
||||
'openai/gpt-3.5-turbo-1106': { name: 'OpenAI: GPT-3.5 Turbo 16k (preview)', cw: 16385, cp: 0.001, cc: 0.002, unfilt: false },
|
||||
'openai/gpt-3.5-turbo-16k': { name: 'OpenAI: GPT-3.5 Turbo 16k', cw: 16385, cp: 0.003, cc: 0.004, unfilt: false },
|
||||
@@ -268,28 +334,44 @@ const orModelMap: { [id: string]: { name: string; cw: number; cp?: number; cc?:
|
||||
'openai/gpt-4-32k': { name: 'OpenAI: GPT-4 32k', cw: 32767, cp: 0.06, cc: 0.12, unfilt: false },
|
||||
'openai/gpt-4-vision-preview': { name: 'OpenAI: GPT-4 Vision (preview)', cw: 128000, cp: 0.01, cc: 0.03, unfilt: false },
|
||||
'openai/gpt-3.5-turbo-instruct': { name: 'OpenAI: GPT-3.5 Turbo Instruct', cw: 4095, cp: 0.0015, cc: 0.002, unfilt: false },
|
||||
'google/palm-2-chat-bison': { name: 'Google: PaLM 2 Chat', cw: 9216, cp: 0.0005, cc: 0.0005, unfilt: true },
|
||||
'google/palm-2-codechat-bison': { name: 'Google: PaLM 2 Code Chat', cw: 7168, cp: 0.0005, cc: 0.0005, unfilt: true },
|
||||
'google/palm-2-chat-bison-32k': { name: 'Google: PaLM 2 Chat 32k', cw: 32000, cp: 0.0005, cc: 0.0005, unfilt: true },
|
||||
'google/palm-2-codechat-bison-32k': { name: 'Google: PaLM 2 Code Chat 32k', cw: 32000, cp: 0.0005, cc: 0.0005, unfilt: true },
|
||||
'meta-llama/llama-2-70b-chat': { name: 'Meta: Llama v2 70B Chat (beta)', cw: 4096, cp: 0.0007, cc: 0.00095, unfilt: true },
|
||||
'nousresearch/nous-hermes-llama2-70b': { name: 'Nous: Hermes 70B (beta)', cw: 4096, cp: 0.0009, cc: 0.0009, unfilt: true },
|
||||
'nousresearch/nous-capybara-34b': { name: 'Nous: Capybara 34B (beta)', cw: 32000, cp: 0.02, cc: 0.02, unfilt: true },
|
||||
'jondurbin/airoboros-l2-70b': { name: 'Airoboros 70B (beta)', cw: 4096, cp: 0.0007, cc: 0.00095, unfilt: true },
|
||||
'migtissera/synthia-70b': { name: 'Synthia 70B (beta)', cw: 8192, cp: 0.009375, cc: 0.009375, unfilt: true },
|
||||
'open-orca/mistral-7b-openorca': { name: 'Mistral OpenOrca 7B (beta)', cw: 8192, cp: 0.0002, cc: 0.0002, unfilt: true },
|
||||
'teknium/openhermes-2-mistral-7b': { name: 'OpenHermes 2 Mistral 7B (beta)', cw: 4096, cp: 0.0002, cc: 0.0002, unfilt: true },
|
||||
'teknium/openhermes-2.5-mistral-7b': { name: 'OpenHermes 2.5 Mistral 7B (beta)', cw: 4096, cp: 0.0002, cc: 0.0002, unfilt: true },
|
||||
'pygmalionai/mythalion-13b': { name: 'Pygmalion: Mythalion 13B (beta)', cw: 8192, cp: 0.001125, cc: 0.001125, unfilt: true },
|
||||
'undi95/remm-slerp-l2-13b': { name: 'ReMM SLERP 13B (beta)', cw: 6144, cp: 0.001125, cc: 0.001125, unfilt: true },
|
||||
'xwin-lm/xwin-lm-70b': { name: 'Xwin 70B (beta)', cw: 8192, cp: 0.009375, cc: 0.009375, unfilt: true },
|
||||
'gryphe/mythomax-l2-13b-8k': { name: 'MythoMax 13B 8k (beta)', cw: 8192, cp: 0.001125, cc: 0.001125, unfilt: true },
|
||||
'neversleep/noromaid-20b': { name: 'Noromaid 20B (beta)', cw: 8192, cp: 0.00225, cc: 0.00225, unfilt: true },
|
||||
'google/palm-2-chat-bison': { name: 'Google: PaLM 2 Chat', cw: 36864, cp: 0.00025, cc: 0.0005, unfilt: true },
|
||||
'google/palm-2-codechat-bison': { name: 'Google: PaLM 2 Code Chat', cw: 28672, cp: 0.00025, cc: 0.0005, unfilt: true },
|
||||
'google/palm-2-chat-bison-32k': { name: 'Google: PaLM 2 Chat 32k', cw: 131072, cp: 0.00025, cc: 0.0005, unfilt: true },
|
||||
'google/palm-2-codechat-bison-32k': { name: 'Google: PaLM 2 Code Chat 32k', cw: 131072, cp: 0.00025, cc: 0.0005, unfilt: true },
|
||||
'google/gemini-pro': { name: 'Google: Gemini Pro (preview)', cw: 131040, cp: 0.00025, cc: 0.0005, unfilt: true },
|
||||
'google/gemini-pro-vision': { name: 'Google: Gemini Pro Vision (preview)', cw: 65536, cp: 0.00025, cc: 0.0005, unfilt: true },
|
||||
'perplexity/pplx-70b-online': { name: 'Perplexity: PPLX 70B Online', cw: 4096, cp: 0, cc: 0.0028, unfilt: true },
|
||||
'perplexity/pplx-7b-online': { name: 'Perplexity: PPLX 7B Online', cw: 4096, cp: 0, cc: 0.00028, unfilt: true },
|
||||
'perplexity/pplx-7b-chat': { name: 'Perplexity: PPLX 7B Chat', cw: 8192, cp: 0.00007, cc: 0.00028, unfilt: true },
|
||||
'perplexity/pplx-70b-chat': { name: 'Perplexity: PPLX 70B Chat', cw: 4096, cp: 0.0007, cc: 0.0028, unfilt: true },
|
||||
'meta-llama/llama-2-70b-chat': { name: 'Meta: Llama v2 70B Chat', cw: 4096, cp: 0.0007, cc: 0.00095, unfilt: true },
|
||||
'nousresearch/nous-hermes-llama2-70b': { name: 'Nous: Hermes 70B', cw: 4096, cp: 0.0009, cc: 0.0009, unfilt: true },
|
||||
'nousresearch/nous-capybara-34b': { name: 'Nous: Capybara 34B', cw: 32000, cp: 0.0007, cc: 0.0028, unfilt: true },
|
||||
  'jondurbin/airoboros-l2-70b': { name: 'Airoboros 70B', cw: 4096, cp: 0.0007, cc: 0.00095, unfilt: true },
  'migtissera/synthia-70b': { name: 'Synthia 70B', cw: 8192, cp: 0.00375, cc: 0.00375, unfilt: true },
  'open-orca/mistral-7b-openorca': { name: 'Mistral OpenOrca 7B', cw: 8192, cp: 0.0001425006, cc: 0.0001425006, unfilt: true },
  'teknium/openhermes-2-mistral-7b': { name: 'OpenHermes 2 Mistral 7B', cw: 4096, cp: 0.0002, cc: 0.0002, unfilt: true },
  'teknium/openhermes-2.5-mistral-7b': { name: 'OpenHermes 2.5 Mistral 7B', cw: 4096, cp: 0.0002, cc: 0.0002, unfilt: true },
  'pygmalionai/mythalion-13b': { name: 'Pygmalion: Mythalion 13B', cw: 8192, cp: 0.001125, cc: 0.001125, unfilt: true },
  'undi95/remm-slerp-l2-13b': { name: 'ReMM SLERP 13B', cw: 6144, cp: 0.001125, cc: 0.001125, unfilt: true },
  'xwin-lm/xwin-lm-70b': { name: 'Xwin 70B', cw: 8192, cp: 0.00375, cc: 0.00375, unfilt: true },
  'gryphe/mythomax-l2-13b-8k': { name: 'MythoMax 13B 8k', cw: 8192, cp: 0.001125, cc: 0.001125, unfilt: true },
  'undi95/toppy-m-7b': { name: 'Toppy M 7B', cw: 32768, cp: 0.000375, cc: 0.000375, unfilt: true },
  'alpindale/goliath-120b': { name: 'Goliath 120B', cw: 6144, cp: 0.009375, cc: 0.009375, unfilt: true },
  'lizpreciatior/lzlv-70b-fp16-hf': { name: 'lzlv 70B', cw: 4096, cp: 0.0007, cc: 0.00095, unfilt: true },
  'neversleep/noromaid-20b': { name: 'Noromaid 20B', cw: 8192, cp: 0.00225, cc: 0.00225, unfilt: true },
  '01-ai/yi-34b-chat': { name: 'Yi 34B Chat', cw: 4096, cp: 0.0008, cc: 0.0008, unfilt: true },
  '01-ai/yi-34b': { name: 'Yi 34B (base)', cw: 4096, cp: 0.0008, cc: 0.0008, unfilt: true },
  '01-ai/yi-6b': { name: 'Yi 6B (base)', cw: 4096, cp: 0.00014, cc: 0.00014, unfilt: true },
  'togethercomputer/stripedhyena-nous-7b': { name: 'StripedHyena Nous 7B', cw: 32768, cp: 0.0002, cc: 0.0002, unfilt: true },
  'togethercomputer/stripedhyena-hessian-7b': { name: 'StripedHyena Hessian 7B (base)', cw: 32768, cp: 0.0002, cc: 0.0002, unfilt: true },
  'mistralai/mixtral-8x7b': { name: 'Mistral: Mixtral 8x7B (base) (beta)', cw: 32768, cp: 0.0006, cc: 0.0006, unfilt: true },
  'anthropic/claude-2': { name: 'Anthropic: Claude v2.1', cw: 200000, cp: 0.008, cc: 0.024, unfilt: false },
  'anthropic/claude-2.0': { name: 'Anthropic: Claude v2.0', cw: 100000, cp: 0.008, cc: 0.024, unfilt: false },
  'anthropic/claude-instant-v1': { name: 'Anthropic: Claude Instant v1', cw: 100000, cp: 0.00163, cc: 0.00551, unfilt: false },
  'mancer/weaver': { name: 'Mancer: Weaver (alpha)', cw: 8000, cp: 0.0045, cc: 0.0045, unfilt: true },
  'mancer/weaver': { name: 'Mancer: Weaver (alpha)', cw: 8000, cp: 0.003375, cc: 0.003375, unfilt: true },
  'gryphe/mythomax-l2-13b': { name: 'MythoMax 13B', cw: 4096, cp: 0.0006, cc: 0.0006, unfilt: true },
  // Old models (maintained for reference)
  'openai/gpt-3.5-turbo-0301': { name: 'OpenAI: GPT-3.5 Turbo (older v0301)', cw: 4095, cp: 0.001, cc: 0.002, old: true },
  'openai/gpt-4-0314': { name: 'OpenAI: GPT-4 (older v0314)', cw: 8191, cp: 0.03, cc: 0.06, old: true },
  'openai/gpt-4-32k-0314': { name: 'OpenAI: GPT-4 32k (older v0314)', cw: 32767, cp: 0.06, cc: 0.12, old: true },
@@ -301,7 +383,12 @@ const orModelMap: { [id: string]: { name: string; cw: number; cp?: number; cc?:
  'anthropic/claude-instant-1.0': { name: 'Anthropic: Claude Instant (older v1)', cw: 9000, cp: 0.00163, cc: 0.00551, old: true },
};

const orModelFamilyOrder = ['mistralai/', 'huggingfaceh4/', 'undi95/', 'openchat/', 'anthropic/', 'google/', 'openai/', 'meta-llama/', 'phind/', 'openrouter/'];
const orModelFamilyOrder = [
  // great models (picked by hand, they're free)
  'mistralai/mistral-7b-instruct', 'nousresearch/nous-capybara-7b',
  // great orgs
  'huggingfaceh4/', 'openchat/', 'anthropic/', 'google/', 'mistralai/', 'openai/', 'meta-llama/', 'phind/',
];

export function openRouterModelFamilySortFn(a: { id: string }, b: { id: string }): number {
  const aPrefixIndex = orModelFamilyOrder.findIndex(prefix => a.id.startsWith(prefix));
@@ -321,10 +408,10 @@ export function openRouterModelToModelDescription(modelId: string, created: numb
  const orModel = orModelMap[modelId] ?? null;
  let label = orModel?.name || modelId.replace('/', ' · ');
  if (orModel?.cp === 0 && orModel?.cc === 0)
    label += ' - 🎁 Free';
    label += ' · 🎁'; // Free? Discounted?

  // if (!orModel)
  //   console.log('openRouterModelToModelDescription: unknown model id:', modelId);
  if (SERVER_DEBUG_WIRE && !orModel)
    console.log(' - openRouterModelToModelDescription: non-mapped model id:', modelId);

  // context: use the known size if available, otherwise fall back to the (undocumented) provided length, or fall back again to 4096
  const contextWindow = orModel?.cw || context_length || 4096;
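The family ordering above sorts models whose id starts with an earlier prefix in the priority list first. A standalone sketch of that comparator (simplified: the tie-breaker shown here, alphabetical by id, is an assumption, not necessarily what the real `openRouterModelFamilySortFn` does):

```typescript
// Family-prefix comparator sketch: ids matching an earlier prefix sort first,
// ids matching no prefix (-1) sort to the end.
const familyOrder = ['mistralai/mistral-7b-instruct', 'huggingfaceh4/', 'anthropic/', 'openai/'];

function familySortFn(a: { id: string }, b: { id: string }): number {
  const aIdx = familyOrder.findIndex(prefix => a.id.startsWith(prefix));
  const bIdx = familyOrder.findIndex(prefix => b.id.startsWith(prefix));
  // unknown families rank after all known ones
  const aRank = aIdx === -1 ? familyOrder.length : aIdx;
  const bRank = bIdx === -1 ? familyOrder.length : bIdx;
  if (aRank !== bRank) return aRank - bRank;
  return a.id.localeCompare(b.id); // assumed tie-breaker
}

const sorted = [{ id: 'openai/gpt-4' }, { id: 'anthropic/claude-2' }, { id: 'zzz/model' }]
  .sort(familySortFn)
  .map(m => m.id);
```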
+25
-4
@@ -8,13 +8,13 @@ import { fetchJsonOrTRPCError } from '~/server/api/trpc.serverutils';
import { Brand } from '~/common/app.config';

import type { OpenAIWire } from './openai.wiretypes';
import { listModelsOutputSchema, ModelDescriptionSchema } from '../server.schemas';
import { localAIModelToModelDescription, oobaboogaModelToModelDescription, openAIModelToModelDescription, openRouterModelFamilySortFn, openRouterModelToModelDescription } from './models.data';
import { listModelsOutputSchema, ModelDescriptionSchema } from '../llm.server.types';
import { localAIModelToModelDescription, mistralModelsSort, mistralModelToModelDescription, oobaboogaModelToModelDescription, openAIModelToModelDescription, openRouterModelFamilySortFn, openRouterModelToModelDescription } from './models.data';


// Input Schemas

const openAIDialects = z.enum(['azure', 'localai', 'oobabooga', 'openai', 'openrouter']);
const openAIDialects = z.enum(['azure', 'localai', 'mistral', 'oobabooga', 'openai', 'openrouter']);

export const openAIAccessSchema = z.object({
  dialect: openAIDialects,
@@ -186,12 +186,18 @@ export const llmOpenAIRouter = createTRPCRouter({
        .map((model): ModelDescriptionSchema => openAIModelToModelDescription(model.id, model.created));
      break;

    case 'mistral':
      models = openAIModels
        .map(mistralModelToModelDescription)
        .sort(mistralModelsSort);
      break;

    case 'openrouter':
      models = openAIModels
        .sort(openRouterModelFamilySortFn)
        .map(model => openRouterModelToModelDescription(model.id, model.created, (model as any)?.['context_length']));
      break;

  }

  return { models };
@@ -267,9 +273,10 @@ async function openaiPOST<TOut extends object, TPostBody extends object>(access:
}


const DEFAULT_HELICONE_OPENAI_HOST = 'oai.hconeai.com';
const DEFAULT_MISTRAL_HOST = 'https://api.mistral.ai';
const DEFAULT_OPENAI_HOST = 'api.openai.com';
const DEFAULT_OPENROUTER_HOST = 'https://openrouter.ai/api';
const DEFAULT_HELICONE_OPENAI_HOST = 'oai.hconeai.com';

export function fixupHost(host: string, apiPath: string): string {
  if (!host.startsWith('http'))
@@ -361,6 +368,20 @@ export function openAIAccess(access: OpenAIAccessSchema, modelRefId: string | nu
      };

    case 'mistral':
      // https://docs.mistral.ai/platform/client
      const mistralKey = access.oaiKey || env.MISTRAL_API_KEY || '';
      const mistralHost = fixupHost(access.oaiHost || DEFAULT_MISTRAL_HOST, apiPath);
      return {
        headers: {
          'Content-Type': 'application/json',
          'Accept': 'application/json',
          'Authorization': `Bearer ${mistralKey}`,
        },
        url: mistralHost + apiPath,
      };

    case 'openrouter':
      const orKey = access.oaiKey || env.OPENROUTER_API_KEY || '';
      const orHost = fixupHost(access.oaiHost || DEFAULT_OPENROUTER_HOST, apiPath);
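Each dialect case above resolves a key (user setup, then environment, then empty) and a host (user setup, then the dialect default), normalized through `fixupHost`. A hedged sketch of what such a host-normalization helper might do (this mirrors the visible intent, defaulting to https, not the exact body of `fixupHost`):

```typescript
// Host-normalization sketch: default bare hosts to https, and drop a trailing
// slash when the API path will supply its own leading slash.
// (Assumption: illustrative behavior, not the verbatim fixupHost implementation.)
function fixupHostSketch(host: string, apiPath: string): string {
  if (!host.startsWith('http'))
    host = 'https://' + host;
  if (host.endsWith('/') && apiPath.startsWith('/'))
    host = host.slice(0, -1);
  return host;
}

// e.g. the Mistral default host has no scheme issues, but a bare host gets one
const url = fixupHostSketch('api.mistral.ai', '/v1/models') + '/v1/models';
```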
@@ -2,7 +2,8 @@ import { create } from 'zustand';
import { shallow } from 'zustand/shallow';
import { persist } from 'zustand/middleware';

import { ModelVendorId } from './vendors/IModelVendor';
import type { ModelVendorId } from './vendors/vendors.registry';
import type { SourceSetupOpenRouter } from './vendors/openrouter/openrouter.vendor';


/**
@@ -15,6 +16,7 @@ export interface DLLM<TSourceSetup = unknown, TLLMOptions = unknown> {
  updated?: number | 0;
  description: string;
  tags: string[]; // UNUSED for now
  // modelcaps: DModelCapability[];
  contextTokens: number;
  maxOutputTokens: number;
  hidden: boolean;
@@ -29,6 +31,17 @@ export interface DLLM<TSourceSetup = unknown, TLLMOptions = unknown> {

export type DLLMId = string;

// export type DModelCapability =
//   | 'input-text'
//   | 'input-image-data'
//   | 'input-multipart'
//   | 'output-text'
//   | 'output-function'
//   | 'output-image-data'
//   | 'if-chat'
//   | 'if-fast-chat'
//   ;

// Model interfaces (chat, and function calls) - here as a preview, will be used more broadly in the future
export const LLM_IF_OAI_Chat = 'oai-chat';
export const LLM_IF_OAI_Vision = 'oai-vision';
@@ -76,6 +89,9 @@ interface ModelsActions {
  setChatLLMId: (id: DLLMId | null) => void;
  setFastLLMId: (id: DLLMId | null) => void;
  setFuncLLMId: (id: DLLMId | null) => void;

  // special
  setOpenRoutersKey: (key: string) => void;
}

type LlmsStore = ModelsData & ModelsActions;
@@ -162,13 +178,22 @@ export const useModelsStore = create<LlmsStore>()(
      set(state => ({
        sources: state.sources.map((source: DModelSource): DModelSource =>
          source.id === id
            ? {
              ...source,
              setup: { ...source.setup, ...partialSetup },
            } : source,
            ? { ...source, setup: { ...source.setup, ...partialSetup } }
            : source,
        ),
      })),

    setOpenRoutersKey: (key: string) =>
      set(state => {
        const openRouterSource = state.sources.find(source => source.vId === 'openrouter');
        if (!openRouterSource) return state;
        return {
          sources: state.sources.map(source => source.id === openRouterSource.id
            ? { ...source, setup: { ...source.setup, oaiKey: key satisfies SourceSetupOpenRouter['oaiKey'] } }
            : source),
        };
      }),

    }),
    {
      name: 'app-models',
@@ -256,24 +281,3 @@ export function useChatLLM() {
  }, shallow);
}

/**
 * Source-specific read/write - great time saver
 */
export function useSourceSetup<TSourceSetup, TAccess>(sourceId: DModelSourceId, getAccess: (partialSetup?: Partial<TSourceSetup>) => TAccess) {
  // invalidate when the setup changes
  const { updateSourceSetup, ...rest } = useModelsStore(state => {
    const source: DModelSource<TSourceSetup> | null = state.sources.find(source => source.id === sourceId) ?? null;
    const sourceLLMs = source ? state.llms.filter(llm => llm._source === source) : [];
    return {
      source,
      sourceLLMs,
      sourceHasLLMs: !!sourceLLMs.length,
      access: getAccess(source?.setup),
      updateSourceSetup: state.updateSourceSetup,
    };
  }, shallow);

  // convenience function for this source
  const updateSetup = (partialSetup: Partial<TSourceSetup>) => updateSourceSetup<TSourceSetup>(sourceId, partialSetup);
  return { ...rest, updateSetup };
}
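Both `updateSourceSetup` and `setOpenRoutersKey` in the store follow the same immutable-update pattern: map over the sources array, replace only the matching source, and shallow-merge the partial setup into its existing setup. Extracted as a plain function (names here are illustrative, not the store's API):

```typescript
// Immutable per-source setup merge, as used by the zustand store actions:
// untouched sources keep their identity; the matching one gets a merged setup.
interface Source { id: string; setup: Record<string, unknown> }

function mergeSourceSetup(sources: Source[], id: string, partialSetup: Record<string, unknown>): Source[] {
  return sources.map(source =>
    source.id === id
      ? { ...source, setup: { ...source.setup, ...partialSetup } }
      : source,
  );
}

const next = mergeSourceSetup(
  [{ id: 'a', setup: { oaiKey: '' } }, { id: 'b', setup: {} }],
  'a',
  { oaiKey: 'sk-test' },
);
const updatedKey = next[0].setup.oaiKey as string;
const untouchedCount = Object.keys(next[1].setup).length;
```

Because only the changed objects are re-created, zustand's shallow comparison lets unrelated subscribers skip re-rendering.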
@@ -1,34 +0,0 @@
import type { DLLMId } from '../store-llms';
import type { OpenAIWire } from './server/openai/openai.wiretypes';
import { findVendorForLlmOrThrow } from '../vendors/vendor.registry';


export interface VChatMessageIn {
  role: 'assistant' | 'system' | 'user'; // | 'function';
  content: string;
  //name?: string; // when role: 'function'
}

export type VChatFunctionIn = OpenAIWire.ChatCompletion.RequestFunctionDef;

export interface VChatMessageOut {
  role: 'assistant' | 'system' | 'user';
  content: string;
  finish_reason: 'stop' | 'length' | null;
}

export interface VChatMessageOrFunctionCallOut extends VChatMessageOut {
  function_name: string;
  function_arguments: object | null;
}


export async function callChatGenerate(llmId: DLLMId, messages: VChatMessageIn[], maxTokens?: number): Promise<VChatMessageOut> {
  const { llm, vendor } = findVendorForLlmOrThrow(llmId);
  return await vendor.callChatGenerate(llm, messages, maxTokens);
}

export async function callChatGenerateWithFunctions(llmId: DLLMId, messages: VChatMessageIn[], functions: VChatFunctionIn[], forceFunctionName: string | null, maxTokens?: number): Promise<VChatMessageOrFunctionCallOut> {
  const { llm, vendor } = findVendorForLlmOrThrow(llmId);
  return await vendor.callChatGenerateWF(llm, messages, functions, forceFunctionName, maxTokens);
}
@@ -1,16 +0,0 @@
import { z } from 'zod';

export const wireOllamaGenerationSchema = z.object({
  model: z.string(),
  // created_at: z.string(), // commented because unused
  response: z.string(),
  done: z.boolean(),

  // only on the last message
  // context: z.array(z.number()),
  // total_duration: z.number(),
  // load_duration: z.number(),
  // eval_duration: z.number(),
  // prompt_eval_count: z.number(),
  // eval_count: z.number(),
});
+37
-12
@@ -1,18 +1,19 @@
import type React from 'react';
import type { TRPCClientErrorBase } from '@trpc/client';

import type { DLLM, DModelSourceId } from '../store-llms';
import { VChatFunctionIn, VChatMessageIn, VChatMessageOrFunctionCallOut, VChatMessageOut } from '../transports/chatGenerate';
import type { DLLM, DLLMId, DModelSourceId } from '../store-llms';
import type { ModelDescriptionSchema } from '../server/llm.server.types';
import type { ModelVendorId } from './vendors.registry';
import type { VChatFunctionIn, VChatMessageIn, VChatMessageOrFunctionCallOut, VChatMessageOut } from '~/modules/llms/llm.client';


export type ModelVendorId = 'anthropic' | 'azure' | 'localai' | 'ollama' | 'oobabooga' | 'openai' | 'openrouter';


export interface IModelVendor<TSourceSetup = unknown, TLLMOptions = unknown, TAccess = unknown, TDLLM = DLLM<TSourceSetup, TLLMOptions>> {
export interface IModelVendor<TSourceSetup = unknown, TAccess = unknown, TLLMOptions = unknown, TDLLM = DLLM<TSourceSetup, TLLMOptions>> {
  readonly id: ModelVendorId;
  readonly name: string;
  readonly rank: number;
  readonly location: 'local' | 'cloud';
  readonly instanceLimit: number;
  readonly hasFreeModels?: boolean;
  readonly hasBackendCap?: () => boolean;

  // components
@@ -20,12 +21,36 @@ export interface IModelVendor<TSourceSetup = unknown, TLLMOptions = unknown, TAc
  readonly SourceSetupComponent: React.ComponentType<{ sourceId: DModelSourceId }>;
  readonly LLMOptionsComponent: React.ComponentType<{ llm: TDLLM }>;

  // functions
  readonly initializeSetup?: () => TSourceSetup;
  /// abstraction interface ///

  getAccess(setup?: Partial<TSourceSetup>): TAccess;
  initializeSetup?(): TSourceSetup;

  callChatGenerate(llm: TDLLM, messages: VChatMessageIn[], maxTokens?: number): Promise<VChatMessageOut>;
  validateSetup?(setup: TSourceSetup): boolean;

  callChatGenerateWF(llm: TDLLM, messages: VChatMessageIn[], functions: null | VChatFunctionIn[], forceFunctionName: null | string, maxTokens?: number): Promise<VChatMessageOrFunctionCallOut>;
}
  getTransportAccess(setup?: Partial<TSourceSetup>): TAccess;

  rpcUpdateModelsQuery: (
    access: TAccess,
    enabled: boolean,
    onSuccess: (data: { models: ModelDescriptionSchema[] }) => void,
  ) => { isFetching: boolean, refetch: () => void, isError: boolean, error: TRPCClientErrorBase<any> | null };

  rpcChatGenerateOrThrow: (
    access: TAccess,
    llmOptions: TLLMOptions,
    messages: VChatMessageIn[],
    functions: VChatFunctionIn[] | null, forceFunctionName: string | null,
    maxTokens?: number,
  ) => Promise<VChatMessageOut | VChatMessageOrFunctionCallOut>;

  streamingChatGenerateOrThrow: (
    access: TAccess,
    llmId: DLLMId,
    llmOptions: TLLMOptions,
    messages: VChatMessageIn[],
    functions: VChatFunctionIn[] | null, forceFunctionName: string | null,
    abortSignal: AbortSignal,
    onUpdate: (update: Partial<{ text: string, typing: boolean, originLLM: string }>, done: boolean) => void,
  ) => Promise<void>;

}

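The refactored `IModelVendor` moves the transport into the vendor itself: each vendor builds its own access object from the per-source setup, and callers dispatch through a registry keyed by vendor id. A minimal self-contained sketch of that shape (the names `MiniVendor` and `registry` are illustrative, not the actual registry API):

```typescript
// Minimal vendor-abstraction sketch: a vendor maps its partial setup into a
// transport access object; a registry keyed by vendor id dispatches calls.
interface MiniVendor<TSetup, TAccess> {
  id: string;
  getTransportAccess(setup?: Partial<TSetup>): TAccess;
}

const miniAnthropicVendor: MiniVendor<{ anthropicKey: string }, { dialect: string; key: string }> = {
  id: 'anthropic',
  // missing setup fields fall back to safe defaults, as in the real vendors
  getTransportAccess: (setup) => ({ dialect: 'anthropic', key: setup?.anthropicKey || '' }),
};

const registry: Record<string, MiniVendor<any, any>> = {
  [miniAnthropicVendor.id]: miniAnthropicVendor,
};

const access = registry['anthropic'].getTransportAccess({ anthropicKey: 'k1' });
const emptyAccess = registry['anthropic'].getTransportAccess();
```

Keeping `getTransportAccess` pure means the same vendor object can serve both the React hooks (`rpcUpdateModelsQuery`) and the non-React call paths.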
+6
-12
@@ -7,11 +7,11 @@ import { FormTextField } from '~/common/components/forms/FormTextField';
import { InlineError } from '~/common/components/InlineError';
import { Link } from '~/common/components/Link';
import { SetupFormRefetchButton } from '~/common/components/forms/SetupFormRefetchButton';
import { apiQuery } from '~/common/util/trpc.client';
import { useToggleableBoolean } from '~/common/util/useToggleableBoolean';

import { DModelSourceId, useModelsStore, useSourceSetup } from '../../store-llms';
import { modelDescriptionToDLLM } from '../openai/OpenAISourceSetup';
import { DModelSourceId } from '../../store-llms';
import { useLlmUpdateModels } from '../useLlmUpdateModels';
import { useSourceSetup } from '../useSourceSetup';

import { isValidAnthropicApiKey, ModelVendorAnthropic } from './anthropic.vendor';

@@ -23,7 +23,7 @@ export function AnthropicSourceSetup(props: { sourceId: DModelSourceId }) {

  // external state
  const { source, sourceHasLLMs, access, updateSetup } =
    useSourceSetup(props.sourceId, ModelVendorAnthropic.getAccess);
    useSourceSetup(props.sourceId, ModelVendorAnthropic);

  // derived state
  const { anthropicKey, anthropicHost, heliconeKey } = access;
@@ -34,14 +34,8 @@ export function AnthropicSourceSetup(props: { sourceId: DModelSourceId }) {
  const shallFetchSucceed = anthropicKey ? keyValid : (!needsUserKey || !!anthropicHost);

  // fetch models
  const { isFetching, refetch, isError, error } = apiQuery.llmAnthropic.listModels.useQuery({ access }, {
    enabled: !sourceHasLLMs && shallFetchSucceed,
    onSuccess: models => source && useModelsStore.getState().setLLMs(
      models.models.map(model => modelDescriptionToDLLM(model, source)),
      props.sourceId,
    ),
    staleTime: Infinity,
  });
  const { isFetching, refetch, isError, error } =
    useLlmUpdateModels(ModelVendorAnthropic, access, !sourceHasLLMs && shallFetchSucceed, source);

  return <>

+43
-37
@@ -1,11 +1,12 @@
import { backendCaps } from '~/modules/backend/state-backend';

import { AnthropicIcon } from '~/common/components/icons/AnthropicIcon';
import { apiAsync } from '~/common/util/trpc.client';
import { apiAsync, apiQuery } from '~/common/util/trpc.client';

import type { AnthropicAccessSchema } from '../../server/anthropic/anthropic.router';
import type { IModelVendor } from '../IModelVendor';
import type { AnthropicAccessSchema } from '../../transports/server/anthropic/anthropic.router';
import type { VChatMessageIn, VChatMessageOrFunctionCallOut, VChatMessageOut } from '../../transports/chatGenerate';
import type { VChatMessageOut } from '../../llm.client';
import { unifiedStreamingClient } from '../unifiedStreamingClient';

import { LLMOptionsOpenAI } from '../openai/openai.vendor';
import { OpenAILLMOptions } from '../openai/OpenAILLMOptions';
@@ -14,7 +15,7 @@ import { AnthropicSourceSetup } from './AnthropicSourceSetup';


// special symbols
export const isValidAnthropicApiKey = (apiKey?: string) => !!apiKey && (apiKey.startsWith('sk-') ? apiKey.length > 40 : apiKey.length >= 40);
export const isValidAnthropicApiKey = (apiKey?: string) => !!apiKey && (apiKey.startsWith('sk-') ? apiKey.length >= 39 : apiKey.length >= 40);

export interface SourceSetupAnthropic {
  anthropicKey: string;
@@ -22,7 +23,7 @@ export interface SourceSetupAnthropic {
  heliconeKey: string;
}

export const ModelVendorAnthropic: IModelVendor<SourceSetupAnthropic, LLMOptionsOpenAI, AnthropicAccessSchema> = {
export const ModelVendorAnthropic: IModelVendor<SourceSetupAnthropic, AnthropicAccessSchema, LLMOptionsOpenAI> = {
  id: 'anthropic',
  name: 'Anthropic',
  rank: 13,
@@ -36,43 +37,48 @@ export const ModelVendorAnthropic: IModelVendor<SourceSetupAnthropic, LLMOptions
  LLMOptionsComponent: OpenAILLMOptions,

  // functions
  getAccess: (partialSetup): AnthropicAccessSchema => ({
  getTransportAccess: (partialSetup): AnthropicAccessSchema => ({
    dialect: 'anthropic',
    anthropicKey: partialSetup?.anthropicKey || '',
    anthropicHost: partialSetup?.anthropicHost || null,
    heliconeKey: partialSetup?.heliconeKey || null,
  }),
  callChatGenerate(llm, messages: VChatMessageIn[], maxTokens?: number): Promise<VChatMessageOut> {
    return anthropicCallChatGenerate(this.getAccess(llm._source.setup), llm.options, messages, /*null, null,*/ maxTokens);


  // List Models
  rpcUpdateModelsQuery: (access, enabled, onSuccess) => {
    return apiQuery.llmAnthropic.listModels.useQuery({ access }, {
      enabled: enabled,
      onSuccess: onSuccess,
      refetchOnWindowFocus: false,
      staleTime: Infinity,
    });
  },
  callChatGenerateWF(): Promise<VChatMessageOrFunctionCallOut> {
    throw new Error('Anthropic does not support "Functions" yet');

  // Chat Generate (non-streaming) with Functions
  rpcChatGenerateOrThrow: async (access, llmOptions, messages, functions, forceFunctionName, maxTokens) => {
    if (functions?.length || forceFunctionName)
      throw new Error('Anthropic does not support functions');

    const { llmRef, llmTemperature = 0.5, llmResponseTokens } = llmOptions;
    try {
      return await apiAsync.llmAnthropic.chatGenerate.mutate({
        access,
        model: {
          id: llmRef!,
          temperature: llmTemperature,
          maxTokens: maxTokens || llmResponseTokens || 1024,
        },
        history: messages,
      }) as VChatMessageOut;
    } catch (error: any) {
      const errorMessage = error?.message || error?.toString() || 'Anthropic Chat Generate Error';
      console.error(`anthropic.rpcChatGenerateOrThrow: ${errorMessage}`);
      throw new Error(errorMessage);
    }
  },

  // Chat Generate (streaming) with Functions
  streamingChatGenerateOrThrow: unifiedStreamingClient,

};


/**
 * This function either returns the LLM message, or function calls, or throws a descriptive error string
 */
async function anthropicCallChatGenerate<TOut = VChatMessageOut>(
  access: AnthropicAccessSchema, llmOptions: Partial<LLMOptionsOpenAI>, messages: VChatMessageIn[],
  // functions: VChatFunctionIn[] | null, forceFunctionName: string | null,
  maxTokens?: number,
): Promise<TOut> {
  const { llmRef, llmTemperature = 0.5, llmResponseTokens } = llmOptions;
  try {
    return await apiAsync.llmAnthropic.chatGenerate.mutate({
      access,
      model: {
        id: llmRef!,
        temperature: llmTemperature,
        maxTokens: maxTokens || llmResponseTokens || 1024,
      },
      history: messages,
    }) as TOut;
  } catch (error: any) {
    const errorMessage = error?.message || error?.toString() || 'Anthropic Chat Generate Error';
    console.error(`anthropicCallChatGenerate: ${errorMessage}`);
    throw new Error(errorMessage);
  }
}
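The diff relaxes the key check: `sk-`-prefixed Anthropic keys now pass at length 39 or more (previously strictly more than 40), while un-prefixed keys still require length 40. The new validator can be exercised directly, exactly as it appears in the diff:

```typescript
// The relaxed validator from the diff: 'sk-' keys need length >= 39,
// any other non-empty key needs length >= 40.
const isValidAnthropicApiKey = (apiKey?: string) =>
  !!apiKey && (apiKey.startsWith('sk-') ? apiKey.length >= 39 : apiKey.length >= 40);

const okShortSk = isValidAnthropicApiKey('sk-' + 'a'.repeat(36));  // total length 39 -> valid
const badShortSk = isValidAnthropicApiKey('sk-' + 'a'.repeat(35)); // total length 38 -> invalid
const emptyKey = isValidAnthropicApiKey('');                       // falsy -> invalid
```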
+6
-12
@@ -5,11 +5,11 @@ import { FormTextField } from '~/common/components/forms/FormTextField';
import { InlineError } from '~/common/components/InlineError';
import { Link } from '~/common/components/Link';
import { SetupFormRefetchButton } from '~/common/components/forms/SetupFormRefetchButton';
import { apiQuery } from '~/common/util/trpc.client';
import { asValidURL } from '~/common/util/urlUtils';

import { DModelSourceId, useModelsStore, useSourceSetup } from '../../store-llms';
import { modelDescriptionToDLLM } from '../openai/OpenAISourceSetup';
import { DModelSourceId } from '../../store-llms';
import { useLlmUpdateModels } from '../useLlmUpdateModels';
import { useSourceSetup } from '../useSourceSetup';

import { isValidAzureApiKey, ModelVendorAzure } from './azure.vendor';

@@ -18,7 +18,7 @@ export function AzureSourceSetup(props: { sourceId: DModelSourceId }) {

  // external state
  const { source, sourceHasLLMs, access, updateSetup } =
    useSourceSetup(props.sourceId, ModelVendorAzure.getAccess);
    useSourceSetup(props.sourceId, ModelVendorAzure);

  // derived state
  const { oaiKey: azureKey, oaiHost: azureEndpoint } = access;
@@ -31,14 +31,8 @@ export function AzureSourceSetup(props: { sourceId: DModelSourceId }) {
  const shallFetchSucceed = azureKey ? keyValid : !needsUserKey;

  // fetch models
  const { isFetching, refetch, isError, error } = apiQuery.llmOpenAI.listModels.useQuery({ access }, {
    enabled: !sourceHasLLMs && shallFetchSucceed,
    onSuccess: models => source && useModelsStore.getState().setLLMs(
      models.models.map(model => modelDescriptionToDLLM(model, source)),
      props.sourceId,
    ),
    staleTime: Infinity,
  });
  const { isFetching, refetch, isError, error } =
    useLlmUpdateModels(ModelVendorAzure, access, !sourceHasLLMs && shallFetchSucceed, source);

  return <>

+9
-11
@@ -3,10 +3,9 @@ import { backendCaps } from '~/modules/backend/state-backend';
import { AzureIcon } from '~/common/components/icons/AzureIcon';

import type { IModelVendor } from '../IModelVendor';
import type { OpenAIAccessSchema } from '../../transports/server/openai/openai.router';
import type { VChatFunctionIn, VChatMessageIn, VChatMessageOrFunctionCallOut, VChatMessageOut } from '../../transports/chatGenerate';
import type { OpenAIAccessSchema } from '../../server/openai/openai.router';

import { LLMOptionsOpenAI, openAICallChatGenerate } from '../openai/openai.vendor';
import { LLMOptionsOpenAI, ModelVendorOpenAI } from '../openai/openai.vendor';
import { OpenAILLMOptions } from '../openai/OpenAILLMOptions';

import { AzureSourceSetup } from './AzureSourceSetup';
@@ -36,7 +35,7 @@ export interface SourceSetupAzure {
 *
 * Work in progress...
 */
export const ModelVendorAzure: IModelVendor<SourceSetupAzure, LLMOptionsOpenAI, OpenAIAccessSchema> = {
export const ModelVendorAzure: IModelVendor<SourceSetupAzure, OpenAIAccessSchema, LLMOptionsOpenAI> = {
  id: 'azure',
  name: 'Azure',
  rank: 14,
@@ -50,7 +49,7 @@ export const ModelVendorAzure: IModelVendor<SourceSetupAzure, LLMOptionsOpenAI,
  LLMOptionsComponent: OpenAILLMOptions,

  // functions
  getAccess: (partialSetup): OpenAIAccessSchema => ({
  getTransportAccess: (partialSetup): OpenAIAccessSchema => ({
    dialect: 'azure',
    oaiKey: partialSetup?.azureKey || '',
    oaiOrg: '',
@@ -58,10 +57,9 @@ export const ModelVendorAzure: IModelVendor<SourceSetupAzure, LLMOptionsOpenAI,
    heliKey: '',
    moderationCheck: false,
  }),
  callChatGenerate(llm, messages: VChatMessageIn[], maxTokens?: number): Promise<VChatMessageOut> {
    return openAICallChatGenerate(this.getAccess(llm._source.setup), llm.options, messages, null, null, maxTokens);
  },
  callChatGenerateWF(llm, messages: VChatMessageIn[], functions: VChatFunctionIn[] | null, forceFunctionName: string | null, maxTokens?: number): Promise<VChatMessageOrFunctionCallOut> {
    return openAICallChatGenerate(this.getAccess(llm._source.setup), llm.options, messages, functions, forceFunctionName, maxTokens);
  },

  // OpenAI transport ('azure' dialect in 'access')
  rpcUpdateModelsQuery: ModelVendorOpenAI.rpcUpdateModelsQuery,
  rpcChatGenerateOrThrow: ModelVendorOpenAI.rpcChatGenerateOrThrow,
  streamingChatGenerateOrThrow: ModelVendorOpenAI.streamingChatGenerateOrThrow,
};
@@ -0,0 +1,96 @@
import * as React from 'react';

import { FormControl, FormHelperText, Option, Select } from '@mui/joy';
import HealthAndSafetyIcon from '@mui/icons-material/HealthAndSafety';

import { FormInputKey } from '~/common/components/forms/FormInputKey';
import { FormLabelStart } from '~/common/components/forms/FormLabelStart';
import { InlineError } from '~/common/components/InlineError';
import { Link } from '~/common/components/Link';
import { SetupFormRefetchButton } from '~/common/components/forms/SetupFormRefetchButton';

import type { DModelSourceId } from '../../store-llms';
import type { GeminiBlockSafetyLevel } from '../../server/gemini/gemini.wiretypes';
import { useLlmUpdateModels } from '../useLlmUpdateModels';
import { useSourceSetup } from '../useSourceSetup';

import { ModelVendorGemini } from './gemini.vendor';


const GEMINI_API_KEY_LINK = 'https://makersuite.google.com/app/apikey';

const SAFETY_OPTIONS: { value: GeminiBlockSafetyLevel, label: string }[] = [
  { value: 'HARM_BLOCK_THRESHOLD_UNSPECIFIED', label: 'Default' },
  { value: 'BLOCK_LOW_AND_ABOVE', label: 'Low and above' },
  { value: 'BLOCK_MEDIUM_AND_ABOVE', label: 'Medium and above' },
  { value: 'BLOCK_ONLY_HIGH', label: 'Only high' },
  { value: 'BLOCK_NONE', label: 'None' },
];


export function GeminiSourceSetup(props: { sourceId: DModelSourceId }) {

  // external state
  const { source, sourceSetupValid, access, updateSetup } =
    useSourceSetup(props.sourceId, ModelVendorGemini);

  // derived state
  const { geminiKey, minSafetyLevel } = access;

  const needsUserKey = !ModelVendorGemini.hasBackendCap?.();
  const shallFetchSucceed = !needsUserKey || (!!geminiKey && sourceSetupValid);
  const showKeyError = !!geminiKey && !sourceSetupValid;

  // fetch models
  const { isFetching, refetch, isError, error } =
    useLlmUpdateModels(ModelVendorGemini, access, shallFetchSucceed, source);

  return <>

    <FormInputKey
      id='gemini-key' label='Gemini API Key'
      rightLabel={<>{needsUserKey
        ? !geminiKey && <Link level='body-sm' href={GEMINI_API_KEY_LINK} target='_blank'>request Key</Link>
        : '✔️ already set in server'}
      </>}
      value={geminiKey} onChange={value => updateSetup({ geminiKey: value.trim() })}
      required={needsUserKey} isError={showKeyError}
      placeholder='...'
    />

    <FormControl orientation='horizontal' sx={{ justifyContent: 'space-between', alignItems: 'center' }}>
      <FormLabelStart title='Safety Settings'
                      description='Threshold' />
      <Select
        variant='outlined'
        value={minSafetyLevel} onChange={(_event, value) => value && updateSetup({ minSafetyLevel: value })}
        startDecorator={<HealthAndSafetyIcon sx={{ display: { xs: 'none', sm: 'inherit' } }} />}
        // indicator={<KeyboardArrowDownIcon />}
        slotProps={{
          root: { sx: { width: '100%' } },
          indicator: { sx: { opacity: 0.5 } },
          button: { sx: { whiteSpace: 'inherit' } },
        }}
      >
        {SAFETY_OPTIONS.map(option => (
          <Option key={'gemini-safety-' + option.value} value={option.value}>{option.label}</Option>
        ))}
      </Select>
    </FormControl>

    <FormHelperText sx={{ display: 'block' }}>
      Gemini has <Link href='https://ai.google.dev/docs/safety_setting_gemini' target='_blank' noLinkStyle>
      adjustable safety settings</Link> on four categories: Harassment, Hate speech,
      Sexually explicit, and Dangerous content, in addition to non-adjustable built-in filters.
      By default, the model will block content with <em>medium and above</em> probability
      of being unsafe.
    </FormHelperText>

    <SetupFormRefetchButton
      refetch={refetch} disabled={!shallFetchSucceed || isFetching} error={isError}
    />

    {isError && <InlineError error={error} />}

  </>;
}
@@ -0,0 +1,97 @@
+import GoogleIcon from '@mui/icons-material/Google';
+
+import { backendCaps } from '~/modules/backend/state-backend';
+
+import { apiAsync, apiQuery } from '~/common/util/trpc.client';
+
+import type { GeminiAccessSchema } from '../../server/gemini/gemini.router';
+import type { GeminiBlockSafetyLevel } from '../../server/gemini/gemini.wiretypes';
+import type { IModelVendor } from '../IModelVendor';
+import type { VChatMessageOut } from '../../llm.client';
+import { unifiedStreamingClient } from '../unifiedStreamingClient';
+
+import { OpenAILLMOptions } from '../openai/OpenAILLMOptions';
+
+import { GeminiSourceSetup } from './GeminiSourceSetup';
+
+
+export interface SourceSetupGemini {
+  geminiKey: string;
+  minSafetyLevel: GeminiBlockSafetyLevel;
+}
+
+export interface LLMOptionsGemini {
+  llmRef: string;
+  stopSequences: string[];  // up to 5 sequences that will stop generation (optional)
+  candidateCount: number;   // 1...8 number of generated responses to return (optional)
+  maxOutputTokens: number;  // if unset, this will default to outputTokenLimit (optional)
+  temperature: number;      // 0...1 Controls the randomness of the output. (optional)
+  topP: number;             // 0...1 The maximum cumulative probability of tokens to consider when sampling (optional)
+  topK: number;             // 1...100 The maximum number of tokens to consider when sampling (optional)
+}
+
+
+export const ModelVendorGemini: IModelVendor<SourceSetupGemini, GeminiAccessSchema, LLMOptionsGemini> = {
+  id: 'googleai',
+  name: 'Gemini',
+  rank: 11,
+  location: 'cloud',
+  instanceLimit: 1,
+  hasBackendCap: () => backendCaps().hasLlmGemini,
+
+  // components
+  Icon: GoogleIcon,
+  SourceSetupComponent: GeminiSourceSetup,
+  LLMOptionsComponent: OpenAILLMOptions,
+
+  // functions
+  initializeSetup: () => ({
+    geminiKey: '',
+    minSafetyLevel: 'HARM_BLOCK_THRESHOLD_UNSPECIFIED',
+  }),
+  validateSetup: (setup) => {
+    return setup.geminiKey?.length > 0;
+  },
+  getTransportAccess: (partialSetup): GeminiAccessSchema => ({
+    dialect: 'gemini',
+    geminiKey: partialSetup?.geminiKey || '',
+    minSafetyLevel: partialSetup?.minSafetyLevel || 'HARM_BLOCK_THRESHOLD_UNSPECIFIED',
+  }),
+
+  // List Models
+  rpcUpdateModelsQuery: (access, enabled, onSuccess) => {
+    return apiQuery.llmGemini.listModels.useQuery({ access }, {
+      enabled: enabled,
+      onSuccess: onSuccess,
+      refetchOnWindowFocus: false,
+      staleTime: Infinity,
+    });
+  },
+
+  // Chat Generate (non-streaming) with Functions
+  rpcChatGenerateOrThrow: async (access, llmOptions, messages, functions, forceFunctionName, maxTokens) => {
+    if (functions?.length || forceFunctionName)
+      throw new Error('Gemini does not support functions');
+
+    const { llmRef, temperature = 0.5, maxOutputTokens } = llmOptions;
+    try {
+      return await apiAsync.llmGemini.chatGenerate.mutate({
+        access,
+        model: {
+          id: llmRef!,
+          temperature: temperature,
+          maxTokens: maxTokens || maxOutputTokens || 1024,
+        },
+        history: messages,
+      }) as VChatMessageOut;
+    } catch (error: any) {
+      const errorMessage = error?.message || error?.toString() || 'Gemini Chat Generate Error';
+      console.error(`gemini.rpcChatGenerateOrThrow: ${errorMessage}`);
+      throw new Error(errorMessage);
+    }
+  },
+
+  // Chat Generate (streaming) with Functions
+  streamingChatGenerateOrThrow: unifiedStreamingClient,
+
+};
+6 -12
@@ -7,10 +7,10 @@ import { FormInputKey } from '~/common/components/forms/FormInputKey';
 import { InlineError } from '~/common/components/InlineError';
 import { Link } from '~/common/components/Link';
 import { SetupFormRefetchButton } from '~/common/components/forms/SetupFormRefetchButton';
-import { apiQuery } from '~/common/util/trpc.client';
 
-import { DModelSourceId, useModelsStore, useSourceSetup } from '../../store-llms';
-import { modelDescriptionToDLLM } from '../openai/OpenAISourceSetup';
+import { DModelSourceId } from '../../store-llms';
+import { useLlmUpdateModels } from '../useLlmUpdateModels';
+import { useSourceSetup } from '../useSourceSetup';
 
 import { ModelVendorLocalAI } from './localai.vendor';
 
@@ -19,7 +19,7 @@ export function LocalAISourceSetup(props: { sourceId: DModelSourceId }) {
 
   // external state
   const { source, access, updateSetup } =
-    useSourceSetup(props.sourceId, ModelVendorLocalAI.getAccess);
+    useSourceSetup(props.sourceId, ModelVendorLocalAI);
 
   // derived state
   const { oaiHost } = access;
@@ -30,14 +30,8 @@ export function LocalAISourceSetup(props: { sourceId: DModelSourceId }) {
   const shallFetchSucceed = isValidHost;
 
   // fetch models - the OpenAI way
-  const { isFetching, refetch, isError, error } = apiQuery.llmOpenAI.listModels.useQuery({ access }, {
-    enabled: false, // !sourceHasLLMs && shallFetchSucceed,
-    onSuccess: models => source && useModelsStore.getState().setLLMs(
-      models.models.map(model => modelDescriptionToDLLM(model, source)),
-      props.sourceId,
-    ),
-    staleTime: Infinity,
-  });
+  const { isFetching, refetch, isError, error } =
+    useLlmUpdateModels(ModelVendorLocalAI, access, false /* !sourceHasLLMs && shallFetchSucceed */, source);
 
   return <>
 
+10 -12
@@ -1,10 +1,9 @@
 import DevicesIcon from '@mui/icons-material/Devices';
 
 import type { IModelVendor } from '../IModelVendor';
-import type { OpenAIAccessSchema } from '../../transports/server/openai/openai.router';
-import type { VChatFunctionIn, VChatMessageIn, VChatMessageOrFunctionCallOut, VChatMessageOut } from '../../transports/chatGenerate';
+import type { OpenAIAccessSchema } from '../../server/openai/openai.router';
 
-import { LLMOptionsOpenAI, openAICallChatGenerate } from '../openai/openai.vendor';
+import { LLMOptionsOpenAI, ModelVendorOpenAI } from '../openai/openai.vendor';
 import { OpenAILLMOptions } from '../openai/OpenAILLMOptions';
 
 import { LocalAISourceSetup } from './LocalAISourceSetup';
 
@@ -14,7 +13,7 @@ export interface SourceSetupLocalAI {
   oaiHost: string; // use OpenAI-compatible non-default hosts (full origin path)
 }
 
-export const ModelVendorLocalAI: IModelVendor<SourceSetupLocalAI, LLMOptionsOpenAI, OpenAIAccessSchema> = {
+export const ModelVendorLocalAI: IModelVendor<SourceSetupLocalAI, OpenAIAccessSchema, LLMOptionsOpenAI> = {
   id: 'localai',
   name: 'LocalAI',
   rank: 20,
@@ -30,7 +29,7 @@ export const ModelVendorLocalAI: IModelVendor<SourceSetupLocalAI, LLMOptionsOpen
   initializeSetup: () => ({
     oaiHost: 'http://localhost:8080',
   }),
-  getAccess: (partialSetup) => ({
+  getTransportAccess: (partialSetup) => ({
     dialect: 'localai',
     oaiKey: '',
    oaiOrg: '',
@@ -38,10 +37,9 @@ export const ModelVendorLocalAI: IModelVendor<SourceSetupLocalAI, LLMOptionsOpen
     heliKey: '',
     moderationCheck: false,
   }),
-  callChatGenerate(llm, messages: VChatMessageIn[], maxTokens?: number): Promise<VChatMessageOut> {
-    return openAICallChatGenerate(this.getAccess(llm._source.setup), llm.options, messages, null, null, maxTokens);
-  },
-  callChatGenerateWF(llm, messages: VChatMessageIn[], functions: VChatFunctionIn[] | null, forceFunctionName: string | null, maxTokens?: number): Promise<VChatMessageOrFunctionCallOut> {
-    return openAICallChatGenerate(this.getAccess(llm._source.setup), llm.options, messages, functions, forceFunctionName, maxTokens);
-  },
-};
+
+  // OpenAI transport ('localai' dialect in 'access')
+  rpcUpdateModelsQuery: ModelVendorOpenAI.rpcUpdateModelsQuery,
+  rpcChatGenerateOrThrow: ModelVendorOpenAI.rpcChatGenerateOrThrow,
+  streamingChatGenerateOrThrow: ModelVendorOpenAI.streamingChatGenerateOrThrow,
+};
@@ -0,0 +1,55 @@
+import * as React from 'react';
+
+import { FormInputKey } from '~/common/components/forms/FormInputKey';
+import { InlineError } from '~/common/components/InlineError';
+import { Link } from '~/common/components/Link';
+import { SetupFormRefetchButton } from '~/common/components/forms/SetupFormRefetchButton';
+
+import { DModelSourceId } from '../../store-llms';
+import { useLlmUpdateModels } from '../useLlmUpdateModels';
+import { useSourceSetup } from '../useSourceSetup';
+
+import { ModelVendorMistral } from './mistral.vendor';
+
+
+const MISTRAL_REG_LINK = 'https://console.mistral.ai/';
+
+
+export function MistralSourceSetup(props: { sourceId: DModelSourceId }) {
+
+  // external state
+  const { source, sourceSetupValid, access, updateSetup } =
+    useSourceSetup(props.sourceId, ModelVendorMistral);
+
+  // derived state
+  const { oaiKey: mistralKey } = access;
+
+  const needsUserKey = !ModelVendorMistral.hasBackendCap?.();
+  const shallFetchSucceed = !needsUserKey || (!!mistralKey && sourceSetupValid);
+  const showKeyError = !!mistralKey && !sourceSetupValid;
+
+  // fetch models
+  const { isFetching, refetch, isError, error } =
+    useLlmUpdateModels(ModelVendorMistral, access, shallFetchSucceed, source);
+
+  return <>
+
+    <FormInputKey
+      id='mistral-key' label='Mistral Key'
+      rightLabel={<>{needsUserKey
+        ? !mistralKey && <Link level='body-sm' href={MISTRAL_REG_LINK} target='_blank'>request Key</Link>
+        : '✔️ already set in server'}
+      </>}
+      value={mistralKey} onChange={value => updateSetup({ oaiKey: value })}
+      required={needsUserKey} isError={showKeyError}
+      placeholder='...'
+    />
+
+    <SetupFormRefetchButton
+      refetch={refetch} disabled={/*!shallFetchSucceed ||*/ isFetching} error={isError}
+    />
+
+    {isError && <InlineError error={error} />}
+
+  </>;
+}
@@ -0,0 +1,55 @@
+import { backendCaps } from '~/modules/backend/state-backend';
+
+import { MistralIcon } from '~/common/components/icons/MistralIcon';
+
+import type { IModelVendor } from '../IModelVendor';
+import type { OpenAIAccessSchema } from '../../server/openai/openai.router';
+
+import { LLMOptionsOpenAI, ModelVendorOpenAI, SourceSetupOpenAI } from '../openai/openai.vendor';
+import { OpenAILLMOptions } from '../openai/OpenAILLMOptions';
+
+import { MistralSourceSetup } from './MistralSourceSetup';
+
+
+// special symbols
+
+export type SourceSetupMistral = Pick<SourceSetupOpenAI, 'oaiKey' | 'oaiHost'>;
+
+
+/** Implementation Notes for the Mistral vendor
+ */
+export const ModelVendorMistral: IModelVendor<SourceSetupMistral, OpenAIAccessSchema, LLMOptionsOpenAI> = {
+  id: 'mistral',
+  name: 'Mistral',
+  rank: 15,
+  location: 'cloud',
+  instanceLimit: 1,
+  hasBackendCap: () => backendCaps().hasLlmMistral,
+
+  // components
+  Icon: MistralIcon,
+  SourceSetupComponent: MistralSourceSetup,
+  LLMOptionsComponent: OpenAILLMOptions,
+
+  // functions
+  initializeSetup: () => ({
+    oaiHost: 'https://api.mistral.ai/',
+    oaiKey: '',
+  }),
+  validateSetup: (setup) => {
+    return setup.oaiKey?.length >= 32;
+  },
+  getTransportAccess: (partialSetup): OpenAIAccessSchema => ({
+    dialect: 'mistral',
+    oaiKey: partialSetup?.oaiKey || '',
+    oaiOrg: '',
+    oaiHost: partialSetup?.oaiHost || '',
+    heliKey: '',
+    moderationCheck: false,
+  }),
+
+  // OpenAI transport ('mistral' dialect in 'access')
+  rpcUpdateModelsQuery: ModelVendorOpenAI.rpcUpdateModelsQuery,
+  rpcChatGenerateOrThrow: ModelVendorOpenAI.rpcChatGenerateOrThrow,
+  streamingChatGenerateOrThrow: ModelVendorOpenAI.streamingChatGenerateOrThrow,
+};
+56 -23
@@ -1,24 +1,29 @@
 import * as React from 'react';
 
-import { Box, Button, Chip, FormControl, Input, Option, Select, Stack, Typography } from '@mui/joy';
+import { Box, Button, Chip, FormControl, IconButton, Input, Option, Select, Stack, Typography } from '@mui/joy';
+import LaunchIcon from '@mui/icons-material/Launch';
+import FormatListNumberedRtlIcon from '@mui/icons-material/FormatListNumberedRtl';
 
 import { FormLabelStart } from '~/common/components/forms/FormLabelStart';
 import { GoodModal } from '~/common/components/GoodModal';
+import { GoodTooltip } from '~/common/components/GoodTooltip';
+import { InlineError } from '~/common/components/InlineError';
 import { Link } from '~/common/components/Link';
 import { apiQuery } from '~/common/util/trpc.client';
 import { settingsGap } from '~/common/app.theme';
 
-import type { OllamaAccessSchema } from '../../transports/server/ollama/ollama.router';
-import { InlineError } from '~/common/components/InlineError';
+import type { OllamaAccessSchema } from '../../server/ollama/ollama.router';
 
 
-export function OllamaAdmin(props: { access: OllamaAccessSchema, onClose: () => void }) {
+export function OllamaAdministration(props: { access: OllamaAccessSchema, onClose: () => void }) {
 
   // state
+  const [sortByPulls, setSortByPulls] = React.useState<boolean>(false);
   const [modelName, setModelName] = React.useState<string | null>('llama2');
   const [modelTag, setModelTag] = React.useState<string>('');
 
   // external state
-  const { data: pullable } = apiQuery.llmOllama.adminListPullable.useQuery({ access: props.access }, {
+  const { data: pullableData } = apiQuery.llmOllama.adminListPullable.useQuery({ access: props.access }, {
     staleTime: 1000 * 60,
     refetchOnWindowFocus: false,
   });
@@ -26,7 +31,11 @@ export function OllamaAdmin(props: { access: OllamaAccessSchema, onClose: () =>
   const { isLoading: isDeleting, status: deleteStatus, error: deleteError, mutate: deleteMutate, reset: deleteReset } = apiQuery.llmOllama.adminDelete.useMutation();
 
   // derived state
-  const pullModelDescription = pullable?.pullable.find(p => p.id === modelName)?.description ?? null;
+  let pullable = pullableData?.pullable || [];
+  if (sortByPulls)
+    pullable = pullable.toSorted((a, b) => b.pulls - a.pulls);
+  const pullModelDescription = pullable.find(p => p.id === modelName)?.description ?? null;
+
 
   const handleModelPull = () => {
     deleteReset();
@@ -38,6 +47,7 @@ export function OllamaAdmin(props: { access: OllamaAccessSchema, onClose: () =>
     modelName && deleteMutate({ access: props.access, name: modelName + (modelTag ? ':' + modelTag : '') });
   };
 
+
   return (
     <GoodModal title='Ollama Administration' dividers open onClose={props.onClose}>
 
@@ -47,25 +57,48 @@ export function OllamaAdmin(props: { access: OllamaAccessSchema, onClose: () =>
         However we provide a way to pull models from the Ollama host, for convenience.
       </Typography>
 
-      <Box sx={{ display: 'flex', gap: 1 }}>
-        <FormControl sx={{ flexGrow: 1 }}>
+      <Box sx={{ display: 'flex', flexFlow: 'row wrap', gap: 1 }}>
+        <FormControl sx={{ flexGrow: 1, flexBasis: 0.55 }}>
           <FormLabelStart title='Name' />
-          <Select value={modelName || ''} onChange={(_event: any, value: string | null) => setModelName(value)}>
-            {pullable?.pullable.map(p =>
-              <Option key={p.id} value={p.id}>
-                {p.isNew === true && <Chip size='sm' variant='outlined'>New</Chip>} {p.label}
-              </Option>,
-            )}
-          </Select>
+          <Box sx={{ display: 'flex', gap: 1 }}>
+            <Select
+              value={modelName || ''}
+              onChange={(_event: any, value: string | null) => setModelName(value)}
+              sx={{ flexGrow: 1 }}
+            >
+              {pullable.map(p =>
+                <Option key={p.id} value={p.id}>
+                  {p.isNew === true && <Chip size='sm' variant='solid'>NEW</Chip>} {p.label}{sortByPulls && ` (${p.pulls.toLocaleString()})`}
+                </Option>,
+              )}
+            </Select>
+            <GoodTooltip title='Sort by Downloads'>
+              <IconButton
+                variant={sortByPulls ? 'solid' : 'outlined'}
+                onClick={() => setSortByPulls(!sortByPulls)}
+              >
+                <FormatListNumberedRtlIcon />
+              </IconButton>
+            </GoodTooltip>
+          </Box>
         </FormControl>
-        <FormControl sx={{ flexGrow: 1 }}>
+        <FormControl sx={{ flexGrow: 1, flexBasis: 0.45 }}>
           <FormLabelStart title='Tag' />
-          <Input
-            variant='outlined' placeholder='latest'
-            value={modelTag || ''} onChange={event => setModelTag(event.target.value)}
-            sx={{ minWidth: 100 }}
-            slotProps={{ input: { size: 10 } }} // halve the min width
-          />
+          <Box sx={{ display: 'flex', gap: 1 }}>
+            <Input
+              variant='outlined' placeholder='latest'
+              value={modelTag || ''} onChange={event => setModelTag(event.target.value)}
+              sx={{ minWidth: 80, flexGrow: 1 }}
+              slotProps={{ input: { size: 10 } }} // halve the min width
+            />
+            {!!modelName && (
+              <IconButton
+                component={Link} href={`https://ollama.ai/library/${modelName}`} target='_blank'
+              >
+                <LaunchIcon />
+              </IconButton>
+            )}
+          </Box>
         </FormControl>
       </Box>
 
@@ -85,7 +118,7 @@ export function OllamaAdmin(props: { access: OllamaAccessSchema, onClose: () =>
         {pullModelDescription}
       </Typography>
 
-      <Box sx={{ display: 'flex', gap: 1 }}>
+      <Box sx={{ display: 'flex', flexWrap: 1, gap: 1, alignItems: 'start' }}>
         <Button
           variant='outlined'
           color={deleteStatus === 'error' ? 'danger' : deleteStatus === 'success' ? 'success' : 'primary'}
+9 -14
@@ -6,13 +6,14 @@ import { FormTextField } from '~/common/components/forms/FormTextField';
 import { InlineError } from '~/common/components/InlineError';
 import { Link } from '~/common/components/Link';
 import { SetupFormRefetchButton } from '~/common/components/forms/SetupFormRefetchButton';
-import { apiQuery } from '~/common/util/trpc.client';
 import { asValidURL } from '~/common/util/urlUtils';
 
-import { DModelSourceId, useModelsStore, useSourceSetup } from '../../store-llms';
+import { DModelSourceId } from '../../store-llms';
+import { useLlmUpdateModels } from '../useLlmUpdateModels';
+import { useSourceSetup } from '../useSourceSetup';
 
 import { ModelVendorOllama } from './ollama.vendor';
-import { OllamaAdmin } from './OllamaAdmin';
-import { modelDescriptionToDLLM } from '../openai/OpenAISourceSetup';
+import { OllamaAdministration } from './OllamaAdministration';
 
 
 export function OllamaSourceSetup(props: { sourceId: DModelSourceId }) {
@@ -22,7 +23,7 @@ export function OllamaSourceSetup(props: { sourceId: DModelSourceId }) {
 
   // external state
   const { source, access, updateSetup } =
-    useSourceSetup(props.sourceId, ModelVendorOllama.getAccess);
+    useSourceSetup(props.sourceId, ModelVendorOllama);
 
   // derived state
   const { ollamaHost } = access;
@@ -32,14 +33,8 @@ export function OllamaSourceSetup(props: { sourceId: DModelSourceId }) {
   const shallFetchSucceed = !hostError;
 
   // fetch models
-  const { isFetching, refetch, isError, error } = apiQuery.llmOllama.listModels.useQuery({ access }, {
-    enabled: false, // !sourceHasLLMs && shallFetchSucceed,
-    onSuccess: models => source && useModelsStore.getState().setLLMs(
-      models.models.map(model => modelDescriptionToDLLM(model, source)),
-      props.sourceId,
-    ),
-    staleTime: Infinity,
-  });
+  const { isFetching, refetch, isError, error } =
+    useLlmUpdateModels(ModelVendorOllama, access, false /* !sourceHasLLMs && shallFetchSucceed */, source);
 
   return <>
 
@@ -63,7 +58,7 @@ export function OllamaSourceSetup(props: { sourceId: DModelSourceId }) {
 
     {isError && <InlineError error={error} />}
 
-    {adminOpen && <OllamaAdmin access={access} onClose={() => setAdminOpen(false)} />}
+    {adminOpen && <OllamaAdministration access={access} onClose={() => setAdminOpen(false)} />}
 
   </>;
 }
-36
@@ -1,13 +1,14 @@
|
||||
import { backendCaps } from '~/modules/backend/state-backend';
|
||||
|
||||
import { OllamaIcon } from '~/common/components/icons/OllamaIcon';
|
||||
import { apiAsync } from '~/common/util/trpc.client';
|
||||
import { apiAsync, apiQuery } from '~/common/util/trpc.client';
|
||||
|
||||
import type { IModelVendor } from '../IModelVendor';
|
||||
import type { VChatMessageIn, VChatMessageOrFunctionCallOut, VChatMessageOut } from '../../transports/chatGenerate';
|
||||
import type { OllamaAccessSchema } from '../../transports/server/ollama/ollama.router';
|
||||
import type { OllamaAccessSchema } from '../../server/ollama/ollama.router';
|
||||
import type { VChatMessageOut } from '../../llm.client';
|
||||
import { unifiedStreamingClient } from '../unifiedStreamingClient';
|
||||
|
||||
import { LLMOptionsOpenAI } from '../openai/openai.vendor';
|
||||
import type { LLMOptionsOpenAI } from '../openai/openai.vendor';
|
||||
import { OpenAILLMOptions } from '../openai/OpenAILLMOptions';
|
||||
|
||||
import { OllamaSourceSetup } from './OllamaSourceSetup';
|
||||
@@ -18,7 +19,7 @@ export interface SourceSetupOllama {
|
||||
}
|
||||
|
||||
|
||||
export const ModelVendorOllama: IModelVendor<SourceSetupOllama, LLMOptionsOpenAI, OllamaAccessSchema> = {
|
||||
export const ModelVendorOllama: IModelVendor<SourceSetupOllama, OllamaAccessSchema, LLMOptionsOpenAI> = {
|
||||
id: 'ollama',
|
||||
name: 'Ollama',
|
||||
rank: 22,
|
||||
@@ -32,40 +33,45 @@ export const ModelVendorOllama: IModelVendor<SourceSetupOllama, LLMOptionsOpenAI
|
||||
LLMOptionsComponent: OpenAILLMOptions,
|
||||
|
||||
// functions
|
||||
getAccess: (partialSetup): OllamaAccessSchema => ({
|
||||
getTransportAccess: (partialSetup): OllamaAccessSchema => ({
|
||||
dialect: 'ollama',
|
||||
ollamaHost: partialSetup?.ollamaHost || '',
|
||||
}),
|
||||
callChatGenerate(llm, messages: VChatMessageIn[], maxTokens?: number): Promise<VChatMessageOut> {
|
||||
return ollamaCallChatGenerate(this.getAccess(llm._source.setup), llm.options, messages, maxTokens);
|
||||
|
||||
// List Models
|
||||
rpcUpdateModelsQuery: (access, enabled, onSuccess) => {
|
||||
return apiQuery.llmOllama.listModels.useQuery({ access }, {
|
||||
enabled: enabled,
|
||||
onSuccess: onSuccess,
|
||||
refetchOnWindowFocus: false,
|
||||
staleTime: Infinity,
|
||||
});
|
||||
},
|
||||
callChatGenerateWF(): Promise<VChatMessageOrFunctionCallOut> {
|
||||
throw new Error('Ollama does not support "Functions" yet');
|
||||
|
||||
// Chat Generate (non-streaming) with Functions
|
||||
rpcChatGenerateOrThrow: async (access, llmOptions, messages, functions, forceFunctionName, maxTokens) => {
|
||||
if (functions?.length || forceFunctionName)
|
||||
throw new Error('Ollama does not support functions');
|
||||
|
||||
const { llmRef, llmTemperature = 0.5, llmResponseTokens } = llmOptions;
|
||||
try {
|
||||
return await apiAsync.llmOllama.chatGenerate.mutate({
|
||||
access,
|
||||
model: {
|
||||
id: llmRef!,
|
||||
temperature: llmTemperature,
|
||||
maxTokens: maxTokens || llmResponseTokens || 1024,
|
||||
},
|
||||
history: messages,
|
||||
}) as VChatMessageOut;
|
||||
} catch (error: any) {
|
||||
const errorMessage = error?.message || error?.toString() || 'Ollama Chat Generate Error';
|
||||
console.error(`ollama.rpcChatGenerateOrThrow: ${errorMessage}`);
|
||||
throw new Error(errorMessage);
|
||||
}
|
||||
},
|
||||
|
||||
// Chat Generate (streaming) with Functions
|
||||
streamingChatGenerateOrThrow: unifiedStreamingClient,
|
||||
|
||||
};
|
||||
|
||||
|
||||
/**
|
||||
* This function either returns the LLM message, or throws a descriptive error string
|
||||
*/
|
||||
async function ollamaCallChatGenerate<TOut = VChatMessageOut>(
|
||||
access: OllamaAccessSchema, llmOptions: Partial<LLMOptionsOpenAI>, messages: VChatMessageIn[],
|
||||
maxTokens?: number,
|
||||
): Promise<TOut> {
|
||||
const { llmRef, llmTemperature = 0.5, llmResponseTokens } = llmOptions;
|
||||
try {
|
||||
return await apiAsync.llmOllama.chatGenerate.mutate({
|
||||
access,
|
||||
model: {
|
||||
id: llmRef!,
|
||||
temperature: llmTemperature,
|
||||
maxTokens: maxTokens || llmResponseTokens || 1024,
|
||||
},
|
||||
history: messages,
|
||||
}) as TOut;
|
||||
} catch (error: any) {
|
||||
const errorMessage = error?.message || error?.toString() || 'Ollama Chat Generate Error';
|
||||
console.error(`ollamaCallChatGenerate: ${errorMessage}`);
|
||||
throw new Error(errorMessage);
|
||||
}
|
||||
}
|
||||
|
||||
+6 -12
@@ -6,10 +6,10 @@ import { FormTextField } from '~/common/components/forms/FormTextField';
 import { InlineError } from '~/common/components/InlineError';
 import { Link } from '~/common/components/Link';
 import { SetupFormRefetchButton } from '~/common/components/forms/SetupFormRefetchButton';
-import { apiQuery } from '~/common/util/trpc.client';
 
-import { DModelSourceId, useModelsStore, useSourceSetup } from '../../store-llms';
-import { modelDescriptionToDLLM } from '../openai/OpenAISourceSetup';
+import { DModelSourceId } from '../../store-llms';
+import { useLlmUpdateModels } from '../useLlmUpdateModels';
+import { useSourceSetup } from '../useSourceSetup';
 
 import { ModelVendorOoobabooga } from './oobabooga.vendor';
 
@@ -18,20 +18,14 @@ export function OobaboogaSourceSetup(props: { sourceId: DModelSourceId }) {
 
   // external state
   const { source, sourceHasLLMs, access, updateSetup } =
-    useSourceSetup(props.sourceId, ModelVendorOoobabooga.getAccess);
+    useSourceSetup(props.sourceId, ModelVendorOoobabooga);
 
   // derived state
   const { oaiHost } = access;
 
   // fetch models
-  const { isFetching, refetch, isError, error } = apiQuery.llmOpenAI.listModels.useQuery({ access }, {
-    enabled: false, // !hasModels && !!asValidURL(normSetup.oaiHost),
-    onSuccess: models => source && useModelsStore.getState().setLLMs(
-      models.models.map(model => modelDescriptionToDLLM(model, source)),
-      props.sourceId,
-    ),
-    staleTime: Infinity,
-  });
+  const { isFetching, refetch, isError, error } =
+    useLlmUpdateModels(ModelVendorOoobabooga, access, false /* !hasModels && !!asValidURL(normSetup.oaiHost) */, source);
 
   return <>
 
+9 -11
@@ -1,10 +1,9 @@
 import { OobaboogaIcon } from '~/common/components/icons/OobaboogaIcon';
 
 import type { IModelVendor } from '../IModelVendor';
-import type { OpenAIAccessSchema } from '../../transports/server/openai/openai.router';
-import type { VChatFunctionIn, VChatMessageIn, VChatMessageOrFunctionCallOut, VChatMessageOut } from '../../transports/chatGenerate';
+import type { OpenAIAccessSchema } from '../../server/openai/openai.router';
 
-import { LLMOptionsOpenAI, openAICallChatGenerate } from '../openai/openai.vendor';
+import { LLMOptionsOpenAI, ModelVendorOpenAI } from '../openai/openai.vendor';
 import { OpenAILLMOptions } from '../openai/OpenAILLMOptions';
 
 import { OobaboogaSourceSetup } from './OobaboogaSourceSetup';
@@ -14,7 +13,7 @@ export interface SourceSetupOobabooga {
   oaiHost: string; // use OpenAI-compatible non-default hosts (full origin path)
 }
 
-export const ModelVendorOoobabooga: IModelVendor<SourceSetupOobabooga, LLMOptionsOpenAI, OpenAIAccessSchema> = {
+export const ModelVendorOoobabooga: IModelVendor<SourceSetupOobabooga, OpenAIAccessSchema, LLMOptionsOpenAI> = {
   id: 'oobabooga',
   name: 'Oobabooga',
   rank: 25,
@@ -30,7 +29,7 @@ export const ModelVendorOoobabooga: IModelVendor<SourceSetupOobabooga, LLMOption
   initializeSetup: (): SourceSetupOobabooga => ({
     oaiHost: 'http://127.0.0.1:5000',
   }),
-  getAccess: (partialSetup): OpenAIAccessSchema => ({
+  getTransportAccess: (partialSetup): OpenAIAccessSchema => ({
     dialect: 'oobabooga',
     oaiKey: '',
     oaiOrg: '',
@@ -38,10 +37,9 @@ export const ModelVendorOoobabooga: IModelVendor<SourceSetupOobabooga, LLMOption
     heliKey: '',
     moderationCheck: false,
   }),
-  callChatGenerate(llm, messages: VChatMessageIn[], maxTokens?: number): Promise<VChatMessageOut> {
-    return openAICallChatGenerate(this.getAccess(llm._source.setup), llm.options, messages, null, null, maxTokens);
-  },
-  callChatGenerateWF(llm, messages: VChatMessageIn[], functions: VChatFunctionIn[] | null, forceFunctionName: string | null, maxTokens?: number): Promise<VChatMessageOrFunctionCallOut> {
-    return openAICallChatGenerate(this.getAccess(llm._source.setup), llm.options, messages, functions, forceFunctionName, maxTokens);
-  },
+
+  // OpenAI transport (oobabooga dialect in 'access')
+  rpcUpdateModelsQuery: ModelVendorOpenAI.rpcUpdateModelsQuery,
+  rpcChatGenerateOrThrow: ModelVendorOpenAI.rpcChatGenerateOrThrow,
+  streamingChatGenerateOrThrow: ModelVendorOpenAI.streamingChatGenerateOrThrow,
 };
+7 -41
@@ -9,13 +9,13 @@ import { FormTextField } from '~/common/components/forms/FormTextField';
 import { InlineError } from '~/common/components/InlineError';
 import { Link } from '~/common/components/Link';
 import { SetupFormRefetchButton } from '~/common/components/forms/SetupFormRefetchButton';
-import { apiQuery } from '~/common/util/trpc.client';
 import { useToggleableBoolean } from '~/common/util/useToggleableBoolean';
 
-import type { ModelDescriptionSchema } from '../../transports/server/server.schemas';
-import { DLLM, DModelSource, DModelSourceId, useModelsStore, useSourceSetup } from '../../store-llms';
+import { DModelSourceId } from '../../store-llms';
+import { useLlmUpdateModels } from '../useLlmUpdateModels';
+import { useSourceSetup } from '../useSourceSetup';
 
-import { isValidOpenAIApiKey, LLMOptionsOpenAI, ModelVendorOpenAI } from './openai.vendor';
+import { isValidOpenAIApiKey, ModelVendorOpenAI } from './openai.vendor';
 
 
 // avoid repeating it all over
@@ -29,7 +29,7 @@ export function OpenAISourceSetup(props: { sourceId: DModelSourceId }) {
 
   // external state
   const { source, sourceHasLLMs, access, updateSetup } =
-    useSourceSetup(props.sourceId, ModelVendorOpenAI.getAccess);
+    useSourceSetup(props.sourceId, ModelVendorOpenAI);
 
   // derived state
   const { oaiKey, oaiOrg, oaiHost, heliKey, moderationCheck } = access;
@@ -40,15 +40,8 @@ export function OpenAISourceSetup(props: { sourceId: DModelSourceId }) {
   const shallFetchSucceed = oaiKey ? keyValid : !needsUserKey;
 
   // fetch models
-  const { isFetching, refetch, isError, error } = apiQuery.llmOpenAI.listModels.useQuery({ access }, {
-    enabled: !sourceHasLLMs && shallFetchSucceed,
-    onSuccess: models => source && useModelsStore.getState().setLLMs(
-      models.models.map(model => modelDescriptionToDLLM(model, source)),
-      props.sourceId,
-    ),
-    staleTime: Infinity,
-  });
-
+  const { isFetching, refetch, isError, error } =
+    useLlmUpdateModels(ModelVendorOpenAI, access, !sourceHasLLMs && shallFetchSucceed, source);
 
   return <>
 
@@ -110,30 +103,3 @@
 
   </>;
 }
-
-
-export function modelDescriptionToDLLM<TSourceSetup>(model: ModelDescriptionSchema, source: DModelSource<TSourceSetup>): DLLM<TSourceSetup, LLMOptionsOpenAI> {
-  const maxOutputTokens = model.maxCompletionTokens || Math.round((model.contextWindow || 4096) / 2);
-  const llmResponseTokens = Math.round(maxOutputTokens / (model.maxCompletionTokens ? 2 : 4));
-  return {
-    id: `${source.id}-${model.id}`,
-
-    label: model.label,
-    created: model.created || 0,
-    updated: model.updated || 0,
-    description: model.description,
-    tags: [], // ['stream', 'chat'],
-    contextTokens: model.contextWindow,
-    maxOutputTokens: maxOutputTokens,
-    hidden: !!model.hidden,
-
-    sId: source.id,
-    _source: source,
-
-    options: {
-      llmRef: model.id,
-      llmTemperature: 0.5,
-      llmResponseTokens: llmResponseTokens,
-    },
-  };
-}
+40
-40
@@ -1,11 +1,12 @@
import { backendCaps } from '~/modules/backend/state-backend';

import { OpenAIIcon } from '~/common/components/icons/OpenAIIcon';
import { apiAsync } from '~/common/util/trpc.client';
import { apiAsync, apiQuery } from '~/common/util/trpc.client';

import type { IModelVendor } from '../IModelVendor';
import type { OpenAIAccessSchema } from '../../transports/server/openai/openai.router';
import type { VChatFunctionIn, VChatMessageIn, VChatMessageOrFunctionCallOut, VChatMessageOut } from '../../transports/chatGenerate';
import type { OpenAIAccessSchema } from '../../server/openai/openai.router';
import type { VChatMessageOrFunctionCallOut } from '../../llm.client';
import { unifiedStreamingClient } from '../unifiedStreamingClient';

import { OpenAILLMOptions } from './OpenAILLMOptions';
import { OpenAISourceSetup } from './OpenAISourceSetup';
@@ -28,7 +29,7 @@ export interface LLMOptionsOpenAI {
llmResponseTokens: number;
}

export const ModelVendorOpenAI: IModelVendor<SourceSetupOpenAI, LLMOptionsOpenAI, OpenAIAccessSchema> = {
export const ModelVendorOpenAI: IModelVendor<SourceSetupOpenAI, OpenAIAccessSchema, LLMOptionsOpenAI> = {
id: 'openai',
name: 'OpenAI',
rank: 10,
@@ -42,7 +43,7 @@ export const ModelVendorOpenAI: IModelVendor<SourceSetupOpenAI, LLMOptionsOpenAI
LLMOptionsComponent: OpenAILLMOptions,

// functions
getAccess: (partialSetup): OpenAIAccessSchema => ({
getTransportAccess: (partialSetup): OpenAIAccessSchema => ({
dialect: 'openai',
oaiKey: '',
oaiOrg: '',
@@ -51,41 +52,40 @@ export const ModelVendorOpenAI: IModelVendor<SourceSetupOpenAI, LLMOptionsOpenAI
moderationCheck: false,
...partialSetup,
}),
callChatGenerate(llm, messages: VChatMessageIn[], maxTokens?: number): Promise<VChatMessageOut> {
const access = this.getAccess(llm._source.setup);
return openAICallChatGenerate(access, llm.options, messages, null, null, maxTokens);

// List Models
rpcUpdateModelsQuery: (access, enabled, onSuccess) => {
return apiQuery.llmOpenAI.listModels.useQuery({ access }, {
enabled: enabled,
onSuccess: onSuccess,
refetchOnWindowFocus: false,
staleTime: Infinity,
});
},
callChatGenerateWF(llm, messages: VChatMessageIn[], functions: VChatFunctionIn[] | null, forceFunctionName: string | null, maxTokens?: number): Promise<VChatMessageOrFunctionCallOut> {
const access = this.getAccess(llm._source.setup);
return openAICallChatGenerate(access, llm.options, messages, functions, forceFunctionName, maxTokens);

// Chat Generate (non-streaming) with Functions
rpcChatGenerateOrThrow: async (access, llmOptions, messages, functions, forceFunctionName, maxTokens) => {
const { llmRef, llmTemperature = 0.5, llmResponseTokens } = llmOptions;
try {
return await apiAsync.llmOpenAI.chatGenerateWithFunctions.mutate({
access,
model: {
id: llmRef!,
temperature: llmTemperature,
maxTokens: maxTokens || llmResponseTokens || 1024,
},
functions: functions ?? undefined,
forceFunctionName: forceFunctionName ?? undefined,
history: messages,
}) as VChatMessageOrFunctionCallOut;
} catch (error: any) {
const errorMessage = error?.message || error?.toString() || 'OpenAI Chat Generate Error';
console.error(`openai.rpcChatGenerateOrThrow: ${errorMessage}`);
throw new Error(errorMessage);
}
},

// Chat Generate (streaming) with Functions
streamingChatGenerateOrThrow: unifiedStreamingClient,

};


/**
* This function either returns the LLM message, or function calls, or throws a descriptive error string
*/
export async function openAICallChatGenerate<TOut = VChatMessageOut | VChatMessageOrFunctionCallOut>(
access: OpenAIAccessSchema, llmOptions: Partial<LLMOptionsOpenAI>, messages: VChatMessageIn[],
functions: VChatFunctionIn[] | null, forceFunctionName: string | null,
maxTokens?: number,
): Promise<TOut> {
const { llmRef, llmTemperature = 0.5, llmResponseTokens } = llmOptions;
try {
return await apiAsync.llmOpenAI.chatGenerateWithFunctions.mutate({
access,
model: {
id: llmRef!,
temperature: llmTemperature,
maxTokens: maxTokens || llmResponseTokens || 1024,
},
functions: functions ?? undefined,
forceFunctionName: forceFunctionName ?? undefined,
history: messages,
}) as TOut;
} catch (error: any) {
const errorMessage = error?.message || error?.toString() || 'OpenAI Chat Generate Error';
console.error(`openAICallChatGenerate: ${errorMessage}`);
throw new Error(errorMessage);
}
}
+40
-21
@@ -1,15 +1,16 @@
import * as React from 'react';

import { Typography } from '@mui/joy';
import { Button, Typography } from '@mui/joy';

import { FormInputKey } from '~/common/components/forms/FormInputKey';
import { InlineError } from '~/common/components/InlineError';
import { Link } from '~/common/components/Link';
import { SetupFormRefetchButton } from '~/common/components/forms/SetupFormRefetchButton';
import { apiQuery } from '~/common/util/trpc.client';
import { getCallbackUrl } from '~/common/app.routes';

import { DModelSourceId, useModelsStore, useSourceSetup } from '../../store-llms';
import { modelDescriptionToDLLM } from '../openai/OpenAISourceSetup';
import { DModelSourceId } from '../../store-llms';
import { useLlmUpdateModels } from '../useLlmUpdateModels';
import { useSourceSetup } from '../useSourceSetup';

import { isValidOpenRouterKey, ModelVendorOpenRouter } from './openrouter.vendor';

@@ -18,7 +19,7 @@ export function OpenRouterSourceSetup(props: { sourceId: DModelSourceId }) {

// external state
const { source, sourceHasLLMs, access, updateSetup } =
useSourceSetup(props.sourceId, ModelVendorOpenRouter.getAccess);
useSourceSetup(props.sourceId, ModelVendorOpenRouter);

// derived state
const { oaiKey } = access;
@@ -29,31 +30,33 @@ export function OpenRouterSourceSetup(props: { sourceId: DModelSourceId }) {
const shallFetchSucceed = oaiKey ? keyValid : !needsUserKey;

// fetch models
const { isFetching, refetch, isError, error } = apiQuery.llmOpenAI.listModels.useQuery({ access }, {
enabled: !sourceHasLLMs && shallFetchSucceed,
onSuccess: models => source && useModelsStore.getState().setLLMs(
models.models.map(model => modelDescriptionToDLLM(model, source)),
props.sourceId,
),
staleTime: Infinity,
});
const { isFetching, refetch, isError, error } =
useLlmUpdateModels(ModelVendorOpenRouter, access, !sourceHasLLMs && shallFetchSucceed, source);


const handleOpenRouterLogin = () => {
// replace the current page with the OAuth page
const callbackUrl = getCallbackUrl('openrouter');
const oauthUrl = 'https://openrouter.ai/auth?callback_url=' + encodeURIComponent(callbackUrl);
window.open(oauthUrl, '_self');
// ...bye / see you soon at the callback location...
};


return <>

{/*<Box sx={{ display: 'flex', gap: 1, alignItems: 'center' }}>*/}
{/*<OpenRouterIcon />*/}
<Typography level='body-sm'>
<Link href='https://openrouter.ai/keys' target='_blank'>OpenRouter</Link> is an independent, premium service
<Link href='https://openrouter.ai/keys' target='_blank'>OpenRouter</Link> is an independent service
granting access to <Link href='https://openrouter.ai/docs#models' target='_blank'>exclusive models</Link> such
as GPT-4 32k, Claude, and more, typically unavailable to the public. <Link
href='https://github.com/enricoros/big-agi/blob/main/docs/config-openrouter.md'>Configuration & documentation</Link>.
as GPT-4 32k, Claude, and more. <Link
href='https://github.com/enricoros/big-agi/blob/main/docs/config-openrouter.md' target='_blank'>
Configuration & documentation</Link>.
</Typography>
{/*</Box>*/}

<FormInputKey
id='openrouter-key' label='OpenRouter API Key'
rightLabel={<>{needsUserKey
? !oaiKey && <Link level='body-sm' href='https://openrouter.ai/keys' target='_blank'>create key</Link>
? !oaiKey && <Link level='body-sm' href='https://openrouter.ai/keys' target='_blank'>your keys</Link>
: '✔️ already set in server'
} {oaiKey && keyValid && <Link level='body-sm' href='https://openrouter.ai/activity' target='_blank'>check usage</Link>}
</>}
@@ -62,7 +65,23 @@ export function OpenRouterSourceSetup(props: { sourceId: DModelSourceId }) {
placeholder='sk-or-...'
/>

<SetupFormRefetchButton refetch={refetch} disabled={!shallFetchSucceed || isFetching} error={isError} />
<Typography level='body-sm'>
🎁 A selection of <Link href='https://openrouter.ai/docs#models' target='_blank'>OpenRouter models</Link> are
made available without charge. You can get an API key by using the Login button below.
</Typography>

<SetupFormRefetchButton
refetch={refetch} disabled={!shallFetchSucceed || isFetching} error={isError}
leftButton={
<Button
color='neutral' variant={(needsUserKey && !keyValid) ? 'solid' : 'outlined'}
onClick={handleOpenRouterLogin}
endDecorator={(needsUserKey && !keyValid) ? '🎁' : undefined}
>
OpenRouter Login
</Button>
}
/>

{isError && <InlineError error={error} />}

+10
-11
@@ -3,10 +3,9 @@ import { backendCaps } from '~/modules/backend/state-backend';
import { OpenRouterIcon } from '~/common/components/icons/OpenRouterIcon';

import type { IModelVendor } from '../IModelVendor';
import type { OpenAIAccessSchema } from '../../transports/server/openai/openai.router';
import type { VChatFunctionIn, VChatMessageIn, VChatMessageOrFunctionCallOut, VChatMessageOut } from '../../transports/chatGenerate';
import type { OpenAIAccessSchema } from '../../server/openai/openai.router';

import { LLMOptionsOpenAI, openAICallChatGenerate } from '../openai/openai.vendor';
import { LLMOptionsOpenAI, ModelVendorOpenAI } from '../openai/openai.vendor';
import { OpenAILLMOptions } from '../openai/OpenAILLMOptions';

import { OpenRouterSourceSetup } from './OpenRouterSourceSetup';
@@ -32,12 +31,13 @@ export interface SourceSetupOpenRouter {
* [x] decide whether to do UI work to improve the appearance - prioritized models
* [x] works!
*/
export const ModelVendorOpenRouter: IModelVendor<SourceSetupOpenRouter, LLMOptionsOpenAI, OpenAIAccessSchema> = {
export const ModelVendorOpenRouter: IModelVendor<SourceSetupOpenRouter, OpenAIAccessSchema, LLMOptionsOpenAI> = {
id: 'openrouter',
name: 'OpenRouter',
rank: 12,
location: 'cloud',
instanceLimit: 1,
hasFreeModels: true,
hasBackendCap: () => backendCaps().hasLlmOpenRouter,

// components
@@ -50,7 +50,7 @@ export const ModelVendorOpenRouter: IModelVendor<SourceSetupOpenRouter, LLMOptio
oaiHost: 'https://openrouter.ai/api',
oaiKey: '',
}),
getAccess: (partialSetup): OpenAIAccessSchema => ({
getTransportAccess: (partialSetup): OpenAIAccessSchema => ({
dialect: 'openrouter',
oaiKey: partialSetup?.oaiKey || '',
oaiOrg: '',
@@ -58,10 +58,9 @@ export const ModelVendorOpenRouter: IModelVendor<SourceSetupOpenRouter, LLMOptio
heliKey: '',
moderationCheck: false,
}),
callChatGenerate(llm, messages: VChatMessageIn[], maxTokens?: number): Promise<VChatMessageOut> {
return openAICallChatGenerate(this.getAccess(llm._source.setup), llm.options, messages, null, null, maxTokens);
},
callChatGenerateWF(llm, messages: VChatMessageIn[], functions: VChatFunctionIn[] | null, forceFunctionName: string | null, maxTokens?: number): Promise<VChatMessageOrFunctionCallOut> {
return openAICallChatGenerate(this.getAccess(llm._source.setup), llm.options, messages, functions, forceFunctionName, maxTokens);
},

// OpenAI transport ('openrouter' dialect in 'access')
rpcUpdateModelsQuery: ModelVendorOpenAI.rpcUpdateModelsQuery,
rpcChatGenerateOrThrow: ModelVendorOpenAI.rpcChatGenerateOrThrow,
streamingChatGenerateOrThrow: ModelVendorOpenAI.streamingChatGenerateOrThrow,
};
+13
-27
@@ -1,11 +1,10 @@
import { apiAsync } from '~/common/util/trpc.client';

import type { DLLM, DLLMId } from '../store-llms';
import { findVendorForLlmOrThrow } from '../vendors/vendor.registry';
import type { ChatStreamingFirstOutputPacketSchema, ChatStreamingInputSchema } from '../server/llm.server.streaming';
import type { DLLMId } from '../store-llms';
import type { VChatFunctionIn, VChatMessageIn } from '../llm.client';

import type { ChatStreamFirstPacketSchema, ChatStreamInputSchema } from './server/openai/openai.streaming';
import type { OpenAIWire } from './server/openai/openai.wiretypes';
import type { VChatMessageIn } from './chatGenerate';
import type { OpenAIWire } from '../server/openai/openai.wiretypes';


/**
@@ -15,27 +14,14 @@ import type { VChatMessageIn } from './chatGenerate';
* Vendor-specific implementation is on our server backend (API) code. This function tries to be
* as generic as possible.
*
* @param llmId LLM to use
* @param messages the history of messages to send to the API endpoint
* @param abortSignal used to initiate a client-side abort of the fetch request to the API endpoint
* @param onUpdate callback when a piece of a message (text, model name, typing..) is received
* NOTE: onUpdate is callback when a piece of a message (text, model name, typing..) is received
*/
export async function streamChat(
export async function unifiedStreamingClient<TSourceSetup = unknown, TLLMOptions = unknown>(
access: ChatStreamingInputSchema['access'],
llmId: DLLMId,
llmOptions: TLLMOptions,
messages: VChatMessageIn[],
abortSignal: AbortSignal,
onUpdate: (update: Partial<{ text: string, typing: boolean, originLLM: string }>, done: boolean) => void,
): Promise<void> {
const { llm, vendor } = findVendorForLlmOrThrow(llmId);
const access = vendor.getAccess(llm._source.setup) as ChatStreamInputSchema['access'];
return await vendorStreamChat(access, llm, messages, abortSignal, onUpdate);
}


async function vendorStreamChat<TSourceSetup = unknown, TLLMOptions = unknown>(
access: ChatStreamInputSchema['access'],
llm: DLLM<TSourceSetup, TLLMOptions>,
messages: VChatMessageIn[],
functions: VChatFunctionIn[] | null, forceFunctionName: string | null,
abortSignal: AbortSignal,
onUpdate: (update: Partial<{ text: string, typing: boolean, originLLM: string }>, done: boolean) => void,
) {
@@ -79,12 +65,12 @@ async function vendorStreamChat<TSourceSetup = unknown, TLLMOptions = unknown>(
}

// model params (llm)
const { llmRef, llmTemperature, llmResponseTokens } = (llm.options as any) || {};
const { llmRef, llmTemperature, llmResponseTokens } = (llmOptions as any) || {};
if (!llmRef || llmTemperature === undefined || llmResponseTokens === undefined)
throw new Error(`Error in configuration for model ${llm.id}: ${JSON.stringify(llm.options)}`);
throw new Error(`Error in configuration for model ${llmId}: ${JSON.stringify(llmOptions)}`);

// prepare the input, similarly to the tRPC openAI.chatGenerate
const input: ChatStreamInputSchema = {
const input: ChatStreamingInputSchema = {
access,
model: {
id: llmRef,
@@ -131,7 +117,7 @@ async function vendorStreamChat<TSourceSetup = unknown, TLLMOptions = unknown>(
incrementalText = incrementalText.substring(endOfJson + 1);
parsedFirstPacket = true;
try {
const parsed: ChatStreamFirstPacketSchema = JSON.parse(json);
const parsed: ChatStreamingFirstOutputPacketSchema = JSON.parse(json);
onUpdate({ originLLM: parsed.model }, false);
} catch (e) {
// error parsing JSON, ignore
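The first-packet handling changed above relies on the stream prefixing one small JSON object (carrying the model name) before the raw text begins. A standalone sketch of splitting off that prefix, with a hypothetical helper name not present in the codebase:

```typescript
// Hypothetical sketch: split a leading JSON header packet from a streamed buffer,
// in the spirit of the first-packet { model } handling shown in the diff above
function splitFirstPacket(buffer: string): { model?: string; rest: string } {
  const endOfJson = buffer.indexOf('}');
  if (!buffer.startsWith('{') || endOfJson < 0)
    return { rest: buffer }; // no header packet present: everything is text
  try {
    const parsed = JSON.parse(buffer.substring(0, endOfJson + 1));
    return { model: parsed.model, rest: buffer.substring(endOfJson + 1) };
  } catch {
    return { rest: buffer }; // malformed header: treat the whole buffer as text
  }
}

console.log(splitFirstPacket('{"model":"gpt-4"}Hello')); // { model: 'gpt-4', rest: 'Hello' }
```

Note this only works because the real header is a flat object; a nested object would need a proper incremental parser rather than `indexOf('}')`.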
+47
@@ -0,0 +1,47 @@
import type { IModelVendor } from './IModelVendor';
import type { ModelDescriptionSchema } from '../server/llm.server.types';
import { DLLM, DModelSource, useModelsStore } from '../store-llms';


/**
* Hook that fetches the list of models from the vendor and updates the store,
* while returning the fetch state.
*/
export function useLlmUpdateModels<TSourceSetup, TAccess, TLLMOptions>(vendor: IModelVendor<TSourceSetup, TAccess, TLLMOptions>, access: TAccess, enabled: boolean, source: DModelSource<TSourceSetup>) {
return vendor.rpcUpdateModelsQuery(access, enabled, data => source && updateModelsFn(data, source));
}


function updateModelsFn<TSourceSetup>(data: { models: ModelDescriptionSchema[] }, source: DModelSource<TSourceSetup>) {
useModelsStore.getState().setLLMs(
data.models.map(model => modelDescriptionToDLLMOpenAIOptions(model, source)),
source.id,
);
}

function modelDescriptionToDLLMOpenAIOptions<TSourceSetup, TLLMOptions>(model: ModelDescriptionSchema, source: DModelSource<TSourceSetup>): DLLM<TSourceSetup, TLLMOptions> {
const maxOutputTokens = model.maxCompletionTokens || Math.round((model.contextWindow || 4096) / 2);
const llmResponseTokens = Math.round(maxOutputTokens / (model.maxCompletionTokens ? 2 : 4));
return {
id: `${source.id}-${model.id}`,

label: model.label,
created: model.created || 0,
updated: model.updated || 0,
description: model.description,
tags: [], // ['stream', 'chat'],
contextTokens: model.contextWindow,
maxOutputTokens: maxOutputTokens,
hidden: !!model.hidden,

sId: source.id,
_source: source,

options: {
llmRef: model.id,
// @ts-ignore FIXME: large assumption that this is LLMOptionsOpenAI object
llmTemperature: 0.5,
llmResponseTokens: llmResponseTokens,
},
};
}
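The token defaults in `modelDescriptionToDLLMOpenAIOptions` above follow a simple rule: without a vendor-declared `maxCompletionTokens`, half the context window becomes the output ceiling, and a quarter of that ceiling becomes the default response budget. A standalone sketch of just that arithmetic (the function name here is illustrative, not part of the codebase):

```typescript
// Illustrative sketch of the token-budget defaulting used above (hypothetical helper name)
function defaultTokenBudgets(contextWindow?: number, maxCompletionTokens?: number) {
  // output ceiling: the declared completion limit, else half the context window (4096 fallback)
  const maxOutputTokens = maxCompletionTokens || Math.round((contextWindow || 4096) / 2);
  // response budget: half of a declared limit, else a quarter of the derived ceiling
  const llmResponseTokens = Math.round(maxOutputTokens / (maxCompletionTokens ? 2 : 4));
  return { maxOutputTokens, llmResponseTokens };
}

console.log(defaultTokenBudgets(8192));       // { maxOutputTokens: 4096, llmResponseTokens: 1024 }
console.log(defaultTokenBudgets(8192, 2048)); // { maxOutputTokens: 2048, llmResponseTokens: 1024 }
```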
+35
@@ -0,0 +1,35 @@
import { shallow } from 'zustand/shallow';

import type { IModelVendor } from './IModelVendor';
import { DModelSource, DModelSourceId, useModelsStore } from '../store-llms';


/**
* Source-specific read/write - great time saver
*/
export function useSourceSetup<TSourceSetup, TAccess, TLLMOptions>(sourceId: DModelSourceId, vendor: IModelVendor<TSourceSetup, TAccess, TLLMOptions>) {

// invalidates only when the setup changes
const { updateSourceSetup, ...rest } = useModelsStore(state => {

// find the source (or null)
const source: DModelSource<TSourceSetup> | null = state.sources.find(source => source.id === sourceId) as DModelSource<TSourceSetup> ?? null;

// (safe) source-derived properties
const sourceSetupValid = (source?.setup && vendor?.validateSetup) ? vendor.validateSetup(source.setup as TSourceSetup) : false;
const sourceLLMs = source ? state.llms.filter(llm => llm._source === source) : [];
const access = vendor.getTransportAccess(source?.setup);

return {
source,
access,
sourceHasLLMs: !!sourceLLMs.length,
sourceSetupValid,
updateSourceSetup: state.updateSourceSetup,
};
}, shallow);

// convenience function for this source
const updateSetup = (partialSetup: Partial<TSourceSetup>) => updateSourceSetup<TSourceSetup>(sourceId, partialSetup);
return { ...rest, updateSetup };
}
+25
-8
@@ -1,24 +1,39 @@
import { ModelVendorAnthropic } from './anthropic/anthropic.vendor';
import { ModelVendorAzure } from './azure/azure.vendor';
import { ModelVendorGemini } from './gemini/gemini.vendor';
import { ModelVendorLocalAI } from './localai/localai.vendor';
import { ModelVendorMistral } from './mistral/mistral.vendor';
import { ModelVendorOllama } from './ollama/ollama.vendor';
import { ModelVendorOoobabooga } from './oobabooga/oobabooga.vendor';
import { ModelVendorOpenAI } from './openai/openai.vendor';
import { ModelVendorOpenRouter } from './openrouter/openrouter.vendor';

import type { IModelVendor } from './IModelVendor';
import { DLLMId, DModelSource, DModelSourceId, findLLMOrThrow } from '../store-llms';
import { IModelVendor, ModelVendorId } from './IModelVendor';

/** Vendor Instances Registry **/
export type ModelVendorId =
| 'anthropic'
| 'azure'
| 'googleai'
| 'localai'
| 'mistral'
| 'ollama'
| 'oobabooga'
| 'openai'
| 'openrouter';

/** Global: Vendor Instances Registry **/
const MODEL_VENDOR_REGISTRY: Record<ModelVendorId, IModelVendor> = {
anthropic: ModelVendorAnthropic,
azure: ModelVendorAzure,
googleai: ModelVendorGemini,
localai: ModelVendorLocalAI,
mistral: ModelVendorMistral,
ollama: ModelVendorOllama,
oobabooga: ModelVendorOoobabooga,
openai: ModelVendorOpenAI,
openrouter: ModelVendorOpenRouter,
};
} as Record<string, IModelVendor>;

const MODEL_VENDOR_DEFAULT: ModelVendorId = 'openai';

@@ -29,13 +44,15 @@ export function findAllVendors(): IModelVendor[] {
return modelVendors;
}

export function findVendorById(vendorId?: ModelVendorId): IModelVendor | null {
return vendorId ? (MODEL_VENDOR_REGISTRY[vendorId] ?? null) : null;
export function findVendorById<TSourceSetup = unknown, TAccess = unknown, TLLMOptions = unknown>(
vendorId?: ModelVendorId,
): IModelVendor<TSourceSetup, TAccess, TLLMOptions> | null {
return vendorId ? (MODEL_VENDOR_REGISTRY[vendorId] as IModelVendor<TSourceSetup, TAccess, TLLMOptions>) ?? null : null;
}

export function findVendorForLlmOrThrow(llmId: DLLMId) {
const llm = findLLMOrThrow(llmId);
const vendor = findVendorById(llm?._source.vId);
export function findVendorForLlmOrThrow<TSourceSetup = unknown, TAccess = unknown, TLLMOptions = unknown>(llmId: DLLMId) {
const llm = findLLMOrThrow<TSourceSetup, TLLMOptions>(llmId);
const vendor = findVendorById<TSourceSetup, TAccess, TLLMOptions>(llm?._source.vId);
if (!vendor) throw new Error(`callChat: Vendor not found for LLM ${llmId}`);
return { llm, vendor };
}
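The registry above is a string-keyed map of vendor singletons with null-safe lookup. The lookup pattern can be sketched in isolation (the types and vendor entries here are placeholders, not the real `IModelVendor` objects):

```typescript
// Minimal sketch of the vendor-registry lookup pattern (placeholder types and data)
type VendorId = 'openai' | 'openrouter';

interface Vendor { id: VendorId; rank: number; }

const REGISTRY: Record<VendorId, Vendor> = {
  openai: { id: 'openai', rank: 10 },
  openrouter: { id: 'openrouter', rank: 12 },
};

// mirrors findVendorById: a missing or unknown id yields null rather than throwing
function lookupVendor(vendorId?: VendorId): Vendor | null {
  return vendorId ? (REGISTRY[vendorId] ?? null) : null;
}

console.log(lookupVendor('openrouter')?.rank); // 12
console.log(lookupVendor());                   // null
```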
@@ -3,9 +3,10 @@ import { createTRPCRouter } from './trpc.server';
import { backendRouter } from '~/modules/backend/backend.router';
import { elevenlabsRouter } from '~/modules/elevenlabs/elevenlabs.router';
import { googleSearchRouter } from '~/modules/google/search.router';
import { llmAnthropicRouter } from '~/modules/llms/transports/server/anthropic/anthropic.router';
import { llmOllamaRouter } from '~/modules/llms/transports/server/ollama/ollama.router';
import { llmOpenAIRouter } from '~/modules/llms/transports/server/openai/openai.router';
import { llmAnthropicRouter } from '~/modules/llms/server/anthropic/anthropic.router';
import { llmGeminiRouter } from '~/modules/llms/server/gemini/gemini.router';
import { llmOllamaRouter } from '~/modules/llms/server/ollama/ollama.router';
import { llmOpenAIRouter } from '~/modules/llms/server/openai/openai.router';
import { prodiaRouter } from '~/modules/prodia/prodia.router';
import { ytPersonaRouter } from '../../apps/personas/ytpersona.router';

@@ -17,6 +18,7 @@ export const appRouterEdge = createTRPCRouter({
elevenlabs: elevenlabsRouter,
googleSearch: googleSearchRouter,
llmAnthropic: llmAnthropicRouter,
llmGemini: llmGeminiRouter,
llmOllama: llmOllamaRouter,
llmOpenAI: llmOpenAIRouter,
prodia: prodiaRouter,
+11
-2
@@ -5,8 +5,8 @@ export const env = createEnv({
server: {

// Backend Postgres, for optional storage via Prisma
POSTGRES_PRISMA_URL: z.string().url().optional(),
POSTGRES_URL_NON_POOLING: z.string().url().optional(),
POSTGRES_PRISMA_URL: z.string().optional(),
POSTGRES_URL_NON_POOLING: z.string().optional(),

// LLM: OpenAI
OPENAI_API_KEY: z.string().optional(),
@@ -21,6 +21,12 @@ export const env = createEnv({
ANTHROPIC_API_KEY: z.string().optional(),
ANTHROPIC_API_HOST: z.string().url().optional(),

// LLM: Google AI's Gemini
GEMINI_API_KEY: z.string().optional(),

// LLM: Mistral
MISTRAL_API_KEY: z.string().optional(),

// LLM: Ollama
OLLAMA_API_HOST: z.string().url().optional(),

@@ -59,6 +65,9 @@ export const env = createEnv({
throw new Error('Invalid environment variable');
},

// matches user expectations - see https://github.com/enricoros/big-AGI/issues/279
emptyStringAsUndefined: true,

// with Next.js >= 13.4.4 we'd only need to destructure client variables
experimental__runtimeEnv: {},
});