Mirror of https://github.com/enricoros/big-AGI.git, synced 2026-05-10 21:50:14 -07:00
Compare commits
88 Commits
SHA1: 0fc83cf6f5, 2949feccd5, d6f1c2da81, fabb433fde, b57445eb14, 5f8f4aba78, d693cdaeba, 39fbcfd97b, 7694bc3d52, 7f21b2ac3d, fdb66da1a7, 6b62a6733b, 5d62056807, efff7126af, 45046c70ed, 7b5b852793, 9952b757b8, b08ecc9012, bc5a38fa89, bee49a4b1c, 0ece1ce58c, fd897b55b2, dd41a402d0, 3f9defd18c, 49c77f5a10, 6b2bfa6060, 8e3f247bfb, 201e3a7252, 044ed4df79, 0df7297cca, 453a3e5751, 34c1c425b9, e0a010189f, 7a07f10ed1, 33cb2b84b2, 3adec85e1f, 18cfe5e296, 566ba366b4, 7ed653b315, cb333c33d7, 22ba37074b, 84d7b7644a, 71445dafc8, 66a5ad7f00, 09f80adfaa, 9febd97065, 5219f9928d, aec9f4665f, db48465204, c2c858730a, 402bde9a81, ba1c0ba0d9, 084d77cd78, 30c17a9b73, 2442463da3, 84a3e8cfdb, 6ae440d252, c0c724afc1, a265112ce1, 75605ed408, ad38ff4157, 08c60e53b1, d0dcb2ac02, fbeb604b26, c4f3b1df77, 5a1f9caaac, 2fc70d5e95, 43adadef78, 96f6e7628b, 32ad82bcee, 3d72aec369, d244ee2cca, cc8a235ae3, ae348812de, 6053636f66, f2e2aee672, 11cbb2bbf0, 30bd19d6ce, d0b5c02062, 771192e406, 13f502bd76, 11055b12ca, d0ea96eec0, 02eafc03f1, 33d07a0313, 763b852148, d5b0617fd7, e3ce83674c
@@ -65,7 +65,11 @@ I need the following from you:

 ### GitHub release

-Now paste the former release (or 1.5.0 which was accurate and great), including the new contributors and
+```markdown
+Please create the 1.2.3 Release Notes for GitHub. The following were the Release Notes for 1.1.0. Use a truthful and honest tone, understanding that people's time and attention span is short. Today is 2023-12-20.
+```
+
+Now paste-attachment the former release notes (or 1.5.0 which was accurate and great), including the new contributors and
 some stats (# of commits, etc.), and roll it for the new release.

 ### Discord announcement
@@ -13,7 +13,7 @@ on:
 push:
   branches:
     - main
-    - main-stable # Trigger on pushes to the main-stable branch
+    #- main-stable # Disabled as the v* tag is used for stable releases
   tags:
     - 'v*' # Trigger on version tags (e.g., v1.7.0)
@@ -1,8 +1,8 @@
 # BIG-AGI 🧠✨

-Welcome to big-AGI 👋, the GPT application for professionals that need form, function,
-simplicity, and speed. Powered by the latest models from 7 vendors, including
-open-source, `big-AGI` offers best-in-class Voice and Chat with AI Personas,
+Welcome to big-AGI 👋, the GPT application for professionals that need function, form,
+simplicity, and speed. Powered by the latest models from 8 vendors and
+open-source model servers, `big-AGI` offers best-in-class Voice and Chat with AI Personas,
 visualizations, coding, drawing, calling, and quite more -- all in a polished UX.

 Pros use big-AGI. 🚀 Developers love big-AGI. 🤖

@@ -11,7 +11,7 @@ Pros use big-AGI. 🚀 Developers love big-AGI. 🤖

 Or fork & run on Vercel

-[](https://vercel.com/new/clone?repository-url=https%3A%2F%2Fgithub.com%2Fenricoros%2Fbig-agi&env=OPENAI_API_KEY,OPENAI_API_HOST&envDescription=OpenAI%20KEY%20for%20your%20deployment.%20Set%20HOST%20only%20if%20non-default.)
+[](https://vercel.com/new/clone?repository-url=https%3A%2F%2Fgithub.com%2Fenricoros%2Fbig-agi&env=OPENAI_API_KEY&envDescription=Backend%20API%20keys%2C%20optional%20and%20may%20be%20overridden%20by%20the%20UI.&envLink=https%3A%2F%2Fgithub.com%2Fenricoros%2Fbig-AGI%2Fblob%2Fmain%2Fdocs%2Fenvironment-variables.md&project-name=big-agi)

 ## 👉 [roadmap](https://github.com/users/enricoros/projects/4/views/2)

@@ -21,7 +21,19 @@ shows the current developments and future ideas.

 - Got a suggestion? [_Add your roadmap ideas_](https://github.com/enricoros/big-agi/issues/new?&template=roadmap-request.md)
 - Want to contribute? [_Pick up a task!_](https://github.com/users/enricoros/projects/4/views/4) - _easy_ to _pro_

-### What's New in 1.7.0 · Dec 10, 2023 · Attachment Theory 🌟
+### What's New in 1.8.0 · Dec 20, 2023 · To The Moon And Back · 🚀🌕🔙
+
+- **Google Gemini Support**: Use the newest Google models. [#275](https://github.com/enricoros/big-agi/issues/275)
+- **Mistral Platform**: Mixtral and future models support. [#273](https://github.com/enricoros/big-agi/issues/273)
+- **Diagram Instructions**. Thanks to @joriskalz! [#280](https://github.com/enricoros/big-agi/pull/280)
+- Ollama Chats: Enhanced chatting experience. [#270](https://github.com/enricoros/big-agi/issues/270)
+- Mac Shortcuts Fix: Improved UX on Mac
+- **Single-Tab Mode**: Data integrity with single window. [#268](https://github.com/enricoros/big-agi/issues/268)
+- **Updated Models**: Latest Ollama (v0.1.17) and OpenRouter models
+- Official Downloads: Easy access to the latest big-AGI on [big-AGI.com](https://big-agi.com)
+- For developers: [troubleshot networking](https://github.com/enricoros/big-AGI/issues/276#issuecomment-1858591483), fixed Vercel deployment, cleaned up the LLMs/Streaming framework
+
+### What's New in 1.7.0 · Dec 11, 2023

 - **Attachments System Overhaul**: Drag, paste, link, snap, text, images, PDFs and more. [#251](https://github.com/enricoros/big-agi/issues/251)
 - **Desktop Webcam Capture**: Image capture now available as Labs feature. [#253](https://github.com/enricoros/big-agi/issues/253)

@@ -145,7 +157,7 @@ Please refer to the [Cloudflare deployment documentation](docs/deploy-cloudflare

 Create your GitHub fork, create a Vercel project over that fork, and deploy it. Or press the button below for convenience.

-[](https://vercel.com/new/clone?repository-url=https%3A%2F%2Fgithub.com%2Fenricoros%2Fbig-agi&env=OPENAI_API_KEY,OPENAI_API_HOST&envDescription=OpenAI%20KEY%20for%20your%20deployment.%20Set%20HOST%20only%20if%20non-default.)
+[](https://vercel.com/new/clone?repository-url=https%3A%2F%2Fgithub.com%2Fenricoros%2Fbig-agi&env=OPENAI_API_KEY&envDescription=Backend%20API%20keys%2C%20optional%20and%20may%20be%20overridden%20by%20the%20UI.&envLink=https%3A%2F%2Fgithub.com%2Fenricoros%2Fbig-AGI%2Fblob%2Fmain%2Fdocs%2Fenvironment-variables.md&project-name=big-agi)

 ## Integrations:
@@ -1,2 +1,2 @@
 export const runtime = 'edge';
-export { openaiStreamingRelayHandler as POST } from '~/modules/llms/transports/server/openai/openai.streaming';
+export { llmStreamingRelayHandler as POST } from '~/modules/llms/server/llm.server.streaming';

+1 -1
@@ -6,7 +6,7 @@ version: '3.9'

 services:
   big-agi:
-    image: ghcr.io/enricoros/big-agi:main
+    image: ghcr.io/enricoros/big-agi:latest
     ports:
       - "3000:3000"
     env_file:
+15 -3
@@ -5,12 +5,24 @@ by release.

 - For the live roadmap, please see [the GitHub project](https://github.com/users/enricoros/projects/4/views/2)

-### 1.8.0 - Dec 2023
+### 1.9.0 - Dec 2023

 - work in progress: [big-AGI open roadmap](https://github.com/users/enricoros/projects/4/views/2), [help here](https://github.com/users/enricoros/projects/4/views/4)
-- milestone: [1.8.0](https://github.com/enricoros/big-agi/milestone/8)
+- milestone: [1.9.0](https://github.com/enricoros/big-agi/milestone/9)

-### What's New in 1.7.0 · Dec 10, 2023 · Attachment Theory 🌟
+### What's New in 1.8.0 · Dec 20, 2023 · To The Moon And Back · 🚀🌕🔙
+
+- **Google Gemini Support**: Use the newest Google models. [#275](https://github.com/enricoros/big-agi/issues/275)
+- **Mistral Platform**: Mixtral and future models support. [#273](https://github.com/enricoros/big-agi/issues/273)
+- **Diagram Instructions**. Thanks to @joriskalz! [#280](https://github.com/enricoros/big-agi/pull/280)
+- Ollama Chats: Enhanced chatting experience. [#270](https://github.com/enricoros/big-agi/issues/270)
+- Mac Shortcuts Fix: Improved UX on Mac
+- **Single-Tab Mode**: Data integrity with single window. [#268](https://github.com/enricoros/big-agi/issues/268)
+- **Updated Models**: Latest Ollama (v0.1.17) and OpenRouter models
+- Official Downloads: Easy access to the latest big-AGI on [big-AGI.com](https://big-agi.com)
+- For developers: [troubleshot networking](https://github.com/enricoros/big-AGI/issues/276#issuecomment-1858591483), fixed Vercel deployment, cleaned up the LLMs/Streaming framework
+
+### What's New in 1.7.0 · Dec 11, 2023 · Attachment Theory

 - **Attachments System Overhaul**: Drag, paste, link, snap, text, images, PDFs and more. [#251](https://github.com/enricoros/big-agi/issues/251)
 - **Desktop Webcam Capture**: Image capture now available as Labs feature. [#253](https://github.com/enricoros/big-agi/issues/253)
@@ -30,5 +30,5 @@ For instance with [Use luna-ai-llama2 with docker compose](https://localai.io/ba

 > NOTE: LocalAI does not list details about the models. Every model is assumed to be
 > capable of chatting, and with a context window of 4096 tokens.
-> Please update the [src/modules/llms/transports/server/openai/models.data.ts](../src/modules/llms/transports/server/openai/models.data.ts)
+> Please update the [src/modules/llms/transports/server/openai/models.data.ts](../src/modules/llms/server/openai/models.data.ts)
 > file with the mapping information between LocalAI model IDs and names/descriptions/tokens, etc.
+33 -16
@@ -5,31 +5,46 @@ This guide helps you connect [Ollama](https://ollama.ai) [models](https://ollama
 experience. The integration brings the popular big-AGI features to Ollama, including: voice chats,
 editing tools, models switching, personas, and more.

+_Last updated Dec 16, 2023_
+
 ## Quick Integration Guide

-1. **Ensure Ollama API Server is Running**: Before starting, make sure your Ollama API server is up and running.
-2. **Add Ollama as a Model Source**: In `big-AGI`, navigate to the **Models** section, select **Add a model source**, and choose **Ollama**.
-3. **Enter Ollama Host URL**: Provide the Ollama Host URL where the API server is accessible (e.g., `http://localhost:11434`).
-4. **Refresh Model List**: Once connected, refresh the list of available models to include the Ollama models.
-5. **Start Using AI Personas**: Select an Ollama model and begin interacting with AI personas tailored to your needs.
+1. **Ensure Ollama API Server is Running**: Follow the official instructions to get Ollama up and running on your machine
+   - For detailed instructions on setting up the Ollama API server, please refer to the
+     [Ollama download page](https://ollama.ai/download) and [instructions for linux](https://github.com/jmorganca/ollama/blob/main/docs/linux.md).
+2. **Add Ollama as a Model Source**: In `big-AGI`, navigate to the **Models** section, select **Add a model source**, and choose **Ollama**
+3. **Enter Ollama Host URL**: Provide the Ollama Host URL where the API server is accessible (e.g., `http://localhost:11434`)
+4. **Refresh Model List**: Once connected, refresh the list of available models to include the Ollama models
+   > Optional: use the Ollama Admin interface to see which models are available and 'Pull' them in your local machine. Note
+   > that this operation will likely timeout due to Edge Functions timeout on the big-AGI server while pulling, and
+   > you'll have to press the 'Pull' button again, until a green message appears.
+5. **Chat with Ollama models**: select an Ollama model and begin chatting with AI personas

-### Ollama: installation and Setup
+**Visual Configuration Guide**:

-For detailed instructions on setting up the Ollama API server, please refer to the
-[Ollama download page](https://ollama.ai/download) and [instructions for linux](https://github.com/jmorganca/ollama/blob/main/docs/linux.md).
+* After adding the `Ollama` model vendor, entering the IP address of an Ollama server, and refreshing models:<br/>
+  <img src="pixels/config-ollama-1-models.png" alt="config-local-ollama-1-models.png" width="320">

-### Visual Guide
+* The `Ollama` admin panel, with the `Pull` button highlighted, after pulling the "Yi" model:<br/>
+  <img src="pixels/config-ollama-2-admin-pull.png" alt="config-local-ollama-2-admin-pull.png" width="320">

-* After adding the `Ollama` model vendor, entering the IP address of an Ollama server, and refreshing models:
-  <img src="pixels/config-ollama-1-models.png" alt="config-local-ollama-1-models.png" style="max-width: 320px;">
+* You can now switch model/persona dynamically and text/voice chat with the models:<br/>
+  <img src="pixels/config-ollama-3-chat.png" alt="config-local-ollama-3-chat.png" width="320">

-* The `Ollama` admin panel, with the `Pull` button highlighted, after pulling the "Yi" model:
-  <img src="pixels/config-ollama-2-admin-pull.png" alt="config-local-ollama-2-admin-pull.png" style="max-width: 320px;">
-  <br/>
+### ⚠️ Network Troubleshooting

-* You can now switch model/persona dynamically and text/voice chat with the models:
-  <img src="pixels/config-ollama-3-chat.png" alt="config-local-ollama-3-chat.png" style="max-width: 320px;">
+If you get errors about the server having trouble connecting with Ollama, please see
+[this message](https://github.com/enricoros/big-AGI/issues/276#issuecomment-1858591483) on Issue #276.
+
+And in brief, make sure the Ollama endpoint is accessible from the servers where you run big-AGI (which could
+be localhost or cloud servers).
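Step 3 of the guide above asks for the Ollama Host URL in the UI; when big-AGI itself runs server-side (e.g. in a container), the same endpoint can be pre-set through the `OLLAMA_API_HOST` environment variable documented elsewhere in this changeset. A minimal sketch, assuming Ollama on its default port on the same host (values are illustrative, not from the diff):

```
# .env sketch - hypothetical values; point to wherever your Ollama server listens
OLLAMA_API_HOST=http://localhost:11434
```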
 <br/>

 ### Advanced: Model parameters
@@ -68,6 +83,8 @@ Then, edit the nginx configuration file `/etc/nginx/sites-enabled/default` and a

 Reach out to our community if you need help with this.

+<br/>
+
 ### Community and Support

 Join our community to share your experiences, get help, and discuss best practices:
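The hunk above continues a section about fronting Ollama with nginx via `/etc/nginx/sites-enabled/default`; the exact directives are not part of this diff. A minimal reverse-proxy sketch, assuming Ollama on its default port 11434 and a hypothetical server name, could look like:

```nginx
# Sketch only - server_name and upstream port are assumptions, not from the diff
server {
    listen 80;
    server_name ollama.example.com;

    location / {
        proxy_pass http://127.0.0.1:11434;  # default Ollama port
        proxy_set_header Host $host;
    }
}
```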
@@ -78,4 +95,4 @@ Join our community to share your experiences, get help, and discuss best practic

 ---

 `big-AGI` is committed to providing a powerful, intuitive, and privacy-respecting AI experience.
-We are excited for you to explore the possibilities with Ollama models. Happy creating!
+We are excited for you to explore the possibilities with Ollama models. Happy creating!
+37 -20
@@ -21,33 +21,23 @@ Docker ensures faster development cycles, easier collaboration, and seamless env
 ```
 4. Browse to [http://localhost:3000](http://localhost:3000)

-## Documentation
+<br/>

-The big-AGI repository includes a Dockerfile and a GitHub Actions workflow for building and publishing a
-Docker image of the application.
+## Run Official Containers 📦

-### Dockerfile
+`big-AGI` is pre-built from source code and published as a Docker image on the GitHub Container Registry (ghcr).
+The build process is transparent, and happens via GitHub Actions, as described in the
+[`.github/workflows/docker-image.yml`](../.github/workflows/docker-image.yml) file.

-The [`Dockerfile`](../Dockerfile) describes how to create a Docker image. It establishes a Node.js environment,
-installs dependencies, and creates a production-ready version of the application as a local container.
+### Official Images: [ghcr.io/enricoros/big-agi](https://github.com/enricoros/big-agi/pkgs/container/big-agi)

-### Official container images
+#### Run using *docker* 🚀

-The [`.github/workflows/docker-image.yml`](../.github/workflows/docker-image.yml) file automates the
-building and publishing of the Docker images to the GitHub Container Registry (ghcr) when changes are
-pushed to the `main` branch.
-
-Official pre-built containers: [ghcr.io/enricoros/big-agi](https://github.com/enricoros/big-agi/pkgs/container/big-agi)
-
-Run official pre-built containers:
 ```bash
-docker run -d -p 3000:3000 ghcr.io/enricoros/big-agi
+docker run -d -p 3000:3000 ghcr.io/enricoros/big-agi:latest
 ```

-### Run official containers
+#### Run using *docker-compose* 🚀

-In addition, the repository also includes a `docker-compose.yaml` file, configured to run the pre-built
-'ghcr image'. This file is used to define the `big-agi` service, the ports to expose, and the command to run.
-
 If you have Docker Compose installed, you can run the Docker container with `docker-compose up`
 to pull the Docker image (if it hasn't been pulled already) and start a Docker container. If you want to
@@ -57,4 +47,31 @@ update the image to the latest version, you can run `docker-compose pull` before
 docker-compose up -d
 ```

-Leverage Docker's capabilities for a reliable and efficient big-AGI deployment.
+### Make Local Services Visible to Docker 🌐
+
+To make local services running on your host machine accessible to a Docker container, such as a
+[Browseless](./config-browse.md) service or a local API, you can follow this simplified guide:
+
+| Operating System | Steps to Make Local Services Visible to Docker |
+|:-----------------|:-----------------------------------------------|
+| Windows and macOS | Use the special DNS name `host.docker.internal` to refer to the host machine from within the Docker container. No additional network configuration is required. Access local services using `host.docker.internal:<PORT>`. |
+| Linux | Two options: *A*. Use <ins>--network="host"</ins> (`docker run --network="host" -d big-agi`) when running the Docker container to merge the container within the host network stack; however, this reduces container isolation. Alternatively: *B*. Connect to local services <ins>using the host's IP address</ins> directly, as host.docker.internal is not available by default on Linux. |
+
+<br/>
+
+### More Information
+
+The [`Dockerfile`](../Dockerfile) describes how to create a Docker image. It establishes a Node.js environment,
+installs dependencies, and creates a production-ready version of the application as a local container.
+
+The [`docker-compose.yaml`](../docker-compose.yaml) file is configured to run the
+official image (big-agi:latest). This file is used to define the `big-agi` service, to expose
+port 3000 on the host, and launch big-AGI within the container (startup command).
+
+The [`.github/workflows/docker-image.yml`](../.github/workflows/docker-image.yml) file is used
+to build the Official Docker images and publish them to the GitHub Container Registry (ghcr).
+The build process is transparent and happens via GitHub Actions.
+
+<br/>
+
+Leverage Docker's capabilities for a reliable and efficient big-AGI deployment!
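The "Make Local Services Visible to Docker" table above can be condensed into a tiny helper. This is a sketch only, under the assumption that on Linux the container reaches the host via the default `docker0` bridge gateway (`172.17.0.1`) when `host.docker.internal` is not available:

```shell
#!/bin/sh
# Pick the address a container should use to reach services running on the host.
case "$(uname -s)" in
  Linux) HOST_ALIAS="172.17.0.1" ;;            # default docker0 gateway (assumption; check `ip addr show docker0`)
  *)     HOST_ALIAS="host.docker.internal" ;;  # Windows / macOS Docker Desktop
esac
echo "Reach host services at: ${HOST_ALIAS}:<PORT>"
```

Option A from the table (`--network="host"`) sidesteps this entirely, at the cost of container isolation.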
@@ -12,7 +12,7 @@ version: '3.9'

 services:
   big-agi:
-    image: ghcr.io/enricoros/big-agi:main
+    image: ghcr.io/enricoros/big-agi:latest
     ports:
       - "3000:3000"
     env_file:
@@ -24,6 +24,8 @@ AZURE_OPENAI_API_ENDPOINT=
 AZURE_OPENAI_API_KEY=
 ANTHROPIC_API_KEY=
 ANTHROPIC_API_HOST=
+GEMINI_API_KEY=
+MISTRAL_API_KEY=
 OLLAMA_API_HOST=
 OPENROUTER_API_KEY=

@@ -45,7 +47,7 @@ PUPPETEER_WSS_ENDPOINT=

 # Backend Analytics
 BACKEND_ANALYTICS=

-# Backend HTTP Basic Authentication
+# Backend HTTP Basic Authentication (see `deploy-authentication.md` for turning on authentication)
 HTTP_BASIC_AUTH_USERNAME=
 HTTP_BASIC_AUTH_PASSWORD=
 ```
@@ -79,6 +81,8 @@ requiring the user to enter an API key
 | `AZURE_OPENAI_API_KEY` | Azure OpenAI API key, see [config-azure-openai.md](config-azure-openai.md) | Optional, but if set `AZURE_OPENAI_API_ENDPOINT` must also be set |
 | `ANTHROPIC_API_KEY` | The API key for Anthropic | Optional |
 | `ANTHROPIC_API_HOST` | Changes the backend host for the Anthropic vendor, to enable platforms such as [config-aws-bedrock.md](config-aws-bedrock.md) | Optional |
+| `GEMINI_API_KEY` | The API key for Google AI's Gemini | Optional |
+| `MISTRAL_API_KEY` | The API key for Mistral | Optional |
 | `OLLAMA_API_HOST` | Changes the backend host for the Ollama vendor. See [config-ollama.md](config-ollama.md) | |
 | `OPENROUTER_API_KEY` | The API key for OpenRouter | Optional |

@@ -113,10 +117,7 @@ Enable the app to Talk, Draw, and Google things up.
 | `PUPPETEER_WSS_ENDPOINT` | Puppeteer WebSocket endpoint - used for browsing, etc. |
 | **Backend** | |
 | `BACKEND_ANALYTICS` | Semicolon-separated list of analytics flags (see backend.analytics.ts). Flags: `domain` logs the responding domain. |
-| `HTTP_BASIC_AUTH_USERNAME` | Username for HTTP Basic Authentication. See the [Authentication](deploy-authentication.md) guide. |
+| `HTTP_BASIC_AUTH_USERNAME` | See the [Authentication](deploy-authentication.md) guide. Username for HTTP Basic Authentication. |
 | `HTTP_BASIC_AUTH_PASSWORD` | Password for HTTP Basic Authentication. |

 ---
Binary file not shown. (After: 79 KiB)

Generated file (+494 -256): diff suppressed because it is too large.
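The `HTTP_BASIC_AUTH_USERNAME` / `HTTP_BASIC_AUTH_PASSWORD` variables described earlier gate a deployment behind HTTP Basic Authentication, where the client sends a base64-encoded `user:password` pair. A sketch with hypothetical credentials (substitute the values configured on the server):

```shell
# Hypothetical credentials - not from this changeset
AUTH_HEADER="Authorization: Basic $(printf 'myuser:mypass' | base64)"
echo "${AUTH_HEADER}"
# it can then be passed to a client, e.g.:  curl -H "${AUTH_HEADER}" http://localhost:3000/
```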
+17 -17
@@ -1,6 +1,6 @@
 {
   "name": "big-agi",
-  "version": "1.7.0",
+  "version": "1.8.0",
   "private": true,
   "scripts": {
     "dev": "next dev",
@@ -18,13 +18,13 @@
     "@emotion/react": "^11.11.1",
     "@emotion/server": "^11.11.0",
     "@emotion/styled": "^11.11.0",
-    "@mui/icons-material": "^5.14.18",
-    "@mui/joy": "^5.0.0-beta.15",
-    "@next/bundle-analyzer": "^14.0.3",
-    "@prisma/client": "^5.6.0",
+    "@mui/icons-material": "^5.15.0",
+    "@mui/joy": "^5.0.0-beta.18",
+    "@next/bundle-analyzer": "^14.0.4",
+    "@prisma/client": "^5.7.0",
     "@sanity/diff-match-patch": "^3.1.1",
     "@t3-oss/env-nextjs": "^0.7.1",
-    "@tanstack/react-query": "^4.36.1",
+    "@tanstack/react-query": "~4.36.1",
     "@trpc/client": "^10.44.1",
     "@trpc/next": "^10.44.1",
     "@trpc/react-query": "^10.44.1",
@@ -33,8 +33,8 @@
     "browser-fs-access": "^0.35.0",
     "eventsource-parser": "^1.1.1",
     "idb-keyval": "^6.2.1",
-    "next": "^14.0.3",
-    "pdfjs-dist": "4.0.189",
+    "next": "^14.0.4",
+    "pdfjs-dist": "4.0.269",
     "plantuml-encoder": "^1.4.0",
     "prismjs": "^1.29.0",
     "react": "^18.2.0",
@@ -47,23 +47,23 @@
     "tesseract.js": "^5.0.3",
     "uuid": "^9.0.1",
     "zod": "^3.22.4",
-    "zustand": "~4.3.9"
+    "zustand": "^4.4.7"
   },
   "devDependencies": {
     "@cloudflare/puppeteer": "^0.0.5",
-    "@types/node": "^20.10.0",
+    "@types/node": "^20.10.4",
     "@types/plantuml-encoder": "^1.4.2",
     "@types/prismjs": "^1.26.3",
-    "@types/react": "^18.2.38",
+    "@types/react": "^18.2.45",
     "@types/react-dom": "^18.2.17",
-    "@types/react-katex": "^3.0.3",
+    "@types/react-katex": "^3.0.4",
     "@types/react-timeago": "^4.1.6",
     "@types/uuid": "^9.0.7",
-    "eslint": "^8.54.0",
-    "eslint-config-next": "^14.0.3",
-    "prettier": "^3.1.0",
-    "prisma": "^5.6.0",
-    "typescript": "^5.3.2"
+    "eslint": "^8.55.0",
+    "eslint-config-next": "^14.0.4",
+    "prettier": "^3.1.1",
+    "prisma": "^5.7.0",
+    "typescript": "^5.3.3"
   },
   "engines": {
     "node": "^20.0.0 || ^18.0.0"
+10 -7
@@ -11,6 +11,7 @@ import '~/common/styles/CodePrism.css';
 import '~/common/styles/GithubMarkdown.css';

 import { ProviderBackend } from '~/common/state/ProviderBackend';
+import { ProviderSingleTab } from '~/common/state/ProviderSingleTab';
 import { ProviderSnacks } from '~/common/state/ProviderSnacks';
 import { ProviderTRPCQueryClient } from '~/common/state/ProviderTRPCQueryClient';
 import { ProviderTheming } from '~/common/state/ProviderTheming';
@@ -25,13 +26,15 @@ const MyApp = ({ Component, emotionCache, pageProps }: MyAppProps) =>
       </Head>

       <ProviderTheming emotionCache={emotionCache}>
-        <ProviderTRPCQueryClient>
-          <ProviderSnacks>
-            <ProviderBackend>
-              <Component {...pageProps} />
-            </ProviderBackend>
-          </ProviderSnacks>
-        </ProviderTRPCQueryClient>
+        <ProviderSingleTab>
+          <ProviderTRPCQueryClient>
+            <ProviderSnacks>
+              <ProviderBackend>
+                <Component {...pageProps} />
+              </ProviderBackend>
+            </ProviderSnacks>
+          </ProviderTRPCQueryClient>
+        </ProviderSingleTab>
       </ProviderTheming>

       <VercelAnalytics debug={false} />
@@ -0,0 +1,98 @@
import * as React from 'react';
import { useRouter } from 'next/router';

import { Box, Typography } from '@mui/joy';

import { useModelsStore } from '~/modules/llms/store-llms';

import { AppLayout } from '~/common/layout/AppLayout';
import { InlineError } from '~/common/components/InlineError';
import { apiQuery } from '~/common/util/trpc.client';
import { navigateToIndex } from '~/common/app.routes';
import { openLayoutModelsSetup } from '~/common/layout/store-applayout';


function CallbackOpenRouterPage(props: { openRouterCode: string | undefined }) {

  // external state
  const { data, isError, error, isLoading } = apiQuery.backend.exchangeOpenRouterKey.useQuery({ code: props.openRouterCode || '' }, {
    enabled: !!props.openRouterCode,
    refetchOnWindowFocus: false,
    staleTime: Infinity,
  });

  // derived state
  const isErrorInput = !props.openRouterCode;
  const openRouterKey = data?.key ?? undefined;
  const isSuccess = !!openRouterKey;

  // Success: save the key and redirect to the chat app
  React.useEffect(() => {
    if (!isSuccess)
      return;

    // 1. Save the key as the client key
    useModelsStore.getState().setOpenRoutersKey(openRouterKey);

    // 2. Navigate to the chat app
    navigateToIndex(true).then(() => openLayoutModelsSetup());

  }, [isSuccess, openRouterKey]);

  return (
    <Box sx={{
      flexGrow: 1,
      backgroundColor: 'background.level1',
      overflowY: 'auto',
      display: 'flex', justifyContent: 'center',
      p: { xs: 3, md: 6 },
    }}>

      <Box sx={{
        // my: 'auto',
        display: 'flex', flexDirection: 'column', alignItems: 'center',
        gap: 4,
      }}>

        <Typography level='title-lg'>
          Welcome Back
        </Typography>

        {isLoading && <Typography level='body-sm'>Loading...</Typography>}

        {isErrorInput && <InlineError error='There was an issue retrieving the code from OpenRouter.' />}

        {isError && <InlineError error={error} />}

        {data && (
          <Typography level='body-md'>
            Success! You can now close this window.
          </Typography>
        )}

      </Box>

    </Box>
  );
}


/**
 * This page will be invoked by OpenRouter as a Callback
 *
 * Docs: https://openrouter.ai/docs#oauth
 * Example URL: https://localhost:3000/link/callback_openrouter?code=SomeCode
 */
export default function Page() {

  // get the 'code=...' from the URL
  const { query } = useRouter();
  const { code: openRouterCode } = query;

  return (
    <AppLayout suspendAutoModelsSetup>
      <CallbackOpenRouterPage openRouterCode={openRouterCode as (string | undefined)} />
    </AppLayout>
  );
}
Vendored file (-21): diff suppressed because one or more lines are too long.
File diff suppressed because one or more lines are too long.
@@ -15,8 +15,7 @@ import { useChatLLMDropdown } from '../chat/components/applayout/useLLMDropdown'

 import { EXPERIMENTAL_speakTextStream } from '~/modules/elevenlabs/elevenlabs.client';
 import { SystemPurposeId, SystemPurposes } from '../../data';
-import { VChatMessageIn } from '~/modules/llms/transports/chatGenerate';
-import { streamChat } from '~/modules/llms/transports/streamChat';
+import { llmStreamingChatGenerate, VChatMessageIn } from '~/modules/llms/llm.client';
 import { useElevenLabsVoiceDropdown } from '~/modules/elevenlabs/useElevenLabsVoiceDropdown';

 import { Link } from '~/common/components/Link';

@@ -216,7 +215,7 @@ export function CallUI(props: {
     responseAbortController.current = new AbortController();
     let finalText = '';
     let error: any | null = null;
-    streamChat(chatLLMId, callPrompt, responseAbortController.current.signal, (updatedMessage: Partial<DMessage>) => {
+    llmStreamingChatGenerate(chatLLMId, callPrompt, null, null, responseAbortController.current.signal, (updatedMessage: Partial<DMessage>) => {
       const text = updatedMessage.text?.trim();
       if (text) {
         finalText = text;
@@ -3,7 +3,7 @@ import * as React from 'react';
 import { Chip, ColorPaletteProp, VariantProp } from '@mui/joy';
 import { SxProps } from '@mui/joy/styles/types';

-import { VChatMessageIn } from '~/modules/llms/transports/chatGenerate';
+import type { VChatMessageIn } from '~/modules/llms/llm.client';


 export function CallMessage(props: {
@@ -23,14 +23,26 @@ function AppBarLLMDropdown(props: {
   const llmItems: DropdownItems = {};
   let prevSourceId: DModelSourceId | null = null;
   for (const llm of props.llms) {
-    if (!llm.hidden || llm.id === props.chatLlmId) {
-      if (!prevSourceId || llm.sId !== prevSourceId) {
-        if (prevSourceId)
-          llmItems[`sep-${llm.id}`] = { type: 'separator', title: llm.sId };
-        prevSourceId = llm.sId;
-      }
-      llmItems[llm.id] = { title: llm.label };
-    }
+
+    // filter-out hidden models
+    if (!(!llm.hidden || llm.id === props.chatLlmId))
+      continue;
+
+    // add separators when changing sources
+    if (!prevSourceId || llm.sId !== prevSourceId) {
+      if (prevSourceId)
+        llmItems[`sep-${llm.id}`] = {
+          type: 'separator',
+          title: llm.sId,
+        };
+      prevSourceId = llm.sId;
+    }
+
+    // add the model item
+    llmItems[llm.id] = {
+      title: llm.label,
+      // icon: llm.id.startsWith('some vendor') ? <VendorIcon /> : undefined,
+    };
   }

   const handleChatLLMChange = (_event: any, value: DLLMId | null) => value && props.setChatLlmId(value);
@@ -331,7 +331,8 @@ export function Composer(props: {
|
||||
|
||||
const handleOverlayDragOver = React.useCallback((e: React.DragEvent) => {
|
||||
eatDragEvent(e);
|
||||
// e.dataTransfer.dropEffect = 'copy';
|
||||
// this makes sure we don't "transfer" (or move) the attachment, but we tell the sender we'll copy it
|
||||
e.dataTransfer.dropEffect = 'copy';
|
||||
}, [eatDragEvent]);
|
||||
|
||||
const handleOverlayDrop = React.useCallback(async (event: React.DragEvent) => {
|
||||
|
||||
@@ -254,7 +254,7 @@ export async function attachmentPerformConversion(attachment: Readonly<Attachmen
|
||||
case 'rich-text-table':
|
||||
let mdTable: string;
|
||||
try {
|
||||
mdTable = htmlTableToMarkdown(input.altData!);
|
||||
mdTable = htmlTableToMarkdown(input.altData!, false);
|
||||
} catch (error) {
|
||||
// fallback to text/plain
|
||||
mdTable = inputDataToString(input.data);
|
||||
|
||||
@@ -167,6 +167,8 @@ function explainErrorInMessage(text: string, isAssistant: boolean, modelId?: str
|
||||
make sure the usage is under <Link noLinkStyle href='https://platform.openai.com/account/billing/limits' target='_blank'>the limits</Link>.
|
||||
</>;
|
||||
}
|
||||
// else
|
||||
// errorMessage = <>{text || 'Unknown error'}</>;
|
||||
|
||||
return { errorMessage, isAssistantError };
|
||||
}
|
||||
|
||||
@@ -2,8 +2,8 @@ import { DLLMId } from '~/modules/llms/store-llms';
|
||||
import { SystemPurposeId } from '../../../data';
|
||||
import { autoSuggestions } from '~/modules/aifn/autosuggestions/autoSuggestions';
|
||||
import { autoTitle } from '~/modules/aifn/autotitle/autoTitle';
|
||||
import { llmStreamingChatGenerate } from '~/modules/llms/llm.client';
|
||||
import { speakText } from '~/modules/elevenlabs/elevenlabs.client';
|
||||
import { streamChat } from '~/modules/llms/transports/streamChat';
|
||||
|
||||
import { DMessage, useChatStore } from '~/common/state/store-chats';
|
||||
|
||||
@@ -63,7 +63,7 @@ async function streamAssistantMessage(
|
||||
const messages = history.map(({ role, text }) => ({ role, content: text }));
|
||||
|
||||
try {
|
||||
await streamChat(llmId, messages, abortSignal,
|
||||
await llmStreamingChatGenerate(llmId, messages, null, null, abortSignal,
|
||||
(updatedMessage: Partial<DMessage>) => {
|
||||
// update the message in the store (and thus schedule a re-render)
|
||||
editMessage(updatedMessage);
|
||||
|
||||
@@ -78,14 +78,14 @@ export function AppNews() {
|
||||
|
||||
{!!news && <Container disableGutters maxWidth='sm'>
|
||||
{news?.map((ni, idx) => {
|
||||
const firstCard = idx === 0;
|
||||
// const firstCard = idx === 0;
|
||||
const hasCardAfter = news.length < NewsItems.length;
|
||||
const showExpander = hasCardAfter && (idx === news.length - 1);
|
||||
const addPadding = false; //!firstCard; // || showExpander;
|
||||
return <Card key={'news-' + idx} sx={{ mb: 2, minHeight: 32 }}>
|
||||
<CardContent sx={{ position: 'relative', pr: addPadding ? 4 : 0 }}>
|
||||
<Box sx={{ display: 'flex', alignItems: 'center', justifyContent: 'space-between', gap: 1 }}>
|
||||
<GoodTooltip title={ni.versionName || null} placement='top-start'>
|
||||
<Box sx={{ display: 'flex', alignItems: 'center', justifyContent: 'space-between', gap: 0 }}>
|
||||
<GoodTooltip title={ni.versionName ? `${ni.versionName} ${ni.versionMoji || ''}` : null} placement='top-start'>
|
||||
<Typography level='title-sm' component='div' sx={{ flexGrow: 1 }}>
|
||||
{ni.text ? ni.text : ni.versionName ? `${ni.versionCode} · ${ni.versionName}` : `Version ${ni.versionCode}:`}
|
||||
</Typography>
|
||||
|
||||
@@ -10,10 +10,10 @@ import { platformAwareKeystrokes } from '~/common/components/KeyStroke';
|
||||
|
||||
|
||||
// update this variable every time you want to broadcast a new version to clients
|
||||
export const incrementalVersion: number = 8;
|
||||
export const incrementalVersion: number = 9;
|
||||
|
||||
const B = (props: { href?: string, children: React.ReactNode }) => {
|
||||
const boldText = <Typography color={!!props.href ? 'primary' : 'warning'} sx={{ fontWeight: 600 }}>{props.children}</Typography>;
|
||||
const boldText = <Typography color={!!props.href ? 'primary' : 'neutral'} sx={{ fontWeight: 600 }}>{props.children}</Typography>;
|
||||
return props.href ?
|
||||
<Link href={props.href + clientUtmSource()} target='_blank' sx={{ /*textDecoration: 'underline'*/ }}>{boldText} <LaunchIcon sx={{ ml: 1 }} /></Link> :
|
||||
boldText;
|
||||
@@ -27,11 +27,12 @@ const RIssues = `${OpenRepo}/issues`;
|
||||
export const newsCallout =
|
||||
<Card>
|
||||
<CardContent sx={{ gap: 2 }}>
|
||||
<Typography level='h4'>
|
||||
<Typography level='title-lg'>
|
||||
Open Roadmap
|
||||
</Typography>
|
||||
<Typography>
|
||||
The roadmap is officially out. For the first time you get a look at what's brewing, up and coming, and get a chance to pick up cool features!
|
||||
<Typography level='body-md'>
|
||||
Take a peek at our roadmap to see what's in the pipeline.
|
||||
Discover upcoming features and let us know what excites you the most!
|
||||
</Typography>
|
||||
<Grid container spacing={1}>
|
||||
<Grid xs={12} sm={7}>
|
||||
@@ -39,7 +40,7 @@ export const newsCallout =
|
||||
fullWidth variant='soft' color='primary' endDecorator={<LaunchIcon />}
|
||||
component={Link} href={OpenProject} noLinkStyle target='_blank'
|
||||
>
|
||||
Explore the Roadmap
|
||||
Explore
|
||||
</Button>
|
||||
</Grid>
|
||||
<Grid xs={12} sm={5} sx={{ display: 'flex', flexAlign: 'center', justifyContent: 'center' }}>
|
||||
@@ -66,10 +67,28 @@ export const NewsItems: NewsItem[] = [
|
||||
// phone calls
|
||||
],
|
||||
},*/
|
||||
{
|
||||
versionCode: '1.8.0',
|
||||
versionName: 'To The Moon And Back',
|
||||
versionMoji: '🚀🌕🔙❤️',
|
||||
versionDate: new Date('2023-12-20T09:30:00Z'),
|
||||
items: [
|
||||
{ text: <><B href={RIssues + '/275'}>Google Gemini</B> models support</> },
|
||||
{ text: <><B href={RIssues + '/273'}>Mistral Platform</B> support</> },
|
||||
{ text: <><B href={RIssues + '/270'}>Ollama chats</B> perfection</> },
|
||||
{ text: <>Custom <B href={RIssues + '/280'}>diagrams instructions</B> (@joriskalz)</> },
|
||||
{ text: <><B>Single-Tab</B> mode, enhances data integrity and prevents DB corruption</> },
|
||||
{ text: <>Updated Ollama (v0.1.17) and OpenRouter models</> },
|
||||
{ text: <>More: fixed ⌘ shortcuts on Mac</> },
|
||||
{ text: <><Link href='https://big-agi.com'>Website</Link>: official downloads</> },
|
||||
{ text: <>Easier Vercel deployment, documented <Link href='https://github.com/enricoros/big-AGI/issues/276#issuecomment-1858591483'>network troubleshooting</Link></>, dev: true },
|
||||
],
|
||||
},
|
||||
{
|
||||
versionCode: '1.7.0',
|
||||
versionName: 'Attachment Theory',
|
||||
versionDate: new Date('2023-12-10T12:00:00Z'), // new Date().toISOString()
|
||||
// versionDate: new Date('2023-12-11T06:00:00Z'), // 1.7.3
|
||||
versionDate: new Date('2023-12-10T12:00:00Z'), // 1.7.0
|
||||
items: [
|
||||
{ text: <>Redesigned <B href={RIssues + '/251'}>attachments system</B>: drag, paste, link, snap, images, text, pdfs</> },
|
||||
{ text: <>Desktop <B href={RIssues + '/253'}>webcam access</B> for direct image capture (Labs option)</> },
|
||||
@@ -158,6 +177,7 @@ export const NewsItems: NewsItem[] = [
|
||||
interface NewsItem {
|
||||
versionCode: string;
|
||||
versionName?: string;
|
||||
versionMoji?: string;
|
||||
versionDate?: Date;
|
||||
text?: string | React.JSX.Element;
|
||||
items?: {
|
||||
|
||||
@@ -1,14 +1,13 @@
|
||||
import * as React from 'react';
|
||||
import { shallow } from 'zustand/shallow';
|
||||
import { useRouter } from 'next/router';
|
||||
|
||||
import { navigateToNews } from '~/common/app.routes';
|
||||
import { useAppStateStore } from '~/common/state/store-appstate';
|
||||
|
||||
import { incrementalVersion } from './news.data';
|
||||
|
||||
|
||||
export function useShowNewsOnUpdate() {
|
||||
const { push: routerPush } = useRouter();
|
||||
const { usageCount, lastSeenNewsVersion } = useAppStateStore(state => ({
|
||||
usageCount: state.usageCount,
|
||||
lastSeenNewsVersion: state.lastSeenNewsVersion,
|
||||
@@ -17,9 +16,9 @@ export function useShowNewsOnUpdate() {
|
||||
const isNewsOutdated = (lastSeenNewsVersion || 0) < incrementalVersion;
|
||||
if (isNewsOutdated && usageCount > 2) {
|
||||
// Disable for now
|
||||
void routerPush('/news');
|
||||
void navigateToNews();
|
||||
}
|
||||
}, [lastSeenNewsVersion, routerPush, usageCount]);
|
||||
}, [lastSeenNewsVersion, usageCount]);
|
||||
}
|
||||
|
||||
export function useMarkNewsAsSeen() {
|
||||
|
||||
@@ -1,7 +1,7 @@
|
||||
import * as React from 'react';
|
||||
|
||||
import { DLLMId, useModelsStore } from '~/modules/llms/store-llms';
|
||||
import { callChatGenerate, VChatMessageIn } from '~/modules/llms/transports/chatGenerate';
|
||||
import { llmChatGenerateOrThrow, VChatMessageIn } from '~/modules/llms/llm.client';
|
||||
|
||||
|
||||
export interface LLMChainStep {
|
||||
@@ -80,7 +80,7 @@ export function useLLMChain(steps: LLMChainStep[], llmId: DLLMId | undefined, ch
|
||||
_chainAbortController.signal.addEventListener('abort', globalToStepListener);
|
||||
|
||||
// LLM call
|
||||
callChatGenerate(llmId, llmChatInput, chain.overrideResponseTokens)
|
||||
llmChatGenerateOrThrow(llmId, llmChatInput, null, null, chain.overrideResponseTokens)
|
||||
.then(({ content }) => {
|
||||
stepDone = true;
|
||||
if (!stepAbortController.signal.aborted)
|
||||
|
||||
@@ -7,21 +7,37 @@
|
||||
import Router from 'next/router';
|
||||
|
||||
import type { DConversationId } from '~/common/state/store-chats';
|
||||
import { isBrowser } from './util/pwaUtils';
|
||||
|
||||
|
||||
export const ROUTE_INDEX = '/';
|
||||
export const ROUTE_APP_CHAT = '/';
|
||||
export const ROUTE_APP_LINK_CHAT = '/link/chat/:linkId';
|
||||
export const ROUTE_APP_NEWS = '/news';
|
||||
const ROUTE_CALLBACK_OPENROUTER = '/link/callback_openrouter';
|
||||
|
||||
export const getIndexLink = () => ROUTE_INDEX;
|
||||
|
||||
// Get Paths
|
||||
|
||||
export const getCallbackUrl = (source: 'openrouter') => {
|
||||
const callbackUrl = new URL(window.location.href);
|
||||
switch (source) {
|
||||
case 'openrouter':
|
||||
callbackUrl.pathname = ROUTE_CALLBACK_OPENROUTER;
|
||||
break;
|
||||
default:
|
||||
throw new Error(`Unknown source: ${source}`);
|
||||
}
|
||||
return callbackUrl.toString();
|
||||
};
|
||||
|
||||
export const getChatLinkRelativePath = (chatLinkId: string) => ROUTE_APP_LINK_CHAT.replace(':linkId', chatLinkId);
|
||||
|
||||
const navigateFn = (path: string) => (replace?: boolean): Promise<boolean> =>
|
||||
Router[replace ? 'replace' : 'push'](path);
|
||||
|
||||
/// Simple Navigation
|
||||
|
||||
export const navigateToIndex = navigateFn(ROUTE_INDEX);
|
||||
|
||||
export const navigateToChat = async (conversationId?: DConversationId) => {
|
||||
if (conversationId) {
|
||||
await Router.push(
|
||||
@@ -41,6 +57,15 @@ export const navigateToNews = navigateFn(ROUTE_APP_NEWS);
|
||||
|
||||
export const navigateBack = Router.back;
|
||||
|
||||
export const reloadPage = () => isBrowser && window.location.reload();
|
||||
|
||||
function navigateFn(path: string) {
|
||||
return (replace?: boolean): Promise<boolean> => Router[replace ? 'replace' : 'push'](path);
|
||||
}
|
||||
|
||||
|
||||
/// Launch Apps
|
||||
|
||||
export interface AppCallQueryParams {
|
||||
conversationId: string;
|
||||
personaId: string;
|
||||
|
||||
@@ -46,6 +46,7 @@ export const appTheme = extendTheme({
|
||||
text: {
|
||||
icon: 'var(--joy-palette-neutral-700)', // <IconButton color='neutral' /> icon color
|
||||
secondary: 'var(--joy-palette-neutral-800)', // increase contrast a bit
|
||||
// tertiary: 'var(--joy-palette-neutral-700)', // increase contrast a bit
|
||||
},
|
||||
// popup [white] > surface [50] > level1 [100] > level2 [200] > level3 [300] > body [white -> 400]
|
||||
background: {
|
||||
|
||||
@@ -23,7 +23,7 @@ export function GoodModal(props: {
|
||||
const showBottomClose = !!props.onClose && props.hideBottomClose !== true;
|
||||
return (
|
||||
<Modal open={props.open} onClose={props.onClose}>
|
||||
<ModalOverflow>
|
||||
<ModalOverflow sx={{p:1}}>
|
||||
<ModalDialog
|
||||
sx={{
|
||||
minWidth: { xs: 360, sm: 500, md: 600, lg: 700 },
|
||||
|
||||
@@ -0,0 +1,10 @@
|
||||
import * as React from 'react';
|
||||
|
||||
import { SvgIcon } from '@mui/joy';
|
||||
import { SxProps } from '@mui/joy/styles/types';
|
||||
|
||||
export function MistralIcon(props: { sx?: SxProps }) {
|
||||
return <SvgIcon viewBox='0 0 24 24' width='24' height='24' strokeWidth={0} stroke='none' fill='currentColor' strokeLinecap='butt' strokeLinejoin='miter' {...props}>
|
||||
<path d='m 2,2 v 4 4 V 14 v 4 4 h 4 v -4 -4 h 4 v 4 h 4 v -4 h 4 v 4 4 h 4 v -4 -4 -4 -4 V 2 h -4 v 4 h -4 v 4 h -4 v -4 H 6 V 2 Z' />
|
||||
</SvgIcon>;
|
||||
}
|
||||
@@ -21,8 +21,13 @@ export const useGlobalShortcut = (shortcutKey: string | false, useCtrl: boolean,
|
||||
if (!shortcutKey) return;
|
||||
const lcShortcut = shortcutKey.toLowerCase();
|
||||
const handleKeyDown = (event: KeyboardEvent) => {
|
||||
if ((useCtrl === event.ctrlKey) && (useShift === event.shiftKey) && (useAlt === event.altKey)
|
||||
&& event.key.toLowerCase() === lcShortcut) {
|
||||
const isCtrlOrCmd = (event.ctrlKey && !event.metaKey) || (event.metaKey && !event.ctrlKey);
|
||||
if (
|
||||
(useCtrl === isCtrlOrCmd) &&
|
||||
(useShift === event.shiftKey) &&
|
||||
(useAlt === event.altKey) &&
|
||||
event.key.toLowerCase() === lcShortcut
|
||||
) {
|
||||
event.preventDefault();
|
||||
event.stopPropagation();
|
||||
callback();
|
||||
@@ -46,9 +51,10 @@ export const useGlobalShortcuts = (shortcuts: GlobalShortcutItem[]) => {
|
||||
React.useEffect(() => {
|
||||
const handleKeyDown = (event: KeyboardEvent) => {
|
||||
for (const [key, useCtrl, useShift, useAlt, action] of shortcuts) {
|
||||
const isCtrlOrCmd = (event.ctrlKey && !event.metaKey) || (event.metaKey && !event.ctrlKey);
|
||||
if (
|
||||
key &&
|
||||
(useCtrl === event.ctrlKey) &&
|
||||
(useCtrl === isCtrlOrCmd) &&
|
||||
(useShift === event.shiftKey) &&
|
||||
(useAlt === event.altKey) &&
|
||||
event.key.toLowerCase() === key.toLowerCase()
|
||||
|
||||
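The shortcut hunks above replace the plain `event.ctrlKey` comparison with an `isCtrlOrCmd` expression, which is why ⌘ shortcuts now work on Mac. A standalone sketch of that predicate (the function wrapper is hypothetical; the diff inlines the expression): exactly one of Ctrl (Windows/Linux) or Cmd/meta (macOS) must be held, so a shortcut declared with `useCtrl: true` matches either platform's primary modifier without also firing when both keys are down.

```typescript
// Sketch of the modifier check introduced in the diff; the named helper is illustrative.
function isCtrlOrCmd(e: { ctrlKey: boolean; metaKey: boolean }): boolean {
  // true only when exactly one of the two "primary" modifiers is held
  return (e.ctrlKey && !e.metaKey) || (e.metaKey && !e.ctrlKey);
}

console.log(isCtrlOrCmd({ ctrlKey: true, metaKey: false }));  // true: Ctrl on Windows/Linux
console.log(isCtrlOrCmd({ ctrlKey: false, metaKey: true }));  // true: ⌘ on macOS
console.log(isCtrlOrCmd({ ctrlKey: true, metaKey: true }));   // false: both held at once
```

Requiring *exactly one* modifier also prevents a Ctrl+⌘ chord from accidentally triggering a plain Ctrl shortcut.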
@@ -0,0 +1,95 @@
import * as React from 'react';

/**
 * The AloneDetector class checks if the current client is the only one present for a given app. It uses
 * BroadcastChannel to talk to other clients. If no other clients reply within a short time, it assumes it's
 * the only one and tells the caller.
 */
class AloneDetector {
private readonly clientId: string;
private readonly broadcastChannel: BroadcastChannel;

private aloneCallback: ((isAlone: boolean) => void) | null;
private aloneTimerId: number | undefined;

constructor(channelName: string, onAlone: (isAlone: boolean) => void) {

this.clientId = Math.random().toString(36).substring(2, 10);
this.aloneCallback = onAlone;

this.broadcastChannel = new BroadcastChannel(channelName);
this.broadcastChannel.onmessage = this.handleIncomingMessage;

}

public onUnmount(): void {
// close channel
this.broadcastChannel.onmessage = null;
this.broadcastChannel.close();

// clear timeout
if (this.aloneTimerId)
clearTimeout(this.aloneTimerId);

this.aloneTimerId = undefined;
this.aloneCallback = null;
}

public checkIfAlone(): void {

// triggers other clients
this.broadcastChannel.postMessage({ type: 'CHECK', sender: this.clientId });

// if no response within 500ms, assume this client is alone
this.aloneTimerId = window.setTimeout(() => {
this.aloneTimerId = undefined;
this.aloneCallback?.(true);
}, 500);

}

private handleIncomingMessage = (event: MessageEvent): void => {

// ignore self messages
if (event.data.sender === this.clientId) return;

switch (event.data.type) {

case 'CHECK':
this.broadcastChannel.postMessage({ type: 'ALIVE', sender: this.clientId });
break;

case 'ALIVE':
// received an ALIVE message, tell the client they're not alone
if (this.aloneTimerId) {
clearTimeout(this.aloneTimerId);
this.aloneTimerId = undefined;
}
this.aloneCallback?.(false);
this.aloneCallback = null;
break;

}
};
}


/**
 * React hook that checks whether the current tab is the only one open for a specific channel.
 *
 * @param {string} channelName - The name of the BroadcastChannel to communicate on.
 * @returns {boolean | null} - True if the current tab is alone, false if not, or null before the check completes.
 */
export function useSingleTabEnforcer(channelName: string): boolean | null {
const [isAlone, setIsAlone] = React.useState<boolean | null>(null);

React.useEffect(() => {
const tabManager = new AloneDetector(channelName, setIsAlone);
tabManager.checkIfAlone();
return () => {
tabManager.onUnmount();
};
}, [channelName]);

return isAlone;
}
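The new `useSingleTabEnforcer` file above implements a CHECK/ALIVE handshake over a `BroadcastChannel`: a new tab broadcasts `CHECK`, any existing tab answers `ALIVE`, and a 500 ms timeout decides "alone". A rough sketch of that protocol, simulated with a synchronous in-memory bus (the `Bus`/`Client` names are illustrative; the bus stands in for both `BroadcastChannel` and the timeout so it runs outside a browser):

```typescript
// Minimal simulation of the CHECK/ALIVE single-tab handshake described above.
type Msg = { type: 'CHECK' | 'ALIVE'; sender: string };

class Bus {
  private clients: Client[] = [];
  register(c: Client) { this.clients.push(c); }
  post(msg: Msg) {
    // BroadcastChannel semantics: every client except the sender receives the message
    for (const c of this.clients) if (c.id !== msg.sender) c.receive(msg);
  }
}

class Client {
  alone: boolean | null = null;
  constructor(readonly id: string, private bus: Bus) { bus.register(this); }
  checkIfAlone() {
    this.alone = true; // assume alone; the real code waits 500ms before deciding this
    this.bus.post({ type: 'CHECK', sender: this.id });
  }
  receive(msg: Msg) {
    if (msg.type === 'CHECK') this.bus.post({ type: 'ALIVE', sender: this.id });
    else if (msg.type === 'ALIVE') this.alone = false; // someone replied: not alone
  }
}

const bus = new Bus();
const firstTab = new Client('tab-1', bus);
firstTab.checkIfAlone();
console.log(firstTab.alone);  // true: nobody answered the CHECK

const secondTab = new Client('tab-2', bus);
secondTab.checkIfAlone();
console.log(secondTab.alone); // false: tab-1 replied ALIVE
```

The synchronous bus removes the race the real code handles with the timer: in the browser, "alone" is only concluded after no `ALIVE` arrives within the window.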
@@ -9,6 +9,7 @@ export type DropdownItems = Record<string, {
title: string,
symbol?: string,
type?: 'separator'
icon?: React.ReactNode,
}>;


@@ -71,20 +72,25 @@ export function AppBarDropdown<TValue extends string>(props: {
{!!props.prependOption && Object.keys(props.items).length >= 1 && <Divider />}

<Box sx={{ overflowY: 'auto' }}>
{Object.keys(props.items).map((key: string, idx: number) => <React.Fragment key={'key-' + idx}>
{props.items[key].type === 'separator'
? <ListDivider />
: <Option value={key} sx={{ whiteSpace: 'nowrap' }}>
{props.showSymbols && <ListItemDecorator sx={{ fontSize: 'xl' }}>{props.items[key]?.symbol + ' '}</ListItemDecorator>}
{props.items[key].title}
{Object.keys(props.items).map((key: string, idx: number) => {
const item = props.items[key];

if (item.type === 'separator')
return <ListDivider key={'key-' + idx} />;

return (
<Option key={'key-' + idx} value={key} sx={{ whiteSpace: 'nowrap' }}>
{props.showSymbols && <ListItemDecorator sx={{ fontSize: 'xl' }}>{item?.symbol + ' '}</ListItemDecorator>}
{props.showSymbols && !!item.icon && <ListItemDecorator>{item?.icon}</ListItemDecorator>}
{item.title}
{/*{key === props.value && (*/}
{/*  <IconButton variant='soft' onClick={() => alert('aa')} sx={{ ml: 'auto' }}>*/}
{/*    <SettingsIcon color='success' />*/}
{/*  </IconButton>*/}
{/*)}*/}
</Option>
}
</React.Fragment>)}
);
})}
</Box>

{!!props.appendOption && Object.keys(props.items).length >= 1 && <ListDivider />}

@@ -3,7 +3,7 @@ import { shallow } from 'zustand/shallow';

import { Box, Container } from '@mui/joy';

import { ModelsModal } from '../../apps/models-modal/ModelsModal';
import { ModelsModal } from '~/modules/llms/models-modal/ModelsModal';
import { SettingsModal } from '../../apps/settings-modal/SettingsModal';
import { ShortcutsModal } from '../../apps/settings-modal/ShortcutsModal';


@@ -0,0 +1,42 @@
import * as React from 'react';

import { Button, Sheet, Typography } from '@mui/joy';

import { Brand } from '../app.config';
import { reloadPage } from '../app.routes';
import { useSingleTabEnforcer } from '../components/useSingleTabEnforcer';


export const ProviderSingleTab = (props: { children: React.ReactNode }) => {

// state
const isSingleTab = useSingleTabEnforcer('big-agi-tabs');

// pass-through until we know for sure that other tabs are open
if (isSingleTab === null || isSingleTab)
return props.children;


return (
<Sheet
variant='solid'
invertedColors
sx={{
flexGrow: 1,
display: 'flex', flexDirection: { xs: 'column', md: 'row' }, justifyContent: 'center', alignItems: 'center', gap: 2,
p: 3,
}}
>

<Typography>
It looks like {Brand.Title.Base} is already running in another tab or window.
To continue here, please close the other instance first.
</Typography>

<Button onClick={reloadPage}>
Reload
</Button>

</Sheet>
);
};
@@ -2,11 +2,13 @@
 * @fileoverview Utility functions for Markdown.
 */

import { isBrowser } from '~/common/util/pwaUtils';

/**
 * Quick and dirty conversion of HTML tables to Markdown tables.
 * Big plus: doesn't require any dependencies.
 */
export function htmlTableToMarkdown(html: string): string {
export function htmlTableToMarkdown(html: string, includeInvisible: boolean): string {
const parser = new DOMParser();
const doc = parser.parseFromString(html, 'text/html');
const table = doc.querySelector('table');
@@ -16,20 +18,53 @@ export function htmlTableToMarkdown(html: string): string {
const headerCells = table.querySelectorAll('thead th');
if (headerCells.length > 0) {
const headerRow = '| ' + Array.from(headerCells)
.map(cell => cell.textContent?.trim() || '')
.join(' | ') + '| ';
.map(cell => getTextWithSpaces(cell, includeInvisible).trim())
.join(' | ') + ' |';
markdownRows.push(headerRow);
markdownRows.push('|:' + Array(headerCells.length).fill('-').join('|:') + '|');
markdownRows.push('|:' + Array(headerCells.length).fill('---').join('|:') + '|');
}

const bodyRows = table.querySelectorAll('tbody tr');
for (const row of Array.from(bodyRows)) {
const rowCells = row.querySelectorAll('td');
const markdownRow = '| ' + Array.from(rowCells)
.map(cell => cell.textContent?.trim() || '')
.map(cell => getTextWithSpaces(cell, includeInvisible).trim())
.join(' | ') + ' |';
markdownRows.push(markdownRow);
}

return markdownRows.join('\n');
}

// Helper function to get text with spaces, ignoring hidden elements
function getTextWithSpaces(node: Node, includeInvisible: boolean): string {
let text = '';
node.childNodes.forEach(child => {
if (child.nodeType === Node.TEXT_NODE)
text += child.textContent;
else if (child.nodeType === Node.ELEMENT_NODE)
if (includeInvisible || isVisible(child as Element))
text += ' ' + getTextWithSpaces(child, includeInvisible) + ' ';
});
return text;
}

// Helper function to determine if an element is visible
function isVisible(element: Element): boolean {
if (!isBrowser) return true;

// if the cell is hidden, don't include it
const style = window.getComputedStyle(element);
if (style.display === 'none' || style.visibility === 'hidden')
return false;

// Check for common classes used to hide content or indicate tooltip/popover content.
// You may need to add more classes here based on your actual HTML/CSS.
const ignoredClasses = ['hidden', 'group-hover', 'tooltip', 'pointer-events-none', 'opacity-0'];
for (const ignoredClass of ignoredClasses)
if (element.classList.contains(ignoredClass))
return false;

// Otherwise, the element is considered visible
return true;
}
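The `htmlTableToMarkdown` hunk above fixes two output bugs: the header row previously ended with `'| '` instead of `' |'`, and the alignment row used single dashes, which stricter Markdown parsers reject as a separator. A small sketch of the output shape the fixed code produces (the `tableToMarkdown` helper here is illustrative and works from plain arrays instead of a DOM):

```typescript
// Illustrative helper showing the Markdown table layout emitted after the fix.
function tableToMarkdown(header: string[], rows: string[][]): string {
  const lines: string[] = [];
  // header row, now correctly terminated with ' |'
  lines.push('| ' + header.join(' | ') + ' |');
  // alignment row, now using '---' per column: |:---|:---|
  lines.push('|:' + Array(header.length).fill('---').join('|:') + '|');
  for (const row of rows)
    lines.push('| ' + row.join(' | ') + ' |');
  return lines.join('\n');
}

console.log(tableToMarkdown(['Name', 'Role'], [['Ada', 'Engineer']]));
// | Name | Role |
// |:---|:---|
// | Ada | Engineer |
```

The `|:---|` form also pins left alignment, which renders consistently across CommonMark/GFM-style table extensions.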
@@ -14,7 +14,7 @@ export async function pdfToText(pdfBuffer: ArrayBuffer): Promise<string> {
|
||||
const { getDocument, GlobalWorkerOptions } = await import('pdfjs-dist');
|
||||
|
||||
// Set the worker script path
|
||||
GlobalWorkerOptions.workerSrc = '/workers/pdf.worker.min.js';
|
||||
GlobalWorkerOptions.workerSrc = '/workers/pdf.worker.min.mjs';
|
||||
|
||||
const pdf = await getDocument(pdfBuffer).promise;
|
||||
const textPages: string[] = []; // Initialize an array to hold text from all pages
|
||||
|
||||
@@ -1,4 +1,4 @@
|
||||
import { callChatGenerateWithFunctions, VChatFunctionIn } from '~/modules/llms/transports/chatGenerate';
|
||||
import { llmChatGenerateOrThrow, VChatFunctionIn } from '~/modules/llms/llm.client';
|
||||
import { useModelsStore } from '~/modules/llms/store-llms';
|
||||
|
||||
import { useChatStore } from '~/common/state/store-chats';
|
||||
@@ -71,7 +71,7 @@ export function autoSuggestions(conversationId: string, assistantMessageId: stri
|
||||
|
||||
// Follow-up: Question
|
||||
if (suggestQuestions) {
|
||||
// callChatGenerateWithFunctions(funcLLMId, [
|
||||
// llmChatGenerateOrThrow(funcLLMId, [
|
||||
// { role: 'system', content: systemMessage.text },
|
||||
// { role: 'user', content: userMessage.text },
|
||||
// { role: 'assistant', content: assistantMessageText },
|
||||
@@ -83,15 +83,18 @@ export function autoSuggestions(conversationId: string, assistantMessageId: stri
|
||||
|
||||
// Follow-up: Auto-Diagrams
|
||||
if (suggestDiagrams) {
|
||||
void callChatGenerateWithFunctions(funcLLMId, [
|
||||
void llmChatGenerateOrThrow(funcLLMId, [
|
||||
{ role: 'system', content: systemMessage.text },
|
||||
{ role: 'user', content: userMessage.text },
|
||||
{ role: 'assistant', content: assistantMessageText },
|
||||
], [suggestPlantUMLFn], 'draw_plantuml_diagram',
|
||||
).then(chatResponse => {
|
||||
|
||||
if (!('function_arguments' in chatResponse))
|
||||
return;
|
||||
|
||||
// parse the output PlantUML string, if any
|
||||
const functionArguments = chatResponse?.function_arguments ?? null;
|
||||
const functionArguments = chatResponse.function_arguments ?? null;
|
||||
if (functionArguments) {
|
||||
const { code, type }: { code: string, type: string } = functionArguments as any;
|
||||
if (code && type) {
|
||||
@@ -105,6 +108,8 @@ export function autoSuggestions(conversationId: string, assistantMessageId: stri
|
||||
editMessage(conversationId, assistantMessageId, { text: assistantMessageText }, false);
|
||||
}
|
||||
}
|
||||
}).catch(err => {
|
||||
console.error('autoSuggestions::diagram:', err);
|
||||
});
|
||||
}
|
||||
|
||||
|
||||
@@ -1,4 +1,4 @@
|
||||
import { callChatGenerate } from '~/modules/llms/transports/chatGenerate';
|
||||
import { llmChatGenerateOrThrow } from '~/modules/llms/llm.client';
|
||||
import { useModelsStore } from '~/modules/llms/store-llms';
|
||||
|
||||
import { useChatStore } from '~/common/state/store-chats';
|
||||
@@ -27,7 +27,7 @@ export function autoTitle(conversationId: string) {
|
||||
});
|
||||
|
||||
// LLM
|
||||
void callChatGenerate(fastLLMId, [
|
||||
void llmChatGenerateOrThrow(fastLLMId, [
|
||||
{ role: 'system', content: `You are an AI conversation titles assistant who specializes in creating expressive yet few-words chat titles.` },
|
||||
{
|
||||
role: 'user', content:
|
||||
@@ -39,7 +39,7 @@ export function autoTitle(conversationId: string) {
|
||||
historyLines.join('\n') +
|
||||
'```\n',
|
||||
},
|
||||
]).then(chatResponse => {
|
||||
], null, null).then(chatResponse => {
|
||||
|
||||
const title = chatResponse?.content
|
||||
?.trim()
|
||||
|
||||
@@ -1,6 +1,6 @@
|
||||
import * as React from 'react';
|
||||
|
||||
import { Box, Button, ButtonGroup, CircularProgress, Divider, Grid, IconButton } from '@mui/joy';
|
||||
import { Box, Button, ButtonGroup, CircularProgress, Divider, Grid, IconButton, Input, FormControl, FormLabel } from '@mui/joy';
|
||||
import AccountTreeIcon from '@mui/icons-material/AccountTree';
|
||||
import ExpandLessIcon from '@mui/icons-material/ExpandLess';
|
||||
import ExpandMoreIcon from '@mui/icons-material/ExpandMore';
|
||||
@@ -8,8 +8,9 @@ import ReplayIcon from '@mui/icons-material/Replay';
|
||||
import StopOutlinedIcon from '@mui/icons-material/StopOutlined';
|
||||
import TelegramIcon from '@mui/icons-material/Telegram';
|
||||
|
||||
import { llmStreamingChatGenerate } from '~/modules/llms/llm.client';
|
||||
|
||||
import { ChatMessage } from '../../../apps/chat/components/message/ChatMessage';
|
||||
import { streamChat } from '~/modules/llms/transports/streamChat';
|
||||
|
||||
import { GoodModal } from '~/common/components/GoodModal';
|
||||
import { InlineError } from '~/common/components/InlineError';
|
||||
@@ -48,6 +49,7 @@ export function DiagramsModal(props: { config: DiagramConfig, onClose: () => voi
|
||||
const [message, setMessage] = React.useState<DMessage | null>(null);
|
||||
const [diagramType, diagramComponent] = useFormRadio<DiagramType>('auto', diagramTypes, 'Visualize');
|
||||
const [diagramLanguage, languageComponent] = useFormRadio<DiagramLanguage>('plantuml', diagramLanguages, 'Style');
|
||||
const [customInstruction, setCustomInstruction] = React.useState<string>('');
|
||||
const [errorMessage, setErrorMessage] = React.useState<string | null>(null);
|
||||
const [abortController, setAbortController] = React.useState<AbortController | null>(null);
|
||||
|
||||
@@ -81,10 +83,10 @@ export function DiagramsModal(props: { config: DiagramConfig, onClose: () => voi
const stepAbortController = new AbortController();
setAbortController(stepAbortController);

- const diagramPrompt = bigDiagramPrompt(diagramType, diagramLanguage, systemMessage.text, subject);
+ const diagramPrompt = bigDiagramPrompt(diagramType, diagramLanguage, systemMessage.text, subject, customInstruction);

try {
- await streamChat(diagramLlm.id, diagramPrompt, stepAbortController.signal,
+ await llmStreamingChatGenerate(diagramLlm.id, diagramPrompt, null, null, stepAbortController.signal,
(update: Partial<{ text: string, typing: boolean, originLLM: string }>) => {
assistantMessage = { ...assistantMessage, ...update };
setMessage(assistantMessage);
@@ -103,7 +105,7 @@ export function DiagramsModal(props: { config: DiagramConfig, onClose: () => voi
setAbortController(null);
}

- }, [abortController, conversationId, diagramLanguage, diagramLlm, diagramType, subject]);
+ }, [abortController, conversationId, diagramLanguage, diagramLlm, diagramType, subject, customInstruction]);


// [Effect] Auto-abort on unmount
@@ -149,6 +151,12 @@ export function DiagramsModal(props: { config: DiagramConfig, onClose: () => voi
<Grid xs={12} xl={6}>
{llmComponent}
</Grid>
+ <Grid xs={12} md={6}>
+ <FormControl>
+ <FormLabel>Custom Instruction</FormLabel>
+ <Input title="Custom Instruction" placeholder='e.g. visualize as state' value={customInstruction} onChange={(e) => setCustomInstruction(e.target.value)} />
+ </FormControl>
+ </Grid>
</Grid>
)}


@@ -1,6 +1,5 @@
- import type { VChatMessageIn } from '~/modules/llms/transports/chatGenerate';

import type { FormRadioOption } from '~/common/components/forms/FormRadioControl';
+ import type { VChatMessageIn } from '~/modules/llms/llm.client';


export type DiagramType = 'auto' | 'mind';
@@ -60,12 +59,15 @@ function plantumlDiagramPrompt(diagramType: DiagramType): { sys: string, usr: st
}
}

- export function bigDiagramPrompt(diagramType: DiagramType, diagramLanguage: DiagramLanguage, chatSystemPrompt: string, subject: string): VChatMessageIn[] {
+ export function bigDiagramPrompt(diagramType: DiagramType, diagramLanguage: DiagramLanguage, chatSystemPrompt: string, subject: string, customInstruction: string): VChatMessageIn[] {
const { sys, usr } = diagramLanguage === 'mermaid' ? mermaidDiagramPrompt(diagramType) : plantumlDiagramPrompt(diagramType);
+ if (customInstruction) {
+ customInstruction = 'Also consider the following instructions: ' + customInstruction;
+ }
return [
{ role: 'system', content: sys },
{ role: 'system', content: chatSystemPrompt },
{ role: 'assistant', content: subject },
- { role: 'user', content: usr },
+ { role: 'user', content: `${usr} ${customInstruction}` },
];
}
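The `customInstruction` handling in the hunk above can be exercised in isolation. The sketch below re-implements just the message assembly with simplified, hypothetical types (the real code uses `VChatMessageIn` and the `sys`/`usr` prompt pair from `mermaidDiagramPrompt`/`plantumlDiagramPrompt`):

```typescript
// Simplified stand-in for VChatMessageIn; illustrative only.
interface ChatMessage { role: 'assistant' | 'system' | 'user'; content: string }

// Mirrors bigDiagramPrompt's handling of the new optional customInstruction argument.
function buildDiagramMessages(sys: string, chatSystemPrompt: string, subject: string, usr: string, customInstruction: string): ChatMessage[] {
  if (customInstruction)
    customInstruction = 'Also consider the following instructions: ' + customInstruction;
  return [
    { role: 'system', content: sys },
    { role: 'system', content: chatSystemPrompt },
    { role: 'assistant', content: subject },
    // the instruction rides along on the final user turn
    { role: 'user', content: `${usr} ${customInstruction}` },
  ];
}

const msgs = buildDiagramMessages('diagram rules', 'chat system', 'the subject', 'Draw it.', 'visualize as state');
console.log(msgs[3].content);
```

Note that when `customInstruction` is empty the template still leaves a trailing space on the user turn, matching the diff's behavior.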
@@ -1,4 +1,4 @@
- import { callChatGenerate } from '~/modules/llms/transports/chatGenerate';
+ import { llmChatGenerateOrThrow } from '~/modules/llms/llm.client';
import { useModelsStore } from '~/modules/llms/store-llms';


@@ -14,10 +14,10 @@ export async function imaginePromptFromText(messageText: string): Promise<string
const { fastLLMId } = useModelsStore.getState();
if (!fastLLMId) return null;
try {
- const chatResponse = await callChatGenerate(fastLLMId, [
+ const chatResponse = await llmChatGenerateOrThrow(fastLLMId, [
{ role: 'system', content: simpleImagineSystemPrompt },
{ role: 'user', content: 'Write a prompt, based on the following input.\n\n```\n' + messageText.slice(0, 1000) + '\n```\n' },
- ]);
+ ], null, null);
return chatResponse.content?.trim() ?? null;
} catch (error: any) {
console.error('imaginePromptFromText: fetch request error:', error);
@@ -5,7 +5,7 @@
import { DLLMId } from '~/modules/llms/store-llms';
import { callApiSearchGoogle } from '~/modules/google/search.client';
import { callBrowseFetchPage } from '~/modules/browse/browse.client';
- import { callChatGenerate, VChatMessageIn } from '~/modules/llms/transports/chatGenerate';
+ import { llmChatGenerateOrThrow, VChatMessageIn } from '~/modules/llms/llm.client';


// prompt to implement the ReAct paradigm: https://arxiv.org/abs/2210.03629
@@ -128,7 +128,7 @@ export class Agent {
S.messages.push({ role: 'user', content: prompt });
let content: string;
try {
- content = (await callChatGenerate(llmId, S.messages, 500)).content;
+ content = (await llmChatGenerateOrThrow(llmId, S.messages, null, null, 500)).content;
} catch (error: any) {
content = `Error in callChat: ${error}`;
}
@@ -1,5 +1,5 @@
import { DLLMId, findLLMOrThrow } from '~/modules/llms/store-llms';
- import { callChatGenerate } from '~/modules/llms/transports/chatGenerate';
+ import { llmChatGenerateOrThrow } from '~/modules/llms/llm.client';


// prompt to be tried when doing recursive summarization.
@@ -80,10 +80,10 @@ async function cleanUpContent(chunk: string, llmId: DLLMId, _ignored_was_targetW
const autoResponseTokensSize = Math.floor(contextTokens * outputTokenShare);

try {
- const chatResponse = await callChatGenerate(llmId, [
+ const chatResponse = await llmChatGenerateOrThrow(llmId, [
{ role: 'system', content: cleanupPrompt },
{ role: 'user', content: chunk },
- ], autoResponseTokensSize);
+ ], null, null, autoResponseTokensSize);
return chatResponse?.content ?? '';
} catch (error: any) {
return '';
@@ -1,8 +1,7 @@
import * as React from 'react';

import type { DLLMId } from '~/modules/llms/store-llms';
- import type { VChatMessageIn } from '~/modules/llms/transports/chatGenerate';
- import { streamChat } from '~/modules/llms/transports/streamChat';
+ import { llmStreamingChatGenerate, VChatMessageIn } from '~/modules/llms/llm.client';


export function useStreamChatText() {
@@ -25,7 +24,7 @@ export function useStreamChatText() {

try {
let lastText = '';
- await streamChat(llmId, prompt, abortControllerRef.current.signal, (update) => {
+ await llmStreamingChatGenerate(llmId, prompt, null, null, abortControllerRef.current.signal, (update) => {
if (update.text) {
lastText = update.text;
setPartialText(lastText);
@@ -1,5 +1,10 @@
import { z } from 'zod';

+ import type { BackendCapabilities } from '~/modules/backend/state-backend';

import { createTRPCRouter, publicProcedure } from '~/server/api/trpc.server';
import { env } from '~/server/env.mjs';
+ import { fetchJsonOrTRPCError } from '~/server/api/trpc.serverutils';

import { analyticsListCapabilities } from './backend.analytics';

@@ -23,11 +28,26 @@ export const backendRouter = createTRPCRouter({
hasImagingProdia: !!env.PRODIA_API_KEY,
hasLlmAnthropic: !!env.ANTHROPIC_API_KEY,
hasLlmAzureOpenAI: !!env.AZURE_OPENAI_API_KEY && !!env.AZURE_OPENAI_API_ENDPOINT,
+ hasLlmGemini: !!env.GEMINI_API_KEY,
+ hasLlmMistral: !!env.MISTRAL_API_KEY,
hasLlmOllama: !!env.OLLAMA_API_HOST,
hasLlmOpenAI: !!env.OPENAI_API_KEY || !!env.OPENAI_API_HOST,
hasLlmOpenRouter: !!env.OPENROUTER_API_KEY,
hasVoiceElevenLabs: !!env.ELEVENLABS_API_KEY,
- };
+ } satisfies BackendCapabilities;
}),

+ // The following are used for various OAuth integrations

+ /* Exchange the OpenRouter 'code' (from PKCE) for an OpenRouter API Key */
+ exchangeOpenRouterKey: publicProcedure
+ .input(z.object({ code: z.string() }))
+ .query(async ({ input }) => {
+ // Documented here: https://openrouter.ai/docs#oauth
+ return await fetchJsonOrTRPCError<{ key: string }, { code: string }>('https://openrouter.ai/api/v1/auth/keys', 'POST', {}, {
+ code: input.code,
+ }, 'Backend.exchangeOpenRouterKey');
+ }),

});
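The change from a plain `};` to `} satisfies BackendCapabilities;` in the hunk above uses TypeScript's `satisfies` operator (available since TS 4.9): the object literal is checked against the interface without widening its inferred type, so a missing or misspelled capability flag fails at compile time. A minimal sketch, with a hypothetical two-field interface standing in for the real one:

```typescript
// Hypothetical, trimmed-down capabilities shape; the real interface lives in state-backend.
interface Capabilities {
  hasLlmGemini: boolean;
  hasLlmMistral: boolean;
}

// `satisfies` validates the literal against Capabilities while keeping the narrow
// inferred type; adding an unknown key here would be a compile-time error.
const caps = {
  hasLlmGemini: true,
  hasLlmMistral: false,
} satisfies Capabilities;

console.log(caps.hasLlmGemini, caps.hasLlmMistral);
```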
@@ -9,6 +9,8 @@ export interface BackendCapabilities {
hasImagingProdia: boolean;
hasLlmAnthropic: boolean;
hasLlmAzureOpenAI: boolean;
+ hasLlmGemini: boolean;
+ hasLlmMistral: boolean;
hasLlmOllama: boolean;
hasLlmOpenAI: boolean;
hasLlmOpenRouter: boolean;
@@ -30,6 +32,8 @@ const useBackendStore = create<BackendStore>()(
hasImagingProdia: false,
hasLlmAnthropic: false,
hasLlmAzureOpenAI: false,
+ hasLlmGemini: false,
+ hasLlmMistral: false,
hasLlmOllama: false,
hasLlmOpenAI: false,
hasLlmOpenRouter: false,

@@ -1,4 +1,4 @@
- import create from 'zustand';
+ import { create } from 'zustand';
import { persist } from 'zustand/middleware';

import { CapabilityBrowsing } from '~/common/components/useCapabilities';

@@ -0,0 +1,74 @@
import type { DLLMId } from './store-llms';
import type { OpenAIWire } from './server/openai/openai.wiretypes';
import { findVendorForLlmOrThrow } from './vendors/vendors.registry';


// LLM Client Types
// NOTE: Model List types in '../server/llm.server.types';

export interface VChatMessageIn {
role: 'assistant' | 'system' | 'user'; // | 'function';
content: string;
//name?: string; // when role: 'function'
}

export type VChatFunctionIn = OpenAIWire.ChatCompletion.RequestFunctionDef;

export interface VChatMessageOut {
role: 'assistant' | 'system' | 'user';
content: string;
finish_reason: 'stop' | 'length' | null;
}

export interface VChatMessageOrFunctionCallOut extends VChatMessageOut {
function_name: string;
function_arguments: object | null;
}


// LLM Client Functions

export async function llmChatGenerateOrThrow<TSourceSetup = unknown, TAccess = unknown, TLLMOptions = unknown>(
llmId: DLLMId,
messages: VChatMessageIn[],
functions: VChatFunctionIn[] | null, forceFunctionName: string | null,
maxTokens?: number,
): Promise<VChatMessageOut | VChatMessageOrFunctionCallOut> {

// id to DLLM and vendor
const { llm, vendor } = findVendorForLlmOrThrow<TSourceSetup, TAccess, TLLMOptions>(llmId);

// FIXME: relax the forced cast
const options = llm.options as TLLMOptions;

// get the access
const partialSourceSetup = llm._source.setup;
const access = vendor.getTransportAccess(partialSourceSetup);

// execute via the vendor
return await vendor.rpcChatGenerateOrThrow(access, options, messages, functions, forceFunctionName, maxTokens);
}


export async function llmStreamingChatGenerate<TSourceSetup = unknown, TAccess = unknown, TLLMOptions = unknown>(
llmId: DLLMId,
messages: VChatMessageIn[],
functions: VChatFunctionIn[] | null,
forceFunctionName: string | null,
abortSignal: AbortSignal,
onUpdate: (update: Partial<{ text: string, typing: boolean, originLLM: string }>, done: boolean) => void,
): Promise<void> {

// id to DLLM and vendor
const { llm, vendor } = findVendorForLlmOrThrow<TSourceSetup, TAccess, TLLMOptions>(llmId);

// FIXME: relax the forced cast
const llmOptions = llm.options as TLLMOptions;

// get the access
const partialSourceSetup = llm._source.setup;
const access = vendor.getTransportAccess(partialSourceSetup); // as ChatStreamInputSchema['access'];

// execute via the vendor
return await vendor.streamingChatGenerateOrThrow(access, llmId, llmOptions, messages, functions, forceFunctionName, abortSignal, onUpdate);
}
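Call sites migrating from the old `callChatGenerate(llmId, messages, maxTokens)` to this new client pass `null` for the two inserted arguments when no function-calling is requested, as the other hunks in this diff show. A hypothetical shim illustrating the argument shuffle (the stub below stands in for the real network-backed function):

```typescript
type Msg = { role: 'assistant' | 'system' | 'user'; content: string };
type Out = { role: 'assistant'; content: string; finish_reason: 'stop' | 'length' | null };

// Stand-in for the real llmChatGenerateOrThrow, which dispatches to a vendor transport.
async function llmChatGenerateOrThrow(llmId: string, messages: Msg[], functions: object[] | null, forceFunctionName: string | null, maxTokens?: number): Promise<Out> {
  const last = messages[messages.length - 1];
  return { role: 'assistant', content: `echo: ${last.content}`, finish_reason: 'stop' };
}

// Old-style three-argument call, expressed through the new five-argument API.
async function callChatGenerate(llmId: string, messages: Msg[], maxTokens?: number): Promise<Out> {
  return llmChatGenerateOrThrow(llmId, messages, null, null, maxTokens);
}

callChatGenerate('some-llm', [{ role: 'user', content: 'hi' }], 500).then(r => console.log(r.content));
```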
+4 -4
@@ -7,7 +7,7 @@ import VisibilityIcon from '@mui/icons-material/Visibility';
import VisibilityOffIcon from '@mui/icons-material/VisibilityOff';

import { DLLMId, useModelsStore } from '~/modules/llms/store-llms';
- import { findVendorById } from '~/modules/llms/vendors/vendor.registry';
+ import { findVendorById } from '~/modules/llms/vendors/vendors.registry';

import { FormLabelStart } from '~/common/components/forms/FormLabelStart';
import { GoodModal } from '~/common/components/GoodModal';
@@ -117,9 +117,9 @@ export function LLMOptionsModal(props: { id: DLLMId }) {
<FormLabelStart title='Details' sx={{ minWidth: 80 }} onClick={() => setShowDetails(!showDetails)} />
{showDetails && <Typography level='body-sm' sx={{ display: 'block' }}>
[{llm.id}]: {llm.options.llmRef && `${llm.options.llmRef} · `}
- {llm.contextTokens && `context tokens: ${llm.contextTokens.toLocaleString()} · `}
- {llm.maxOutputTokens && `max output tokens: ${llm.maxOutputTokens.toLocaleString()} · `}
- {llm.created && `created: ${(new Date(llm.created * 1000)).toLocaleString()} · `}
+ {!!llm.contextTokens && `context tokens: ${llm.contextTokens.toLocaleString()} · `}
+ {!!llm.maxOutputTokens && `max output tokens: ${llm.maxOutputTokens.toLocaleString()} · `}
+ {!!llm.created && `created: ${(new Date(llm.created * 1000)).toLocaleString()} · `}
description: {llm.description}
{/*· tags: {llm.tags.join(', ')}*/}
</Typography>}
@@ -7,7 +7,7 @@ import VisibilityOffOutlinedIcon from '@mui/icons-material/VisibilityOffOutlined

import { DLLM, DModelSourceId, useModelsStore } from '~/modules/llms/store-llms';
import { IModelVendor } from '~/modules/llms/vendors/IModelVendor';
- import { findVendorById } from '~/modules/llms/vendors/vendor.registry';
+ import { findVendorById } from '~/modules/llms/vendors/vendors.registry';

import { GoodTooltip } from '~/common/components/GoodTooltip';
import { openLayoutLLMOptions } from '~/common/layout/store-applayout';
@@ -109,8 +109,15 @@ export function ModelsList(props: {
<List variant='soft' size='sm' sx={{
borderRadius: 'sm',
pl: { xs: 0, md: 1 },
overflowY: 'auto',
}}>
- {items}
+ {items.length > 0 ? items : (
+ <ListItem>
+ <Typography level='body-sm'>
+ Please configure the service and update the list of models.
+ </Typography>
+ </ListItem>
+ )}
</List>
);
}
+2 -2
@@ -4,7 +4,7 @@ import { shallow } from 'zustand/shallow';
import { Box, Checkbox, Divider } from '@mui/joy';

import { DModelSource, DModelSourceId, useModelsStore } from '~/modules/llms/store-llms';
- import { createModelSourceForDefaultVendor, findVendorById } from '~/modules/llms/vendors/vendor.registry';
+ import { createModelSourceForDefaultVendor, findVendorById } from '~/modules/llms/vendors/vendors.registry';

import { GoodModal } from '~/common/components/GoodModal';
import { closeLayoutModelsSetup, openLayoutModelsSetup, useLayoutModelsSetup } from '~/common/layout/store-applayout';
@@ -65,7 +65,7 @@ export function ModelsModal(props: { suspendAutoModelsSetup?: boolean }) {
title={<>Configure <b>AI Models</b></>}
startButton={
multiSource ? <Checkbox
- label='all vendors' sx={{ my: 'auto' }}
+ label='All Services' sx={{ my: 'auto' }}
checked={showAllSources} onChange={() => setShowAllSources(all => !all)}
/> : undefined
}
+9 -5
@@ -5,9 +5,9 @@ import { Avatar, Badge, Box, Button, IconButton, ListItemDecorator, MenuItem, Op
import AddIcon from '@mui/icons-material/Add';
import DeleteOutlineIcon from '@mui/icons-material/DeleteOutline';

- import { type DModelSourceId, useModelsStore } from '~/modules/llms/store-llms';
- import { type IModelVendor, type ModelVendorId } from '~/modules/llms/vendors/IModelVendor';
- import { createModelSourceForVendor, findAllVendors, findVendorById } from '~/modules/llms/vendors/vendor.registry';
+ import type { IModelVendor } from '~/modules/llms/vendors/IModelVendor';
+ import { DModelSourceId, useModelsStore } from '~/modules/llms/store-llms';
+ import { createModelSourceForVendor, findAllVendors, findVendorById, ModelVendorId } from '~/modules/llms/vendors/vendors.registry';

import { CloseableMenu } from '~/common/components/CloseableMenu';
import { ConfirmationModal } from '~/common/components/ConfirmationModal';
@@ -29,7 +29,7 @@ function vendorIcon(vendor: IModelVendor | null, greenMark: boolean) {
icon = <vendor.Icon />;
}
return (greenMark && icon)
- ? <Badge color='primary' size='sm' badgeContent=''>{icon}</Badge>
+ ? <Badge color='success' size='sm' badgeContent=''>{icon}</Badge>
: icon;
}

@@ -92,7 +92,11 @@ export function ModelsSourceSelector(props: {
<ListItemDecorator>
{vendorIcon(vendor, !!vendor.hasBackendCap && vendor.hasBackendCap())}
</ListItemDecorator>
- {vendor.name}{/*{sourceCount > 0 && ` (added)`}*/}
+ {vendor.name}
+ {/*{sourceCount > 0 && ` (added)`}*/}
{!!vendor.hasFreeModels && ` 🎁`}
{/*{!!vendor.instanceLimit && ` (${sourceCount}/${vendor.instanceLimit})`}*/}
{vendor.location === 'local' && <span style={{ opacity: 0.5 }}>local</span>}
</MenuItem>
),
};
+2 -2
@@ -1,6 +1,6 @@
- import type { ModelDescriptionSchema } from '../server.schemas';
+ import type { ModelDescriptionSchema } from '../llm.server.types';

- import { LLM_IF_OAI_Chat } from '../../../store-llms';
+ import { LLM_IF_OAI_Chat } from '../../store-llms';

const roundTime = (date: string) => Math.round(new Date(date).getTime() / 1000);

+1 -1
@@ -6,7 +6,7 @@ import { env } from '~/server/env.mjs';
import { fetchJsonOrTRPCError } from '~/server/api/trpc.serverutils';

import { fixupHost, openAIChatGenerateOutputSchema, OpenAIHistorySchema, openAIHistorySchema, OpenAIModelSchema, openAIModelSchema } from '../openai/openai.router';
- import { listModelsOutputSchema } from '../server.schemas';
+ import { listModelsOutputSchema } from '../llm.server.types';

import { AnthropicWire } from './anthropic.wiretypes';
import { hardcodedAnthropicModels } from './anthropic.models';
@@ -0,0 +1,216 @@
import { z } from 'zod';
import { TRPCError } from '@trpc/server';
import { env } from '~/server/env.mjs';

import packageJson from '../../../../../package.json';

import { createTRPCRouter, publicProcedure } from '~/server/api/trpc.server';
import { fetchJsonOrTRPCError } from '~/server/api/trpc.serverutils';

import { LLM_IF_OAI_Chat, LLM_IF_OAI_Vision } from '../../store-llms';
import { listModelsOutputSchema, ModelDescriptionSchema } from '../llm.server.types';

import { fixupHost, openAIChatGenerateOutputSchema, OpenAIHistorySchema, openAIHistorySchema, OpenAIModelSchema, openAIModelSchema } from '../openai/openai.router';

import { GeminiBlockSafetyLevel, geminiBlockSafetyLevelSchema, GeminiContentSchema, GeminiGenerateContentRequest, geminiGeneratedContentResponseSchema, geminiModelsGenerateContentPath, geminiModelsListOutputSchema, geminiModelsListPath } from './gemini.wiretypes';


// Default hosts
const DEFAULT_GEMINI_HOST = 'https://generativelanguage.googleapis.com';


// Mappers

export function geminiAccess(access: GeminiAccessSchema, modelRefId: string | null, apiPath: string): { headers: HeadersInit, url: string } {

const geminiKey = access.geminiKey || env.GEMINI_API_KEY || '';
const geminiHost = fixupHost(DEFAULT_GEMINI_HOST, apiPath);

// update model-dependent paths
if (apiPath.includes('{model=models/*}')) {
if (!modelRefId)
throw new Error(`geminiAccess: modelRefId is required for ${apiPath}`);
apiPath = apiPath.replace('{model=models/*}', modelRefId);
}

return {
headers: {
'Content-Type': 'application/json',
'x-goog-api-client': `big-agi/${packageJson['version'] || '1.0.0'}`,
'x-goog-api-key': geminiKey,
},
url: geminiHost + apiPath,
};
}
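The `{model=models/*}` substitution performed by `geminiAccess` can be checked standalone. This sketch duplicates just the path logic (same default host and template token as the function above, without the key/header handling):

```typescript
const DEFAULT_GEMINI_HOST = 'https://generativelanguage.googleapis.com';

// Same template substitution as geminiAccess: the model ref is spliced into the path.
function resolveGeminiUrl(apiPath: string, modelRefId: string | null): string {
  if (apiPath.includes('{model=models/*}')) {
    if (!modelRefId)
      throw new Error(`modelRefId is required for ${apiPath}`);
    apiPath = apiPath.replace('{model=models/*}', modelRefId);
  }
  return DEFAULT_GEMINI_HOST + apiPath;
}

console.log(resolveGeminiUrl('/v1beta/{model=models/*}:generateContent', 'models/gemini-pro'));
```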

/**
* We specially encode the history to match the Gemini API requirements.
* Gemini does not want 2 consecutive messages from the same role, so we alternate.
* - System messages = [User, Model 'Ok']
* - User and Assistant messages are coalesced into a single message (e.g. [User, User, Assistant, Assistant, User] -> [User[2], Assistant[2], User[1]])
*/
export const geminiGenerateContentTextPayload = (model: OpenAIModelSchema, history: OpenAIHistorySchema, safety: GeminiBlockSafetyLevel, n: number): GeminiGenerateContentRequest => {

// convert the history to a Gemini format
const contents: GeminiContentSchema[] = [];
for (const _historyElement of history) {

const { role: msgRole, content: msgContent } = _historyElement;

// System message - we treat it as per the example in https://ai.google.dev/tutorials/ai-studio_quickstart#chat_example
if (msgRole === 'system') {
contents.push({ role: 'user', parts: [{ text: msgContent }] });
contents.push({ role: 'model', parts: [{ text: 'Ok' }] });
continue;
}

// User or Assistant message
const nextRole: GeminiContentSchema['role'] = msgRole === 'assistant' ? 'model' : 'user';
if (contents.length && contents[contents.length - 1].role === nextRole) {
// coalesce with the previous message
contents[contents.length - 1].parts.push({ text: msgContent });
} else {
// create a new message
contents.push({ role: nextRole, parts: [{ text: msgContent }] });
}
}

return {
contents,
generationConfig: {
...(n >= 2 && { candidateCount: n }),
...(model.maxTokens && { maxOutputTokens: model.maxTokens }),
temperature: model.temperature,
},
safetySettings: safety !== 'HARM_BLOCK_THRESHOLD_UNSPECIFIED' ? [
{ category: 'HARM_CATEGORY_SEXUALLY_EXPLICIT', threshold: safety },
{ category: 'HARM_CATEGORY_HATE_SPEECH', threshold: safety },
{ category: 'HARM_CATEGORY_HARASSMENT', threshold: safety },
{ category: 'HARM_CATEGORY_DANGEROUS_CONTENT', threshold: safety },
] : undefined,
};
};
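The alternation rule documented above (system turns expanded to a [user, model 'Ok'] pair; consecutive same-role turns merged) can be demonstrated with a trimmed re-implementation of just the loop, using simplified message types:

```typescript
// Simplified shapes; the real code uses OpenAIHistorySchema / GeminiContentSchema.
type OaiMsg = { role: 'system' | 'user' | 'assistant'; content: string };
type GeminiContent = { role: 'user' | 'model'; parts: { text: string }[] };

function toGeminiContents(history: OaiMsg[]): GeminiContent[] {
  const contents: GeminiContent[] = [];
  for (const { role, content } of history) {
    // system → [user, model 'Ok'] pair, per the Gemini chat example
    if (role === 'system') {
      contents.push({ role: 'user', parts: [{ text: content }] });
      contents.push({ role: 'model', parts: [{ text: 'Ok' }] });
      continue;
    }
    const nextRole = role === 'assistant' ? 'model' : 'user';
    if (contents.length && contents[contents.length - 1].role === nextRole)
      contents[contents.length - 1].parts.push({ text: content });  // coalesce same-role turns
    else
      contents.push({ role: nextRole, parts: [{ text: content }] });
  }
  return contents;
}

const out = toGeminiContents([
  { role: 'system', content: 'be terse' },
  { role: 'user', content: 'a' },
  { role: 'user', content: 'b' },
  { role: 'assistant', content: 'c' },
]);
console.log(out.map(c => `${c.role}[${c.parts.length}]`).join(' '));
// → user[1] model[1] user[2] model[1]
```

The two consecutive user turns become one `user` content with two parts, so roles always alternate in the output.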


async function geminiGET<TOut extends object>(access: GeminiAccessSchema, modelRefId: string | null, apiPath: string /*, signal?: AbortSignal*/): Promise<TOut> {
const { headers, url } = geminiAccess(access, modelRefId, apiPath);
return await fetchJsonOrTRPCError<TOut>(url, 'GET', headers, undefined, 'Gemini');
}

async function geminiPOST<TOut extends object, TPostBody extends object>(access: GeminiAccessSchema, modelRefId: string | null, body: TPostBody, apiPath: string /*, signal?: AbortSignal*/): Promise<TOut> {
const { headers, url } = geminiAccess(access, modelRefId, apiPath);
return await fetchJsonOrTRPCError<TOut, TPostBody>(url, 'POST', headers, body, 'Gemini');
}


// Input/Output Schemas

export const geminiAccessSchema = z.object({
dialect: z.enum(['gemini']),
geminiKey: z.string(),
minSafetyLevel: geminiBlockSafetyLevelSchema,
});
export type GeminiAccessSchema = z.infer<typeof geminiAccessSchema>;


const accessOnlySchema = z.object({
access: geminiAccessSchema,
});

const chatGenerateInputSchema = z.object({
access: geminiAccessSchema,
model: openAIModelSchema, history: openAIHistorySchema,
// functions: openAIFunctionsSchema.optional(), forceFunctionName: z.string().optional(),
});


/**
* See https://github.com/google/generative-ai-js/tree/main/packages/main/src for
* the official Google implementation.
*/
export const llmGeminiRouter = createTRPCRouter({

/* [Gemini] models.list = /v1beta/models */
listModels: publicProcedure
.input(accessOnlySchema)
.output(listModelsOutputSchema)
.query(async ({ input }) => {

// get the models
const wireModels = await geminiGET(input.access, null, geminiModelsListPath);
const detailedModels = geminiModelsListOutputSchema.parse(wireModels).models;

// NOTE: no need to retrieve info for each of the models (e.g. /v1beta/model/gemini-pro),
// as the List API already returns all the info on all the models

// map to our output schema
return {
models: detailedModels.map((geminiModel) => {
const { description, displayName, inputTokenLimit, name, outputTokenLimit, supportedGenerationMethods } = geminiModel;

const contextWindow = inputTokenLimit + outputTokenLimit;
const hidden = !supportedGenerationMethods.includes('generateContent');

const { version, topK, topP, temperature } = geminiModel;
const descriptionLong = description + ` (Version: ${version}, Defaults: temperature=${temperature}, topP=${topP}, topK=${topK}, interfaces=[${supportedGenerationMethods.join(',')}])`;

// const isGeminiPro = name.includes('gemini-pro');
const isGeminiProVision = name.includes('gemini-pro-vision');

const interfaces: ModelDescriptionSchema['interfaces'] = [];
if (supportedGenerationMethods.includes('generateContent')) {
interfaces.push(LLM_IF_OAI_Chat);
if (isGeminiProVision)
interfaces.push(LLM_IF_OAI_Vision);
}

return {
id: name,
label: displayName,
// created: ...
// updated: ...
description: descriptionLong,
contextWindow: contextWindow,
maxCompletionTokens: outputTokenLimit,
// pricing: isGeminiPro ? { needs per-character and per-image pricing } : undefined,
// rateLimits: isGeminiPro ? { reqPerMinute: 60 } : undefined,
interfaces: supportedGenerationMethods.includes('generateContent') ? [LLM_IF_OAI_Chat] : [],
hidden,
} satisfies ModelDescriptionSchema;
}),
};
}),


/* [Gemini] models.generateContent = /v1/{model=models/*}:generateContent */
chatGenerate: publicProcedure
.input(chatGenerateInputSchema)
.output(openAIChatGenerateOutputSchema)
.mutation(async ({ input: { access, history, model } }) => {

// generate the content
const wireGeneration = await geminiPOST(access, model.id, geminiGenerateContentTextPayload(model, history, access.minSafetyLevel, 1), geminiModelsGenerateContentPath);
const generation = geminiGeneratedContentResponseSchema.parse(wireGeneration);

// only use the first result (and there should be only one)
const singleCandidate = generation.candidates?.[0] ?? null;
if (!singleCandidate || !singleCandidate.content?.parts.length)
throw new TRPCError({
code: 'INTERNAL_SERVER_ERROR',
message: `Gemini chat-generation API issue: ${JSON.stringify(wireGeneration)}`,
});

if (!('text' in singleCandidate.content.parts[0]))
throw new TRPCError({
code: 'INTERNAL_SERVER_ERROR',
message: `Gemini non-text chat-generation API issue: ${JSON.stringify(wireGeneration)}`,
});

return {
role: 'assistant',
content: singleCandidate.content.parts[0].text || '',
finish_reason: singleCandidate.finishReason === 'STOP' ? 'stop' : null,
};
}),

});
@@ -0,0 +1,188 @@
|
||||
import { z } from 'zod';
|
||||
|
||||
// PATHS
|
||||
|
||||
export const geminiModelsListPath = '/v1beta/models?pageSize=1000';
|
||||
export const geminiModelsGenerateContentPath = '/v1beta/{model=models/*}:generateContent';
|
||||
// see alt=sse on https://cloud.google.com/apis/docs/system-parameters#definitions
|
||||
export const geminiModelsStreamGenerateContentPath = '/v1beta/{model=models/*}:streamGenerateContent?alt=sse';
|
||||
|
||||
|
||||
// models.list = /v1beta/models
|
||||
|
||||
export const geminiModelsListOutputSchema = z.object({
|
||||
models: z.array(z.object({
|
||||
name: z.string(),
|
||||
version: z.string(),
|
||||
displayName: z.string(),
|
||||
description: z.string(),
|
||||
inputTokenLimit: z.number().int().min(1),
|
||||
outputTokenLimit: z.number().int().min(1),
|
||||
supportedGenerationMethods: z.array(z.enum([
|
||||
'countMessageTokens',
|
||||
'countTextTokens',
|
||||
'countTokens',
|
||||
'createTunedTextModel',
|
||||
'embedContent',
|
||||
'embedText',
|
||||
'generateAnswer',
|
||||
'generateContent',
|
||||
'generateMessage',
|
||||
'generateText',
|
||||
])),
|
||||
temperature: z.number().optional(),
|
||||
topP: z.number().optional(),
|
||||
topK: z.number().optional(),
|
||||
})),
|
||||
});
|
||||
|
||||
|
||||
// /v1/{model=models/*}:generateContent, /v1beta/{model=models/*}:streamGenerateContent
|
||||
|
||||
// Request
|
||||
|
||||
const geminiContentPartSchema = z.union([
|
||||
|
||||
// TextPart
|
||||
z.object({
|
||||
text: z.string().optional(),
|
||||
}),
|
||||
|
||||
// InlineDataPart
|
||||
z.object({
|
||||
inlineData: z.object({
|
||||
mimeType: z.string(),
|
||||
data: z.string(), // base64-encoded string
|
||||
}),
|
||||
}),
|
||||
|
||||
// A predicted FunctionCall returned from the model
|
||||
z.object({
|
||||
functionCall: z.object({
|
||||
name: z.string(),
|
||||
args: z.record(z.any()), // JSON object format
|
||||
}),
|
||||
}),
|
||||
|
||||
// The result output of a FunctionCall
|
||||
z.object({
|
||||
functionResponse: z.object({
|
||||
name: z.string(),
|
||||
response: z.record(z.any()), // JSON object format
|
||||
}),
|
||||
}),
|
||||
]);
|
||||
|
||||
const geminiToolSchema = z.object({
|
||||
functionDeclarations: z.array(z.object({
|
||||
name: z.string(),
|
||||
description: z.string(),
|
||||
parameters: z.record(z.any()).optional(), // Schema object format
|
||||
})).optional(),
|
||||
});
|
||||
|
||||
const geminiHarmCategorySchema = z.enum([
|
||||
'HARM_CATEGORY_UNSPECIFIED',
|
||||
'HARM_CATEGORY_DEROGATORY',
|
||||
'HARM_CATEGORY_TOXICITY',
|
||||
'HARM_CATEGORY_VIOLENCE',
|
||||
'HARM_CATEGORY_SEXUAL',
|
||||
'HARM_CATEGORY_MEDICAL',
|
||||
'HARM_CATEGORY_DANGEROUS',
|
||||
'HARM_CATEGORY_HARASSMENT',
|
||||
'HARM_CATEGORY_HATE_SPEECH',
|
||||
'HARM_CATEGORY_SEXUALLY_EXPLICIT',
|
||||
'HARM_CATEGORY_DANGEROUS_CONTENT',
|
||||
]);
|
||||
|
||||
export const geminiBlockSafetyLevelSchema = z.enum([
|
||||
'HARM_BLOCK_THRESHOLD_UNSPECIFIED',
|
||||
'BLOCK_LOW_AND_ABOVE',
|
||||
'BLOCK_MEDIUM_AND_ABOVE',
|
||||
'BLOCK_ONLY_HIGH',
|
||||
'BLOCK_NONE',
|
||||
]);
|
||||
|
||||
export type GeminiBlockSafetyLevel = z.infer<typeof geminiBlockSafetyLevelSchema>;
|
||||
|
||||
const geminiSafetySettingSchema = z.object({
|
||||
category: geminiHarmCategorySchema,
|
||||
threshold: geminiBlockSafetyLevelSchema,
|
||||
});
|
||||
|
||||
const geminiGenerationConfigSchema = z.object({
  stopSequences: z.array(z.string()).optional(),
  candidateCount: z.number().int().optional(),
  maxOutputTokens: z.number().int().optional(),
  temperature: z.number().optional(),
  topP: z.number().optional(),
  topK: z.number().int().optional(),
});

const geminiContentSchema = z.object({
  // Must be either 'user' or 'model'. Optional but must be set if there are multiple "Content" objects in the parent array.
  role: z.enum(['user', 'model']).optional(),
  // Ordered Parts that constitute a single message. Parts may have different MIME types.
  parts: z.array(geminiContentPartSchema),
});

export type GeminiContentSchema = z.infer<typeof geminiContentSchema>;

export const geminiGenerateContentRequest = z.object({
  contents: z.array(geminiContentSchema),
  tools: z.array(geminiToolSchema).optional(),
  safetySettings: z.array(geminiSafetySettingSchema).optional(),
  generationConfig: geminiGenerationConfigSchema.optional(),
});

export type GeminiGenerateContentRequest = z.infer<typeof geminiGenerateContentRequest>;


// Response

const geminiHarmProbabilitySchema = z.enum([
  'HARM_PROBABILITY_UNSPECIFIED',
  'NEGLIGIBLE',
  'LOW',
  'MEDIUM',
  'HIGH',
]);

const geminiSafetyRatingSchema = z.object({
  'category': geminiHarmCategorySchema,
  'probability': geminiHarmProbabilitySchema,
  'blocked': z.boolean().optional(),
});

const geminiFinishReasonSchema = z.enum([
  'FINISH_REASON_UNSPECIFIED',
  'STOP',
  'MAX_TOKENS',
  'SAFETY',
  'RECITATION',
  'OTHER',
]);

export const geminiGeneratedContentResponseSchema = z.object({
  // either all requested candidates are returned, or no candidates at all
  // no candidates are returned only if there was something wrong with the prompt (see promptFeedback)
  candidates: z.array(z.object({
    index: z.number(),
    content: geminiContentSchema,
    finishReason: geminiFinishReasonSchema.optional(),
    safetyRatings: z.array(geminiSafetyRatingSchema),
    citationMetadata: z.object({
      startIndex: z.number().optional(),
      endIndex: z.number().optional(),
      uri: z.string().optional(),
      license: z.string().optional(),
    }).optional(),
    tokenCount: z.number().optional(),
    // groundingAttributions: z.array(GroundingAttribution).optional(), // This field is populated for GenerateAnswer calls.
  })).optional(),
  // NOTE: promptFeedback is only sent in the first chunk of a streaming response
  promptFeedback: z.object({
    blockReason: z.enum(['BLOCK_REASON_UNSPECIFIED', 'SAFETY', 'OTHER']).optional(),
    safetyRatings: z.array(geminiSafetyRatingSchema).optional(),
  }).optional(),
});
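As a sketch of the wire shape these schemas validate, here is a minimal hypothetical streaming chunk and the text extraction a stream parser performs on it; the literal values are illustrative, only the field names come from the schema above:

```typescript
// Hypothetical chunk, shaped after geminiGeneratedContentResponseSchema above
const chunk = {
  candidates: [{
    index: 0,
    content: { role: 'model', parts: [{ text: 'Hello' }] },
    safetyRatings: [],
  }],
};

// minimal extraction of the single text part, mirroring what the stream parser does with it
const text = chunk.candidates?.[0]?.content.parts[0]?.text ?? '';
```

A real chunk would also carry `finishReason` and per-category `safetyRatings`; both are optional per the schema.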
+239 -158
@@ -4,12 +4,30 @@ import { createParser as createEventsourceParser, EventSourceParseCallback, Even

import { createEmptyReadableStream, debugGenerateCurlCommand, safeErrorString, SERVER_DEBUG_WIRE, serverFetchOrThrow } from '~/server/wire';

import type { AnthropicWire } from '../anthropic/anthropic.wiretypes';
import type { OpenAIWire } from './openai.wiretypes';
import { anthropicAccess, anthropicAccessSchema, anthropicChatCompletionPayload } from '../anthropic/anthropic.router';
import { ollamaAccess, ollamaAccessSchema, ollamaChatCompletionPayload } from '../ollama/ollama.router';
import { openAIAccess, openAIAccessSchema, openAIChatCompletionPayload, openAIHistorySchema, openAIModelSchema } from './openai.router';
import { wireOllamaGenerationSchema } from '../ollama/ollama.wiretypes';

// Anthropic server imports
import type { AnthropicWire } from './anthropic/anthropic.wiretypes';
import { anthropicAccess, anthropicAccessSchema, anthropicChatCompletionPayload } from './anthropic/anthropic.router';

// Gemini server imports
import { geminiAccess, geminiAccessSchema, geminiGenerateContentTextPayload } from './gemini/gemini.router';
import { geminiGeneratedContentResponseSchema, geminiModelsStreamGenerateContentPath } from './gemini/gemini.wiretypes';

// Ollama server imports
import { wireOllamaChunkedOutputSchema } from './ollama/ollama.wiretypes';
import { OLLAMA_PATH_CHAT, ollamaAccess, ollamaAccessSchema, ollamaChatCompletionPayload } from './ollama/ollama.router';

// OpenAI server imports
import type { OpenAIWire } from './openai/openai.wiretypes';
import { openAIAccess, openAIAccessSchema, openAIChatCompletionPayload, openAIHistorySchema, openAIModelSchema } from './openai/openai.router';


/**
 * Event stream formats
 *  - 'sse' is the default format, and is used by all vendors except Ollama
 *  - 'json-nl' is used by Ollama
 */
type MuxingFormat = 'sse' | 'json-nl';


/**
@@ -20,77 +38,87 @@ import { wireOllamaGenerationSchema } from '../ollama/ollama.wiretypes';
 * The peculiarity of our parser is the injection of a JSON structure at the beginning of the stream, to
 * communicate parameters before the text starts flowing to the client.
 */
export type AIStreamParser = (data: string) => { text: string, close: boolean };

type EventStreamFormat = 'sse' | 'json-nl';
type AIStreamParser = (data: string) => { text: string, close: boolean };


const chatStreamInputSchema = z.object({
  access: z.union([anthropicAccessSchema, ollamaAccessSchema, openAIAccessSchema]),
  model: openAIModelSchema, history: openAIHistorySchema,
const chatStreamingInputSchema = z.object({
  access: z.union([anthropicAccessSchema, geminiAccessSchema, ollamaAccessSchema, openAIAccessSchema]),
  model: openAIModelSchema,
  history: openAIHistorySchema,
});
export type ChatStreamInputSchema = z.infer<typeof chatStreamInputSchema>;
export type ChatStreamingInputSchema = z.infer<typeof chatStreamingInputSchema>;

const chatStreamFirstPacketSchema = z.object({
const chatStreamingFirstOutputPacketSchema = z.object({
  model: z.string(),
});
export type ChatStreamFirstPacketSchema = z.infer<typeof chatStreamFirstPacketSchema>;
export type ChatStreamingFirstOutputPacketSchema = z.infer<typeof chatStreamingFirstOutputPacketSchema>;


export async function openaiStreamingRelayHandler(req: NextRequest): Promise<Response> {
export async function llmStreamingRelayHandler(req: NextRequest): Promise<Response> {

  // inputs - reuse the tRPC schema
  const { access, model, history } = chatStreamInputSchema.parse(await req.json());
  const body = await req.json();
  const { access, model, history } = chatStreamingInputSchema.parse(body);

  // begin event streaming from the OpenAI API
  let headersUrl: { headers: HeadersInit, url: string } = { headers: {}, url: '' };
  // access/dialect dependent setup:
  //  - requestAccess: the headers and URL to use for the upstream API call
  //  - muxingFormat: the format of the event stream (sse or json-nl)
  //  - vendorStreamParser: the parser to use for the event stream
  let upstreamResponse: Response;
  let requestAccess: { headers: HeadersInit, url: string } = { headers: {}, url: '' };
  let muxingFormat: MuxingFormat = 'sse';
  let vendorStreamParser: AIStreamParser;
  let eventStreamFormat: EventStreamFormat = 'sse';
  try {

    // prepare the API request data
    let body: object;
    switch (access.dialect) {
      case 'anthropic':
        headersUrl = anthropicAccess(access, '/v1/complete');
        requestAccess = anthropicAccess(access, '/v1/complete');
        body = anthropicChatCompletionPayload(model, history, true);
        vendorStreamParser = createAnthropicStreamParser();
        vendorStreamParser = createStreamParserAnthropic();
        break;

      case 'gemini':
        requestAccess = geminiAccess(access, model.id, geminiModelsStreamGenerateContentPath);
        body = geminiGenerateContentTextPayload(model, history, access.minSafetyLevel, 1);
        vendorStreamParser = createStreamParserGemini(model.id.replace('models/', ''));
        break;

      case 'ollama':
        headersUrl = ollamaAccess(access, '/api/generate');
        requestAccess = ollamaAccess(access, OLLAMA_PATH_CHAT);
        body = ollamaChatCompletionPayload(model, history, true);
        eventStreamFormat = 'json-nl';
        vendorStreamParser = createOllamaStreamParser();
        muxingFormat = 'json-nl';
        vendorStreamParser = createStreamParserOllama();
        break;

      case 'azure':
      case 'localai':
      case 'mistral':
      case 'oobabooga':
      case 'openai':
      case 'openrouter':
        headersUrl = openAIAccess(access, model.id, '/v1/chat/completions');
        requestAccess = openAIAccess(access, model.id, '/v1/chat/completions');
        body = openAIChatCompletionPayload(model, history, null, null, 1, true);
        vendorStreamParser = createOpenAIStreamParser();
        vendorStreamParser = createStreamParserOpenAI();
        break;
    }

    if (SERVER_DEBUG_WIRE)
      console.log('-> streaming:', debugGenerateCurlCommand('POST', headersUrl.url, headersUrl.headers, body));
      console.log('-> streaming:', debugGenerateCurlCommand('POST', requestAccess.url, requestAccess.headers, body));

    // POST to our API route
    upstreamResponse = await serverFetchOrThrow(headersUrl.url, 'POST', headersUrl.headers, body);
    upstreamResponse = await serverFetchOrThrow(requestAccess.url, 'POST', requestAccess.headers, body);

  } catch (error: any) {
    const fetchOrVendorError = safeErrorString(error) + (error?.cause ? ' · ' + error.cause : '');

    // server-side admins message
    console.error(`/api/llms/stream: fetch issue:`, access.dialect, fetchOrVendorError, headersUrl?.url);
    console.error(`/api/llms/stream: fetch issue:`, access.dialect, fetchOrVendorError, requestAccess?.url);

    // client-side users visible message
    return new NextResponse(`[Issue] ${access.dialect}: ${fetchOrVendorError}`
      + (process.env.NODE_ENV === 'development' ? ` · [URL: ${headersUrl?.url}]` : ''), { status: 500 });
      + (process.env.NODE_ENV === 'development' ? ` · [URL: ${requestAccess?.url}]` : ''), { status: 500 });
  }

  /* The following code is heavily inspired by the Vercel AI SDK, but simplified to our needs and in full control.
@@ -102,8 +130,12 @@ export async function openaiStreamingRelayHandler(req: NextRequest): Promise<Res
   * NOTE: we have not benchmarked to see if there is performance impact by using this approach - we do want to have
   * a 'healthy' level of inventory (i.e., pre-buffering) on the pipe to the client.
   */
  const chatResponseStream = (upstreamResponse.body || createEmptyReadableStream())
    .pipeThrough(createEventStreamTransformer(vendorStreamParser, eventStreamFormat, access.dialect));
  const transformUpstreamToBigAgiClient = createEventStreamTransformer(
    muxingFormat, vendorStreamParser, access.dialect,
  );
  const chatResponseStream =
    (upstreamResponse.body || createEmptyReadableStream())
      .pipeThrough(transformUpstreamToBigAgiClient);

  return new NextResponse(chatResponseStream, {
    status: 200,
@@ -114,105 +146,44 @@ export async function openaiStreamingRelayHandler(req: NextRequest): Promise<Res
}


/// Event Parsers

function createAnthropicStreamParser(): AIStreamParser {
  let hasBegun = false;

  return (data: string) => {

    const json: AnthropicWire.Complete.Response = JSON.parse(data);
    let text = json.completion;

    // hack: prepend the model name to the first packet
    if (!hasBegun) {
      hasBegun = true;
      const firstPacket: ChatStreamFirstPacketSchema = { model: json.model };
      text = JSON.stringify(firstPacket) + text;
    }

    return { text, close: false };
  };
}

function createOllamaStreamParser(): AIStreamParser {
  let hasBegun = false;

  return (data: string) => {

    let wireGeneration: any;
    try {
      wireGeneration = JSON.parse(data);
    } catch (error: any) {
      // log the malformed data to the console, and rethrow to transmit as 'error'
      console.log(`/api/llms/stream: Ollama parsing issue: ${error?.message || error}`, data);
      throw error;
    }
    const generation = wireOllamaGenerationSchema.parse(wireGeneration);
    let text = generation.response;

    // hack: prepend the model name to the first packet
    if (!hasBegun) {
      hasBegun = true;
      const firstPacket: ChatStreamFirstPacketSchema = { model: generation.model };
      text = JSON.stringify(firstPacket) + text;
    }

    return { text, close: generation.done };
  };
}

function createOpenAIStreamParser(): AIStreamParser {
  let hasBegun = false;
  let hasWarned = false;

  return (data: string) => {

    const json: OpenAIWire.ChatCompletion.ResponseStreamingChunk = JSON.parse(data);

    // [OpenAI] an upstream error will be handled gracefully and transmitted as text (throw to transmit as 'error')
    if (json.error)
      return { text: `[OpenAI Issue] ${safeErrorString(json.error)}`, close: true };

    // [OpenAI] if there's a warning, log it once
    if (json.warning && !hasWarned) {
      hasWarned = true;
      console.log('/api/llms/stream: OpenAI upstream warning:', json.warning);
    }

    if (json.choices.length !== 1) {
      // [Azure] we seem to receive 'prompt_annotations' or 'prompt_filter_results' objects - which we will ignore to suppress the error
      if (json.id === '' && json.object === '' && json.model === '')
        return { text: '', close: false };
      throw new Error(`Expected 1 completion, got ${json.choices.length}`);
    }

    const index = json.choices[0].index;
    if (index !== 0 && index !== undefined /* LocalAI hack/workaround until https://github.com/go-skynet/LocalAI/issues/788 */)
      throw new Error(`Expected completion index 0, got ${index}`);
    let text = json.choices[0].delta?.content /*|| json.choices[0]?.text*/ || '';

    // hack: prepend the model name to the first packet
    if (!hasBegun) {
      hasBegun = true;
      const firstPacket: ChatStreamFirstPacketSchema = { model: json.model };
      text = JSON.stringify(firstPacket) + text;
    }

    // [LocalAI] workaround: LocalAI doesn't send the [DONE] event, but similarly to OpenAI, it sends a "finish_reason" delta update
    const close = !!json.choices[0].finish_reason;
    return { text, close };
  };
}


// Event Stream Transformers

/**
 * Creates a parser for a 'JSON\n' non-event stream, to be swapped with an EventSource parser.
 * Ollama is the only vendor that uses this format.
 */
function createDemuxerJsonNewline(onParse: EventSourceParseCallback): EventSourceParser {
  let accumulator: string = '';
  return {
    // feeds a new chunk to the parser - we accumulate in case of partial data, and only execute on full lines
    feed: (chunk: string): void => {
      accumulator += chunk;
      if (accumulator.endsWith('\n')) {
        for (const jsonString of accumulator.split('\n').filter(line => !!line)) {
          const mimicEvent: ParsedEvent = {
            type: 'event',
            id: undefined,
            event: undefined,
            data: jsonString,
          };
          onParse(mimicEvent);
        }
        accumulator = '';
      }
    },

    // resets the parser state - not useful with our driving of the parser
    reset: (): void => {
      console.error('createDemuxerJsonNewline.reset() not implemented');
    },
  };
}

/**
 * Creates a TransformStream that parses events from an EventSource stream using a custom parser.
 * @returns {TransformStream<Uint8Array, string>} TransformStream parsing events.
 */
function createEventStreamTransformer(vendorTextParser: AIStreamParser, inputFormat: EventStreamFormat, dialectLabel: string): TransformStream<Uint8Array, Uint8Array> {
function createEventStreamTransformer(muxingFormat: MuxingFormat, vendorTextParser: AIStreamParser, dialectLabel: string): TransformStream<Uint8Array, Uint8Array> {
  const textDecoder = new TextDecoder();
  const textEncoder = new TextEncoder();
  let eventSourceParser: EventSourceParser;
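The demuxer's accumulate-until-newline behavior can be exercised standalone. This sketch (the `makeJsonNlFeeder` helper is hypothetical, not part of the repo) mirrors the `feed` logic above: partial lines are buffered, and only complete newline-terminated lines are emitted as events:

```typescript
// Minimal re-implementation of the accumulate/split logic used by createDemuxerJsonNewline
function makeJsonNlFeeder(onData: (jsonString: string) => void) {
  let accumulator = '';
  return (chunk: string): void => {
    accumulator += chunk;
    if (accumulator.endsWith('\n')) {
      // emit one event per non-empty line, then reset the buffer
      for (const line of accumulator.split('\n').filter(l => !!l))
        onData(line);
      accumulator = '';
    }
  };
}

const seen: string[] = [];
const feed = makeJsonNlFeeder(d => seen.push(d));
feed('{"response":"He');            // partial line: buffered, nothing emitted
feed('llo"}\n{"done":true}\n');     // completes two lines: both emitted
// seen is now ['{"response":"Hello"}', '{"done":true}']
```

Note the same caveat applies as in the original: a chunk that ends mid-line is held until a later chunk supplies the trailing newline.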
@@ -248,16 +219,17 @@ function createEventStreamTransformer(vendorTextParser: AIStreamParser, inputFor
        if (close)
          controller.terminate();
      } catch (error: any) {
        // console.log(`/api/llms/stream: parse issue: ${error?.message || error}`);
        controller.enqueue(textEncoder.encode(`[Stream Issue] ${dialectLabel}: ${safeErrorString(error) || 'Unknown stream parsing error'}`));
        if (SERVER_DEBUG_WIRE)
          console.log(' - E: parse issue:', event.data, error?.message || error);
        controller.enqueue(textEncoder.encode(` **[Stream Issue] ${dialectLabel}: ${safeErrorString(error) || 'Unknown stream parsing error'}**`));
        controller.terminate();
      }
    };

    if (inputFormat === 'sse')
    if (muxingFormat === 'sse')
      eventSourceParser = createEventsourceParser(onNewEvent);
    else if (inputFormat === 'json-nl')
      eventSourceParser = createJsonNewlineParser(onNewEvent);
    else if (muxingFormat === 'json-nl')
      eventSourceParser = createDemuxerJsonNewline(onNewEvent);
  },

  // stream=true is set because the data is not guaranteed to be final and un-chunked
@@ -267,33 +239,142 @@ function createEventStreamTransformer(vendorTextParser: AIStreamParser, inputFor
  });
}

/**
 * Creates a parser for a 'JSON\n' non-event stream, to be swapped with an EventSource parser.
 * Ollama is the only vendor that uses this format.
 */
function createJsonNewlineParser(onParse: EventSourceParseCallback): EventSourceParser {
  let accumulator: string = '';
  return {
    // feeds a new chunk to the parser - we accumulate in case of partial data, and only execute on full lines
    feed: (chunk: string): void => {
      accumulator += chunk;
      if (accumulator.endsWith('\n')) {
        for (const jsonString of accumulator.split('\n').filter(line => !!line)) {
          const mimicEvent: ParsedEvent = {
            type: 'event',
            id: undefined,
            event: undefined,
            data: jsonString,
          };
          onParse(mimicEvent);
        }
        accumulator = '';
      }
    },

    // resets the parser state - not useful with our driving of the parser
    reset: (): void => {
      console.error('createJsonNewlineParser.reset() not implemented');
    },
/// Stream Parsers

function createStreamParserAnthropic(): AIStreamParser {
  let hasBegun = false;

  return (data: string) => {

    const json: AnthropicWire.Complete.Response = JSON.parse(data);
    let text = json.completion;

    // hack: prepend the model name to the first packet
    if (!hasBegun) {
      hasBegun = true;
      const firstPacket: ChatStreamingFirstOutputPacketSchema = { model: json.model };
      text = JSON.stringify(firstPacket) + text;
    }

    return { text, close: false };
  };
}

function createStreamParserGemini(modelName: string): AIStreamParser {
  let hasBegun = false;

  // this can throw; it's caught upstream
  return (data: string) => {

    // parse the JSON chunk
    const wireGenerationChunk = JSON.parse(data);
    const generationChunk = geminiGeneratedContentResponseSchema.parse(wireGenerationChunk);

    // Prompt Safety Errors: pass through errors from Gemini
    if (generationChunk.promptFeedback?.blockReason) {
      const { blockReason, safetyRatings } = generationChunk.promptFeedback;
      return { text: `[Gemini Prompt Blocked] ${blockReason}: ${JSON.stringify(safetyRatings || 'Unknown Safety Ratings', null, 2)}`, close: true };
    }

    // expect a single completion
    const singleCandidate = generationChunk.candidates?.[0] ?? null;
    if (!singleCandidate || !singleCandidate.content?.parts.length)
      throw new Error(`Gemini: expected 1 completion, got ${generationChunk.candidates?.length}`);

    // expect a single part
    if (singleCandidate.content.parts.length !== 1 || !('text' in singleCandidate.content.parts[0]))
      throw new Error(`Gemini: expected 1 text part, got ${singleCandidate.content.parts.length}`);

    // expect a single text in the part
    let text = singleCandidate.content.parts[0].text || '';

    // hack: prepend the model name to the first packet
    if (!hasBegun) {
      hasBegun = true;
      const firstPacket: ChatStreamingFirstOutputPacketSchema = { model: modelName };
      text = JSON.stringify(firstPacket) + text;
    }

    return { text, close: false };
  };
}

function createStreamParserOllama(): AIStreamParser {
  let hasBegun = false;

  return (data: string) => {

    // parse the JSON chunk
    let wireJsonChunk: any;
    try {
      wireJsonChunk = JSON.parse(data);
    } catch (error: any) {
      // log the malformed data to the console, and rethrow to transmit as 'error'
      console.log(`/api/llms/stream: Ollama parsing issue: ${error?.message || error}`, data);
      throw error;
    }

    // validate chunk
    const chunk = wireOllamaChunkedOutputSchema.parse(wireJsonChunk);

    // pass through errors from Ollama
    if ('error' in chunk)
      throw new Error(chunk.error);

    // process output
    let text = chunk.message?.content || /*chunk.response ||*/ '';

    // hack: prepend the model name to the first packet
    if (!hasBegun && chunk.model) {
      hasBegun = true;
      const firstPacket: ChatStreamingFirstOutputPacketSchema = { model: chunk.model };
      text = JSON.stringify(firstPacket) + text;
    }

    return { text, close: chunk.done };
  };
}

function createStreamParserOpenAI(): AIStreamParser {
  let hasBegun = false;
  let hasWarned = false;

  return (data: string) => {

    const json: OpenAIWire.ChatCompletion.ResponseStreamingChunk = JSON.parse(data);

    // [OpenAI] an upstream error will be handled gracefully and transmitted as text (throw to transmit as 'error')
    if (json.error)
      return { text: `[OpenAI Issue] ${safeErrorString(json.error)}`, close: true };

    // [OpenAI] if there's a warning, log it once
    if (json.warning && !hasWarned) {
      hasWarned = true;
      console.log('/api/llms/stream: OpenAI upstream warning:', json.warning);
    }

    if (json.choices.length !== 1) {
      // [Azure] we seem to receive 'prompt_annotations' or 'prompt_filter_results' objects - which we will ignore to suppress the error
      if (json.id === '' && json.object === '' && json.model === '')
        return { text: '', close: false };
      throw new Error(`Expected 1 completion, got ${json.choices.length}`);
    }

    const index = json.choices[0].index;
    if (index !== 0 && index !== undefined /* LocalAI hack/workaround until https://github.com/go-skynet/LocalAI/issues/788 */)
      throw new Error(`Expected completion index 0, got ${index}`);
    let text = json.choices[0].delta?.content /*|| json.choices[0]?.text*/ || '';

    // hack: prepend the model name to the first packet
    if (!hasBegun) {
      hasBegun = true;
      const firstPacket: ChatStreamingFirstOutputPacketSchema = { model: json.model };
      text = JSON.stringify(firstPacket) + text;
    }

    // [LocalAI] workaround: LocalAI doesn't send the [DONE] event, but similarly to OpenAI, it sends a "finish_reason" delta update
    const close = !!json.choices[0].finish_reason;
    return { text, close };
  };
}
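Every stream parser above prepends a one-off `{"model":"..."}` JSON packet to the first text chunk, so a client consuming the relayed stream has to peel that packet off before rendering. A minimal sketch of that client-side step (the `splitFirstPacket` helper is hypothetical, not part of the repo, and assumes the model id contains no `}` character):

```typescript
// Sketch: split the leading {"model":"..."} first packet from the first streamed chunk.
// Assumption: the model id contains no '}' so the first '}' closes the packet.
function splitFirstPacket(firstChunk: string): { model: string | null, text: string } {
  if (firstChunk.startsWith('{')) {
    const end = firstChunk.indexOf('}');
    if (end >= 0) {
      try {
        const packet = JSON.parse(firstChunk.slice(0, end + 1));
        if (typeof packet.model === 'string')
          return { model: packet.model, text: firstChunk.slice(end + 1) };
      } catch { /* not a first packet: fall through and treat it all as text */ }
    }
  }
  return { model: null, text: firstChunk };
}

const { model, text } = splitFirstPacket('{"model":"gpt-4"}Hello there');
// model === 'gpt-4', text === 'Hello there'
```

Later chunks carry plain text only, so this peeling is needed exactly once per stream.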
+11 -1
@@ -1,11 +1,18 @@
import { z } from 'zod';
import { LLM_IF_OAI_Chat, LLM_IF_OAI_Complete, LLM_IF_OAI_Fn, LLM_IF_OAI_Vision } from '../../store-llms';
import { LLM_IF_OAI_Chat, LLM_IF_OAI_Complete, LLM_IF_OAI_Fn, LLM_IF_OAI_Vision } from '../store-llms';


// Model Description: a superset of LLM model descriptors

const pricingSchema = z.object({
  cpmPrompt: z.number().optional(), // Cost per thousand prompt tokens
  cpmCompletion: z.number().optional(), // Cost per thousand completion tokens
});

// const rateLimitsSchema = z.object({
//   reqPerMinute: z.number().optional(),
// });

const modelDescriptionSchema = z.object({
  id: z.string(),
  label: z.string(),
@@ -15,9 +22,12 @@ const modelDescriptionSchema = z.object({
  contextWindow: z.number(),
  maxCompletionTokens: z.number().optional(),
  pricing: pricingSchema.optional(),
  // rateLimits: rateLimitsSchema.optional(),
  interfaces: z.array(z.enum([LLM_IF_OAI_Chat, LLM_IF_OAI_Fn, LLM_IF_OAI_Complete, LLM_IF_OAI_Vision])),
  hidden: z.boolean().optional(),
});

// this is also used by the Client
export type ModelDescriptionSchema = z.infer<typeof modelDescriptionSchema>;

export const listModelsOutputSchema = z.object({
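The `pricing` fields are costs per thousand tokens ("cpm" as in cost per mille). As an illustrative sketch of how a caller could use them (the `estimateCost` helper and the price values below are hypothetical, not from the repo):

```typescript
// Sketch: estimate a request's cost from the optional cpm pricing fields
// (cpmPrompt/cpmCompletion are costs per 1000 tokens; either may be undefined)
interface Pricing { cpmPrompt?: number; cpmCompletion?: number; }

function estimateCost(pricing: Pricing, promptTokens: number, completionTokens: number): number {
  return (pricing.cpmPrompt ?? 0) * promptTokens / 1000
    + (pricing.cpmCompletion ?? 0) * completionTokens / 1000;
}

const cost = estimateCost({ cpmPrompt: 0.03, cpmCompletion: 0.06 }, 1000, 500);
// cost ≈ 0.06 (0.03 for the prompt + 0.03 for the completion)
```

Missing prices default to zero here, matching the fields being optional in the schema.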
+56 -48
@@ -3,54 +3,62 @@
 * descriptions for the models.
 * (nor does it reliably provide context window sizes) - TODO: open a bug upstream
 *
 * from: https://ollama.ai/library?sort=popular
 * from: https://ollama.ai/library?sort=featured
 */
export const OLLAMA_BASE_MODELS: { [key: string]: { description: string, pulls: number, added?: string } } = {
  'mistral': { description: 'The Mistral 7B model released by Mistral AI', pulls: 56100 },
  'llama2': { description: 'The most popular model for general use.', pulls: 117400 },
  'codellama': { description: 'A large language model that can use text prompts to generate and discuss code.', pulls: 61500 },
  'llama2-uncensored': { description: 'Uncensored Llama 2 model by George Sung and Jarrad Hope.', pulls: 26800 },
  'orca-mini': { description: 'A general-purpose model ranging from 3 billion parameters to 70 billion, suitable for entry-level hardware.', pulls: 23000 },
  'vicuna': { description: 'General use chat model based on Llama and Llama 2 with 2K to 16K context sizes.', pulls: 20600 },
  'wizard-vicuna-uncensored': { description: 'Wizard Vicuna Uncensored is a 7B, 13B, and 30B parameter model based on Llama 2 uncensored by Eric Hartford.', pulls: 12100 },
  'phind-codellama': { description: 'Code generation model based on CodeLlama.', pulls: 9760 },
  'wizardcoder': { description: 'Llama based code generation model focused on Python.', pulls: 9002 },
  'mistral-openorca': { description: 'Mistral OpenOrca is a 7 billion parameter model, fine-tuned on top of the Mistral 7B model using the OpenOrca dataset.', pulls: 8671 },
  'nous-hermes': { description: 'General use models based on Llama and Llama 2 from Nous Research.', pulls: 8478 },
  'zephyr': { description: 'Zephyr beta is a fine-tuned 7B version of mistral that was trained on a mix of publicly available, synthetic datasets.', pulls: 8142 },
  'wizard-math': { description: 'Model focused on math and logic problems', pulls: 7426 },
  'llama2-chinese': { description: 'Llama 2 based model fine tuned to improve Chinese dialogue ability.', pulls: 7035 },
  'stable-beluga': { description: 'Llama 2 based model fine tuned on an Orca-style dataset. Originally called Free Willy.', pulls: 6140 },
  'falcon': { description: 'A large language model built by the Technology Innovation Institute (TII) for use in summarization, text generation, and chat bots.', pulls: 5865 },
  'codeup': { description: 'Great code generation model based on Llama2.', pulls: 5534 },
  'everythinglm': { description: 'Uncensored Llama2 based model with 16k context size.', pulls: 4696 },
  'medllama2': { description: 'Fine-tuned Llama 2 model to answer medical questions based on an open source medical dataset.', pulls: 4275 },
  'wizardlm-uncensored': { description: 'Uncensored version of Wizard LM model.', pulls: 4227 },
  'deepseek-coder': { description: 'DeepSeek Coder is trained from scratch on both 87% code and 13% natural language in English and Chinese. Each of the models are pre-trained on 2 trillion tokens.', pulls: 3663, added: '20231129' },
  'wizard-vicuna': { description: 'Wizard Vicuna is a 13B parameter model based on Llama 2 trained by MelodysDreamj.', pulls: 3343 },
  'orca2': { description: 'Orca 2 is built by Microsoft research, and is a fine-tuned version of Meta\'s Llama 2 models. The model is designed to excel particularly in reasoning.', pulls: 3134, added: '20231129' },
  'open-orca-platypus2': { description: 'Merge of the Open Orca OpenChat model and the Garage-bAInd Platypus 2 model. Designed for chat and code generation.', pulls: 3050 },
  'starcoder': { description: 'StarCoder is a code generation model trained on 80+ programming languages.', pulls: 2981 },
  'dolphin2.2-mistral': { description: 'An instruct-tuned model based on Mistral. Version 2.2 is fine-tuned for improved conversation and empathy.', pulls: 2636 },
  'yarn-mistral': { description: 'An extension of Mistral to support a context of up to 128k tokens.', pulls: 2328 },
  'openchat': { description: 'A family of open-source models trained on a wide variety of data, surpassing ChatGPT on various benchmarks.', pulls: 2281, added: '20231129' },
  'openhermes2.5-mistral': { description: 'OpenHermes 2.5 Mistral 7B is a Mistral 7B fine-tune, a continuation of OpenHermes 2 model, which trained on additional code datasets.', pulls: 2101 },
  'yi': { description: 'A high-performing, bilingual base model.', pulls: 1806 },
  'samantha-mistral': { description: 'A companion assistant trained in philosophy, psychology, and personal relationships. Based on Mistral.', pulls: 1803 },
  'yarn-llama2': { description: 'An extension of Llama 2 that supports a context of up to 128k tokens.', pulls: 1605 },
  'sqlcoder': { description: 'SQLCoder is a code completion model fine-tuned on StarCoder for SQL generation tasks.', pulls: 1584 },
  'openhermes2-mistral': { description: 'OpenHermes 2 Mistral is a 7B model fine-tuned on Mistral with 900,000 entries of primarily GPT-4 generated data from open datasets.', pulls: 1560 },
  'neural-chat': { description: 'A fine-tuned model based on Mistral with good coverage of domain and language.', pulls: 1338, added: '20231129' },
  'wizardlm': { description: 'General use 70 billion parameter model based on Llama 2.', pulls: 1253 },
  'dolphin2.1-mistral': { description: 'An instruct-tuned model based on Mistral and trained on a dataset filtered to remove alignment and bias.', pulls: 1163 },
  'mistrallite': { description: 'MistralLite is a fine-tuned model based on Mistral with enhanced capabilities of processing long contexts.', pulls: 1099 },
  'codebooga': { description: 'A high-performing code instruct model created by merging two existing code models.', pulls: 1042 },
  'goliath': { description: 'A language model created by combining two fine-tuned Llama 2 70B models into one.', pulls: 728, added: '20231129' },
  'xwinlm': { description: 'Conversational model based on Llama 2 that performs competitively on various benchmarks.', pulls: 593 },
  'nexusraven': { description: 'Nexus Raven is a 13B instruction tuned model for function calling tasks.', pulls: 585 },
  'alfred': { description: 'A robust conversational model designed to be used for both chat and instruct use cases.', pulls: 573, added: '20231129' },
  'starling-lm': { description: 'Starling is a large language model trained by reinforcement learning from AI feedback focused on improving chatbot helpfulness.', pulls: 446, added: '20231129' },
  'meditron': { description: 'Open-source medical large language model adapted from Llama 2 to the medical domain.', pulls: 100, added: '20231129' },
  'deepseek-llm': { description: 'An advanced language model crafted with 2 trillion bilingual tokens.', pulls: 11, added: '20231129' },
  'llama2': { description: 'The most popular model for general use.', pulls: 165600 },
  'mistral': { description: 'The 7B model released by Mistral AI, updated to version 0.2', pulls: 92200 },
  'llava': { description: '🌋 A novel end-to-end trained large multimodal model that combines a vision encoder and Vicuna for general-purpose visual and language understanding.', pulls: 3563, added: '20231215' },
  'mixtral': { description: 'A high-quality Mixture of Experts (MoE) model with open weights by Mistral AI.', pulls: 8277, added: '20231215' },
  'starling-lm': { description: 'Starling is a large language model trained by reinforcement learning from AI feedback focused on improving chatbot helpfulness.', pulls: 3657, added: '20231129' },
  'neural-chat': { description: 'A fine-tuned model based on Mistral with good coverage of domain and language.', pulls: 4647, added: '20231129' },
  'codellama': { description: 'A large language model that can use text prompts to generate and discuss code.', pulls: 79800 },
  'dolphin-mixtral': { description: 'An uncensored, fine-tuned model based on the Mixtral mixture of experts model that excels at coding tasks. Created by Eric Hartford.', pulls: 48400, added: '20231215' },
  'llama2-uncensored': { description: 'Uncensored Llama 2 model by George Sung and Jarrad Hope.', pulls: 36600 },
  'orca-mini': { description: 'A general-purpose model ranging from 3 billion parameters to 70 billion, suitable for entry-level hardware.', pulls: 30000 },
  'vicuna': { description: 'General use chat model based on Llama and Llama 2 with 2K to 16K context sizes.', pulls: 22700 },
  'wizard-vicuna-uncensored': { description: 'Wizard Vicuna Uncensored is a 7B, 13B, and 30B parameter model based on Llama 2 uncensored by Eric Hartford.', pulls: 15300 },
  'zephyr': { description: 'Zephyr beta is a fine-tuned 7B version of mistral that was trained on a mix of publicly available, synthetic datasets.', pulls: 11500 },
  'phind-codellama': { description: 'Code generation model based on CodeLlama.', pulls: 11200 },
  'wizardcoder': { description: 'Llama based code generation model focused on Python.', pulls: 10700 },
  'deepseek-coder': { description: 'DeepSeek Coder is trained from scratch on both 87% code and 13% natural language in English and Chinese. Each of the models are pre-trained on 2 trillion tokens.', pulls: 10200 },
|
||||
'mistral-openorca': { description: 'Mistral OpenOrca is a 7 billion parameter model, fine-tuned on top of the Mistral 7B model using the OpenOrca dataset.', pulls: 9842 },
|
||||
'nous-hermes': { description: 'General use models based on Llama and Llama 2 from Nous Research.', pulls: 9071 },
|
||||
'wizard-math': { description: 'Model focused on math and logic problems', pulls: 8328 },
|
||||
'llama2-chinese': { description: 'Llama 2 based model fine tuned to improve Chinese dialogue ability.', pulls: 8111 },
|
||||
'orca2': { description: 'Orca 2 is built by Microsoft research, and are a fine-tuned version of Meta\'s Llama 2 models. The model is designed to excel particularly in reasoning.', pulls: 7492, added: '20231129' },
|
||||
'falcon': { description: 'A large language model built by the Technology Innovation Institute (TII) for use in summarization, text generation, and chat bots.', pulls: 7468 },
|
||||
'stable-beluga': { description: 'Llama 2 based model fine tuned on an Orca-style dataset. Originally called Free Willy.', pulls: 6468 },
|
||||
'codeup': { description: 'Great code generation model based on Llama2.', pulls: 6397 },
|
||||
'everythinglm': { description: 'Uncensored Llama2 based model with 16k context size.', pulls: 5347 },
|
||||
'medllama2': { description: 'Fine-tuned Llama 2 model to answer medical questions based on an open source medical dataset.', pulls: 5034 },
|
||||
'wizardlm-uncensored': { description: 'Uncensored version of Wizard LM model.', pulls: 4874 },
|
||||
'dolphin2.2-mistral': { description: 'An instruct-tuned model based on Mistral. Version 2.2 is fine-tuned for improved conversation and empathy.', pulls: 4686 },
|
||||
'openchat': { description: 'A family of open-source models trained on a wide variety of data, surpassing ChatGPT on various benchmarks. Updated to version 3.5-1210.', pulls: 4496, added: '20231129' },
|
||||
'starcoder': { description: 'StarCoder is a code generation model trained on 80+ programming languages.', pulls: 4331 },
|
||||
'openhermes2.5-mistral': { description: 'OpenHermes 2.5 Mistral 7B is a Mistral 7B fine-tune, a continuation of OpenHermes 2 model, which trained on additional code datasets.', pulls: 3722 },
|
||||
'wizard-vicuna': { description: 'Wizard Vicuna is a 13B parameter model based on Llama 2 trained by MelodysDreamj.', pulls: 3668 },
|
||||
'yi': { description: 'A high-performing, bilingual base model.', pulls: 3335 },
|
||||
'open-orca-platypus2': { description: 'Merge of the Open Orca OpenChat model and the Garage-bAInd Platypus 2 model. Designed for chat and code generation.', pulls: 3219 },
|
||||
'yarn-mistral': { description: 'An extension of Mistral to support a context of up to 128k tokens.', pulls: 3087 },
|
||||
'samantha-mistral': { description: 'A companion assistant trained in philosophy, psychology, and personal relationships. Based on Mistral.', pulls: 2518 },
|
||||
'sqlcoder': { description: 'SQLCoder is a code completion model fined-tuned on StarCoder for SQL generation tasks', pulls: 2338 },
|
||||
'meditron': { description: 'Open-source medical large language model adapted from Llama 2 to the medical domain.', pulls: 2216, added: '20231129' },
|
||||
'yarn-llama2': { description: 'An extension of Llama 2 that supports a context of up to 128k tokens.', pulls: 2201 },
|
||||
'stablelm-zephyr': { description: 'A lightweight chat model allowing accurate, and responsive output without requiring high-end hardware.', pulls: 1983, added: '20231210' },
|
||||
'openhermes2-mistral': { description: 'OpenHermes 2 Mistral is a 7B model fine-tuned on Mistral with 900,000 entries of primarily GPT-4 generated data from open datasets.', pulls: 1790 },
|
||||
'deepseek-llm': { description: 'An advanced language model crafted with 2 trillion bilingual tokens.', pulls: 1732, added: '20231129' },
|
||||
'dolphin2.1-mistral': { description: 'An instruct-tuned model based on Mistral and trained on a dataset filtered to remove alignment and bias.', pulls: 1598 },
|
||||
'mistrallite': { description: 'MistralLite is a fine-tuned model based on Mistral with enhanced capabilities of processing long contexts.', pulls: 1534 },
|
||||
'wizardlm': { description: 'General use 70 billion parameter model based on Llama 2.', pulls: 1454 },
|
||||
'codebooga': { description: 'A high-performing code instruct model created by merging two existing code models.', pulls: 1418 },
|
||||
'phi': { description: 'Phi-2: a 2.7B language model by Microsoft Research that demonstrates outstanding reasoning and language understanding capabilities.', pulls: 1304, added: '20231220' },
|
||||
'bakllava': { description: 'BakLLaVA is a multimodal model consisting of the Mistral 7B base model augmented with the LLaVA architecture.', pulls: 1189, added: '20231215' },
|
||||
'goliath': { description: 'A language model created by combining two fine-tuned Llama 2 70B models into one.', pulls: 1140, added: '20231129' },
|
||||
'nexusraven': { description: 'Nexus Raven is a 13B instruction tuned model for function calling tasks.', pulls: 1060 },
|
||||
'solar': { description: 'A compact, yet powerful 10.7B large language model designed for single-turn conversation.', pulls: 934 },
|
||||
'alfred': { description: 'A robust conversational model designed to be used for both chat and instruct use cases.', pulls: 902, added: '20231129' },
|
||||
'xwinlm': { description: 'Conversational model based on Llama 2 that performs competitively on various benchmarks.', pulls: 868 },
|
||||
};
|
||||
export const OLLAMA_LAST_UPDATE: string = '20231129';
|
||||
// export const OLLAMA_LAST_UPDATE: string = '20231220';
|
||||
export const OLLAMA_PREV_UPDATE: string = '20231210';
|
||||
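The `added` stamps and the two update constants above drive the "new model" badge downstream, where the router computes `isNew: !!model.added && model.added >= OLLAMA_PREV_UPDATE`. Because the stamps are 'YYYYMMDD' strings, plain lexicographic comparison is enough. A minimal sketch (the local `PREV_UPDATE` constant and `isNewModel` helper are illustrative stand-ins, not part of the codebase):

```typescript
// 'YYYYMMDD' date stamps compare correctly with plain string ordering,
// so no Date parsing is needed for the isNew cutoff.
const PREV_UPDATE = '20231210'; // mirrors OLLAMA_PREV_UPDATE above

function isNewModel(added?: string): boolean {
  // models without an 'added' stamp are never flagged as new
  return !!added && added >= PREV_UPDATE;
}

const flags = [
  isNewModel('20231215'), // e.g. llava, mixtral -> new
  isNewModel('20231129'), // e.g. goliath, alfred -> not new
  isNewModel(undefined),  // no stamp -> not new
];
```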
+47 -12
@@ -1,22 +1,26 @@
import { z } from 'zod';
import { TRPCError } from '@trpc/server';

import { createTRPCRouter, publicProcedure } from '~/server/api/trpc.server';
import { env } from '~/server/env.mjs';
import { fetchJsonOrTRPCError, fetchTextOrTRPCError } from '~/server/api/trpc.serverutils';

import { LLM_IF_OAI_Chat } from '../../../store-llms';
import { LLM_IF_OAI_Chat } from '../../store-llms';

import { capitalizeFirstLetter } from '~/common/util/textUtils';

import { fixupHost, openAIChatGenerateOutputSchema, OpenAIHistorySchema, openAIHistorySchema, OpenAIModelSchema, openAIModelSchema } from '../openai/openai.router';
import { listModelsOutputSchema, ModelDescriptionSchema } from '../server.schemas';
import { listModelsOutputSchema, ModelDescriptionSchema } from '../llm.server.types';

import { OLLAMA_BASE_MODELS, OLLAMA_LAST_UPDATE } from './ollama.models';
import { wireOllamaGenerationSchema } from './ollama.wiretypes';
import { OLLAMA_BASE_MODELS, OLLAMA_PREV_UPDATE } from './ollama.models';
import { WireOllamaChatCompletionInput, wireOllamaChunkedOutputSchema } from './ollama.wiretypes';


// Default hosts
const DEFAULT_OLLAMA_HOST = 'http://127.0.0.1:11434';
export const OLLAMA_PATH_CHAT = '/api/chat';
const OLLAMA_PATH_TAGS = '/api/tags';
const OLLAMA_PATH_SHOW = '/api/show';


// Mappers
@@ -34,7 +38,23 @@ export function ollamaAccess(access: OllamaAccessSchema, apiPath: string): { hea

}

export function ollamaChatCompletionPayload(model: OpenAIModelSchema, history: OpenAIHistorySchema, stream: boolean) {

export const ollamaChatCompletionPayload = (model: OpenAIModelSchema, history: OpenAIHistorySchema, stream: boolean): WireOllamaChatCompletionInput => ({
model: model.id,
messages: history,
options: {
...(model.temperature && { temperature: model.temperature }),
},
// n: ...
// functions: ...
// function_call: ...
stream,
});
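The arrow-function payload builder above maps an OpenAI-style (model, history) pair onto Ollama's `/api/chat` request body. A standalone sketch of the same shape, with simplified local types (`DemoModel` and `DemoHistory` are stand-ins for the zod-inferred schemas, not part of the codebase):

```typescript
// Local stand-ins for the zod-inferred OpenAIModelSchema / OpenAIHistorySchema types.
type DemoModel = { id: string; temperature?: number };
type DemoHistory = { role: 'assistant' | 'system' | 'user'; content: string }[];

// Mirrors the shape produced by ollamaChatCompletionPayload.
const demoPayload = (model: DemoModel, history: DemoHistory, stream: boolean) => ({
  model: model.id,
  messages: history,
  options: {
    // spread temperature into options only when it is set (and truthy)
    ...(model.temperature && { temperature: model.temperature }),
  },
  stream,
});

const p = demoPayload({ id: 'llama2', temperature: 0.7 }, [{ role: 'user', content: 'Hi' }], false);
```

Note the `model.temperature &&` guard: a temperature of 0 is falsy, so it is dropped from `options` just like `undefined` — a quirk worth keeping in mind if a zero temperature should ever be expressible.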
/* Unused: switched to the Chat endpoint (above). The implementation is left here for reference.
https://github.com/jmorganca/ollama/blob/main/docs/api.md#generate-a-completion
export function ollamaCompletionPayload(model: OpenAIModelSchema, history: OpenAIHistorySchema, stream: boolean) {

// if the first message is the system prompt, extract it
let systemPrompt: string | undefined = undefined;
@@ -62,7 +82,7 @@ export function ollamaChatCompletionPayload(model: OpenAIModelSchema, history: O
...(systemPrompt && { system: systemPrompt }),
stream,
};
}
}*/

async function ollamaGET<TOut extends object>(access: OllamaAccessSchema, apiPath: string /*, signal?: AbortSignal*/): Promise<TOut> {
const { headers, url } = ollamaAccess(access, apiPath);
@@ -104,6 +124,7 @@ const listPullableOutputSchema = z.object({
label: z.string(),
tag: z.string(),
description: z.string(),
pulls: z.number(),
isNew: z.boolean(),
})),
});
@@ -122,7 +143,8 @@ export const llmOllamaRouter = createTRPCRouter({
label: capitalizeFirstLetter(model_id),
tag: 'latest',
description: model.description,
isNew: !!model.added && model.added >= OLLAMA_LAST_UPDATE,
pulls: model.pulls,
isNew: !!model.added && model.added >= OLLAMA_PREV_UPDATE,
})),
};
}),
@@ -160,6 +182,7 @@ export const llmOllamaRouter = createTRPCRouter({
throw new Error('Ollama delete issue: ' + deleteOutput);
}),

/* Ollama: List the Models available */
listModels: publicProcedure
.input(accessOnlySchema)
@@ -167,7 +190,7 @@ export const llmOllamaRouter = createTRPCRouter({
.query(async ({ input }) => {

// get the models
const wireModels = await ollamaGET(input.access, '/api/tags');
const wireModels = await ollamaGET(input.access, OLLAMA_PATH_TAGS);
const wireOllamaListModelsSchema = z.object({
models: z.array(z.object({
name: z.string(),
@@ -180,7 +203,7 @@ export const llmOllamaRouter = createTRPCRouter({

// retrieve info for each of the models (/api/show, post call, in parallel)
const detailedModels = await Promise.all(models.map(async model => {
const wireModelInfo = await ollamaPOST(input.access, { 'name': model.name }, '/api/show');
const wireModelInfo = await ollamaPOST(input.access, { 'name': model.name }, OLLAMA_PATH_SHOW);
const wireOllamaModelInfoSchema = z.object({
license: z.string().optional(),
modelfile: z.string(),
@@ -221,12 +244,24 @@ export const llmOllamaRouter = createTRPCRouter({
.output(openAIChatGenerateOutputSchema)
.mutation(async ({ input: { access, history, model } }) => {

const wireGeneration = await ollamaPOST(access, ollamaChatCompletionPayload(model, history, false), '/api/generate');
const generation = wireOllamaGenerationSchema.parse(wireGeneration);
const wireGeneration = await ollamaPOST(access, ollamaChatCompletionPayload(model, history, false), OLLAMA_PATH_CHAT);
const generation = wireOllamaChunkedOutputSchema.parse(wireGeneration);

if ('error' in generation)
throw new TRPCError({
code: 'INTERNAL_SERVER_ERROR',
message: `Ollama chat-generation issue: ${generation.error}`,
});

if (!generation.message?.content)
throw new TRPCError({
code: 'INTERNAL_SERVER_ERROR',
message: `Ollama chat-generation API issue: ${JSON.stringify(wireGeneration)}`,
});

return {
role: 'assistant',
content: generation.response,
content: generation.message.content,
finish_reason: generation.done ? 'stop' : null,
};
}),
@@ -0,0 +1,76 @@
import { z } from 'zod';


/**
 * Chat Completion API - Request
 * https://github.com/jmorganca/ollama/blob/main/docs/api.md#generate-a-chat-completion
 */
const wireOllamaChatCompletionInputSchema = z.object({

// required
model: z.string(),
messages: z.array(z.object({
role: z.enum(['assistant', 'system', 'user']),
content: z.string(),
})),

// optional
format: z.enum(['json']).optional(),
options: z.object({
// https://github.com/jmorganca/ollama/blob/main/docs/modelfile.md
// Maximum number of tokens to predict when generating text.
num_predict: z.number().int().optional(),
// Sets the random number seed to use for generation
seed: z.number().int().optional(),
// The temperature of the model
temperature: z.number().positive().optional(),
// Reduces the probability of generating nonsense (Default: 40)
top_k: z.number().positive().optional(),
// Works together with top-k. A higher value (e.g., 0.95) will lead to more diverse text. (Default 0.9)
top_p: z.number().positive().optional(),
}).optional(),
template: z.string().optional(), // overrides what is defined in the Modelfile
stream: z.boolean().optional(), // default: true

// Future Improvements?
// n: z.number().int().optional(), // number of completions to generate
// functions: ...
// function_call: ...
});
export type WireOllamaChatCompletionInput = z.infer<typeof wireOllamaChatCompletionInputSchema>;


/**
 * Chat Completion or Generation APIs - Streaming Response
 */
export const wireOllamaChunkedOutputSchema = z.union([
// Chat Completion Chunk
z.object({
model: z.string(),
// created_at: z.string(), // commented because unused

// [Chat Completion] (exclusive with 'response')
message: z.object({
role: z.enum(['assistant' /*, 'system', 'user' Disabled on purpose, to validate the response */]),
content: z.string(),
}).optional(), // optional on the last message

// [Generation] (non-chat, exclusive with 'message')
//response: z.string().optional(),

done: z.boolean(),

// only on the last message
// context: z.array(z.number()), // non-chat endpoint
// total_duration: z.number(),
// prompt_eval_count: z.number(),
// prompt_eval_duration: z.number(),
// eval_count: z.number(),
// eval_duration: z.number(),

}),
// Possible Error
z.object({
error: z.string(),
}),
]);
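The union above admits either a content chunk or an `{ error }` object, and the router narrows between the two with `'error' in generation`. A zod-free sketch of the same discrimination (the `OllamaChunk` type and `chunkText` helper are illustrative, not part of the codebase):

```typescript
// A plain type mirroring the union that wireOllamaChunkedOutputSchema validates.
type OllamaChunk =
  | { model: string; message?: { role: 'assistant'; content: string }; done: boolean }
  | { error: string };

// Same narrowing the router performs after parsing a chunk.
function chunkText(chunk: OllamaChunk): string {
  if ('error' in chunk)
    throw new Error(`Ollama chat-generation issue: ${chunk.error}`);
  // 'message' is optional on the final chunk, hence the fallback
  return chunk.message?.content ?? '';
}

const text = chunkText({ model: 'llama2', message: { role: 'assistant', content: 'Hello' }, done: false });
```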
@@ -0,0 +1,33 @@
import { z } from 'zod';


// [Mistral] Models List API - Response

export const wireMistralModelsListOutputSchema = z.object({
id: z.string(),
object: z.literal('model'),
created: z.number(),
owned_by: z.string(),
root: z.null().optional(),
parent: z.null().optional(),
// permission: z.array(wireMistralModelsListPermissionsSchema)
});

// export type WireMistralModelsListOutput = z.infer<typeof wireMistralModelsListOutputSchema>;

/*
const wireMistralModelsListPermissionsSchema = z.object({
id: z.string(),
object: z.literal('model_permission'),
created: z.number(),
allow_create_engine: z.boolean(),
allow_sampling: z.boolean(),
allow_logprobs: z.boolean(),
allow_search_indices: z.boolean(),
allow_view: z.boolean(),
allow_fine_tuning: z.boolean(),
organization: z.string(),
group: z.null().optional(),
is_blocking: z.boolean()
});
*/
+126 -39
@@ -1,5 +1,9 @@
import type { ModelDescriptionSchema } from '../server.schemas';
import { LLM_IF_OAI_Chat, LLM_IF_OAI_Complete, LLM_IF_OAI_Fn, LLM_IF_OAI_Vision } from '../../../store-llms';
import { SERVER_DEBUG_WIRE } from '~/server/wire';

import { LLM_IF_OAI_Chat, LLM_IF_OAI_Complete, LLM_IF_OAI_Fn, LLM_IF_OAI_Vision } from '../../store-llms';

import type { ModelDescriptionSchema } from '../llm.server.types';
import { wireMistralModelsListOutputSchema } from './mistral.wiretypes';


// [Azure] / [OpenAI]
@@ -203,6 +207,63 @@ export function localAIModelToModelDescription(modelId: string): ModelDescriptio
}


// [Mistral]

const _knownMistralChatModels: ManualMappings = [
{
idPrefix: 'mistral-medium',
label: 'Mistral Medium',
description: 'Mistral internal prototype model.',
contextWindow: 32768,
interfaces: [LLM_IF_OAI_Chat],
},
{
idPrefix: 'mistral-small',
label: 'Mistral Small',
description: 'Higher reasoning capabilities and more capabilities (English, French, German, Italian, Spanish, and Code)',
contextWindow: 32768,
interfaces: [LLM_IF_OAI_Chat],
},
{
idPrefix: 'mistral-tiny',
label: 'Mistral Tiny',
description: 'Used for large batch processing tasks where cost is a significant factor but reasoning capabilities are not crucial',
contextWindow: 32768,
interfaces: [LLM_IF_OAI_Chat],
},
{
idPrefix: 'mistral-embed',
label: 'Mistral Embed',
description: 'Mistral Medium on Mistral',
// output: 1024 dimensions
maxCompletionTokens: 1024, // HACK - it's 1024 dimensions, but those are not 'completion tokens'
contextWindow: 32768, // actually unknown, assumed from the other models
interfaces: [],
hidden: true,
},
];

export function mistralModelToModelDescription(_model: unknown): ModelDescriptionSchema {
const model = wireMistralModelsListOutputSchema.parse(_model);
return fromManualMapping(_knownMistralChatModels, model.id, model.created, undefined, {
idPrefix: model.id,
label: model.id.replaceAll(/[_-]/g, ' '),
description: 'New Mistral Model',
contextWindow: 32768,
interfaces: [LLM_IF_OAI_Chat], // assume..
hidden: true,
});
}

export function mistralModelsSort(a: ModelDescriptionSchema, b: ModelDescriptionSchema): number {
if (a.hidden && !b.hidden)
return 1;
if (!a.hidden && b.hidden)
return -1;
return a.id.localeCompare(b.id);
}
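`mistralModelsSort` above pushes hidden entries to the end and otherwise orders alphabetically by id. A minimal sketch of the same comparator over plain objects (`DemoDesc` is an illustrative stand-in for `ModelDescriptionSchema`):

```typescript
// Minimal stand-in for ModelDescriptionSchema, just the fields the sort uses.
type DemoDesc = { id: string; hidden?: boolean };

// Same ordering rules as mistralModelsSort: hidden models sink to the end,
// then sort by id.
function demoSort(a: DemoDesc, b: DemoDesc): number {
  if (a.hidden && !b.hidden) return 1;
  if (!a.hidden && b.hidden) return -1;
  return a.id.localeCompare(b.id);
}

const sorted = [
  { id: 'mistral-embed', hidden: true },
  { id: 'mistral-tiny' },
  { id: 'mistral-medium' },
].sort(demoSort).map(m => m.id);
// sorted: ['mistral-medium', 'mistral-tiny', 'mistral-embed']
```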
|
||||
|
||||
// [Oobabooga]
|
||||
const _knownOobaboogaChatModels: ManualMappings = [];
|
||||
|
||||
@@ -236,8 +297,8 @@ export function oobaboogaModelToModelDescription(modelId: string, created: numbe
|
||||
/**
|
||||
* Created to reflect the doc page: https://openrouter.ai/docs
|
||||
*
|
||||
* Update prompt:
|
||||
* "Please update the typescript object below (do not change the definition, just the object), based on the updated upstream documentation:"
|
||||
* Update prompt (last updated 2023-12-12)
|
||||
* "Please update the following typescript object (do not change the definition, just values, and do not miss any rows), based on the information provided thereafter:"
|
||||
*
|
||||
* fields:
|
||||
* - cw: context window size (max tokens, total)
|
||||
@@ -247,19 +308,24 @@ export function oobaboogaModelToModelDescription(modelId: string, created: numbe
|
||||
*/
|
||||
const orModelMap: { [id: string]: { name: string; cw: number; cp?: number; cc?: number; old?: boolean; unfilt?: boolean; } } = {
|
||||
// 'openrouter/auto': { name: 'Auto (best for prompt)', cw: 128000, cp: undefined, cc: undefined, unfilt: undefined },
|
||||
'mistralai/mistral-7b-instruct': { name: 'Mistral 7B Instruct (beta)', cw: 8192, cp: 0, cc: 0, unfilt: true },
|
||||
'huggingfaceh4/zephyr-7b-beta': { name: 'Hugging Face: Zephyr 7B (beta)', cw: 4096, cp: 0, cc: 0, unfilt: true },
|
||||
'openchat/openchat-7b': { name: 'OpenChat 7B (beta)', cw: 8192, cp: 0, cc: 0, unfilt: true },
|
||||
'undi95/toppy-m-7b': { name: 'Toppy M 7B (beta)', cw: 32768, cp: 0, cc: 0, unfilt: true },
|
||||
'gryphe/mythomist-7b': { name: 'MythoMist 7B (beta)', cw: 32768, cp: 0, cc: 0, unfilt: true },
|
||||
'nousresearch/nous-hermes-llama2-13b': { name: 'Nous: Hermes 13B (beta)', cw: 4096, cp: 0.000155, cc: 0.000155, unfilt: true },
|
||||
'meta-llama/codellama-34b-instruct': { name: 'Meta: CodeLlama 34B Instruct (beta)', cw: 8192, cp: 0.00045, cc: 0.00045, unfilt: true },
|
||||
'phind/phind-codellama-34b': { name: 'Phind: CodeLlama 34B v2 (beta)', cw: 4096, cp: 0.00045, cc: 0.00045, unfilt: true },
|
||||
'intel/neural-chat-7b': { name: 'Neural Chat 7B v3.1 (beta)', cw: 32768, cp: 0.005, cc: 0.005, unfilt: true },
|
||||
'haotian-liu/llava-13b': { name: 'Llava 13B (beta)', cw: 2048, cp: 0.005, cc: 0.005, unfilt: true },
|
||||
'meta-llama/llama-2-13b-chat': { name: 'Meta: Llama v2 13B Chat (beta)', cw: 4096, cp: 0.000234533, cc: 0.000234533, unfilt: true },
|
||||
'alpindale/goliath-120b': { name: 'Goliath 120B (beta)', cw: 6144, cp: 0.00703125, cc: 0.00703125, unfilt: true },
|
||||
'lizpreciatior/lzlv-70b-fp16-hf': { name: 'lzlv 70B (beta)', cw: 4096, cp: 0.000562, cc: 0.000762, unfilt: true },
|
||||
'nousresearch/nous-capybara-7b': { name: 'Nous: Capybara 7B', cw: 4096, cp: 0, cc: 0, unfilt: true },
|
||||
'mistralai/mistral-7b-instruct': { name: 'Mistral 7B Instruct', cw: 8192, cp: 0, cc: 0, unfilt: true },
|
||||
'huggingfaceh4/zephyr-7b-beta': { name: 'Hugging Face: Zephyr 7B', cw: 4096, cp: 0, cc: 0, unfilt: true },
|
||||
'openchat/openchat-7b': { name: 'OpenChat 3.5', cw: 8192, cp: 0, cc: 0, unfilt: true },
|
||||
'gryphe/mythomist-7b': { name: 'MythoMist 7B', cw: 32768, cp: 0, cc: 0, unfilt: true },
|
||||
'openrouter/cinematika-7b': { name: 'Cinematika 7B (alpha)', cw: 32768, cp: 0, cc: 0, unfilt: true },
|
||||
'rwkv/rwkv-5-world-3b': { name: 'RWKV v5 World 3B (beta)', cw: 10000, cp: 0, cc: 0, unfilt: true },
|
||||
'recursal/rwkv-5-3b-ai-town': { name: 'RWKV v5 3B AI Town (beta)', cw: 10000, cp: 0, cc: 0, unfilt: true },
|
||||
'jebcarter/psyfighter-13b': { name: 'Psyfighter 13B', cw: 4096, cp: 0.0001, cc: 0.0001, unfilt: true },
|
||||
'koboldai/psyfighter-13b-2': { name: 'Psyfighter v2 13B', cw: 4096, cp: 0.0001, cc: 0.0001, unfilt: true },
|
||||
'nousresearch/nous-hermes-llama2-13b': { name: 'Nous: Hermes 13B', cw: 4096, cp: 0.000075, cc: 0.000075, unfilt: true },
|
||||
'meta-llama/codellama-34b-instruct': { name: 'Meta: CodeLlama 34B Instruct', cw: 8192, cp: 0.0002, cc: 0.0002, unfilt: true },
|
||||
'phind/phind-codellama-34b': { name: 'Phind: CodeLlama 34B v2', cw: 4096, cp: 0.0002, cc: 0.0002, unfilt: true },
|
||||
'intel/neural-chat-7b': { name: 'Neural Chat 7B v3.1', cw: 4096, cp: 0.0025, cc: 0.0025, unfilt: true },
|
||||
'mistralai/mixtral-8x7b-instruct': { name: 'Mistral: Mixtral 8x7B Instruct (beta)', cw: 32768, cp: 0.0003, cc: 0.0003, unfilt: true },
|
||||
'haotian-liu/llava-13b': { name: 'Llava 13B', cw: 2048, cp: 0.0025, cc: 0.0025, unfilt: true },
|
||||
'nousresearch/nous-hermes-2-vision-7b': { name: 'Nous: Hermes 2 Vision 7B (alpha)', cw: 4096, cp: 0.0025, cc: 0.0025, unfilt: true },
|
||||
'meta-llama/llama-2-13b-chat': { name: 'Meta: Llama v2 13B Chat', cw: 4096, cp: 0.000156755, cc: 0.000156755, unfilt: true },
|
||||
'openai/gpt-3.5-turbo': { name: 'OpenAI: GPT-3.5 Turbo', cw: 4095, cp: 0.001, cc: 0.002, unfilt: false },
|
||||
'openai/gpt-3.5-turbo-1106': { name: 'OpenAI: GPT-3.5 Turbo 16k (preview)', cw: 16385, cp: 0.001, cc: 0.002, unfilt: false },
|
||||
'openai/gpt-3.5-turbo-16k': { name: 'OpenAI: GPT-3.5 Turbo 16k', cw: 16385, cp: 0.003, cc: 0.004, unfilt: false },
|
||||
@@ -268,28 +334,44 @@ const orModelMap: { [id: string]: { name: string; cw: number; cp?: number; cc?:
|
||||
'openai/gpt-4-32k': { name: 'OpenAI: GPT-4 32k', cw: 32767, cp: 0.06, cc: 0.12, unfilt: false },
|
||||
'openai/gpt-4-vision-preview': { name: 'OpenAI: GPT-4 Vision (preview)', cw: 128000, cp: 0.01, cc: 0.03, unfilt: false },
|
||||
'openai/gpt-3.5-turbo-instruct': { name: 'OpenAI: GPT-3.5 Turbo Instruct', cw: 4095, cp: 0.0015, cc: 0.002, unfilt: false },
|
||||
'google/palm-2-chat-bison': { name: 'Google: PaLM 2 Chat', cw: 9216, cp: 0.0005, cc: 0.0005, unfilt: true },
|
||||
'google/palm-2-codechat-bison': { name: 'Google: PaLM 2 Code Chat', cw: 7168, cp: 0.0005, cc: 0.0005, unfilt: true },
|
||||
'google/palm-2-chat-bison-32k': { name: 'Google: PaLM 2 Chat 32k', cw: 32000, cp: 0.0005, cc: 0.0005, unfilt: true },
|
||||
'google/palm-2-codechat-bison-32k': { name: 'Google: PaLM 2 Code Chat 32k', cw: 32000, cp: 0.0005, cc: 0.0005, unfilt: true },
|
||||
'meta-llama/llama-2-70b-chat': { name: 'Meta: Llama v2 70B Chat (beta)', cw: 4096, cp: 0.0007, cc: 0.00095, unfilt: true },
|
||||
'nousresearch/nous-hermes-llama2-70b': { name: 'Nous: Hermes 70B (beta)', cw: 4096, cp: 0.0009, cc: 0.0009, unfilt: true },
|
||||
'nousresearch/nous-capybara-34b': { name: 'Nous: Capybara 34B (beta)', cw: 32000, cp: 0.02, cc: 0.02, unfilt: true },
|
||||
'jondurbin/airoboros-l2-70b': { name: 'Airoboros 70B (beta)', cw: 4096, cp: 0.0007, cc: 0.00095, unfilt: true },
|
||||
'migtissera/synthia-70b': { name: 'Synthia 70B (beta)', cw: 8192, cp: 0.009375, cc: 0.009375, unfilt: true },
|
||||
'open-orca/mistral-7b-openorca': { name: 'Mistral OpenOrca 7B (beta)', cw: 8192, cp: 0.0002, cc: 0.0002, unfilt: true },
|
||||
'teknium/openhermes-2-mistral-7b': { name: 'OpenHermes 2 Mistral 7B (beta)', cw: 4096, cp: 0.0002, cc: 0.0002, unfilt: true },
|
||||
'teknium/openhermes-2.5-mistral-7b': { name: 'OpenHermes 2.5 Mistral 7B (beta)', cw: 4096, cp: 0.0002, cc: 0.0002, unfilt: true },
|
||||
'pygmalionai/mythalion-13b': { name: 'Pygmalion: Mythalion 13B (beta)', cw: 8192, cp: 0.001125, cc: 0.001125, unfilt: true },
|
||||
'undi95/remm-slerp-l2-13b': { name: 'ReMM SLERP 13B (beta)', cw: 6144, cp: 0.001125, cc: 0.001125, unfilt: true },
|
||||
'xwin-lm/xwin-lm-70b': { name: 'Xwin 70B (beta)', cw: 8192, cp: 0.009375, cc: 0.009375, unfilt: true },
|
||||
'gryphe/mythomax-l2-13b-8k': { name: 'MythoMax 13B 8k (beta)', cw: 8192, cp: 0.001125, cc: 0.001125, unfilt: true },
|
||||
'neversleep/noromaid-20b': { name: 'Noromaid 20B (beta)', cw: 8192, cp: 0.00225, cc: 0.00225, unfilt: true },
|
||||
'google/palm-2-chat-bison': { name: 'Google: PaLM 2 Chat', cw: 36864, cp: 0.00025, cc: 0.0005, unfilt: true },
|
||||
'google/palm-2-codechat-bison': { name: 'Google: PaLM 2 Code Chat', cw: 28672, cp: 0.00025, cc: 0.0005, unfilt: true },
|
||||
'google/palm-2-chat-bison-32k': { name: 'Google: PaLM 2 Chat 32k', cw: 131072, cp: 0.00025, cc: 0.0005, unfilt: true },
|
||||
'google/palm-2-codechat-bison-32k': { name: 'Google: PaLM 2 Code Chat 32k', cw: 131072, cp: 0.00025, cc: 0.0005, unfilt: true },
|
||||
'google/gemini-pro': { name: 'Google: Gemini Pro (preview)', cw: 131040, cp: 0.00025, cc: 0.0005, unfilt: true },
|
||||
'google/gemini-pro-vision': { name: 'Google: Gemini Pro Vision (preview)', cw: 65536, cp: 0.00025, cc: 0.0005, unfilt: true },
|
||||
'perplexity/pplx-70b-online': { name: 'Perplexity: PPLX 70B Online', cw: 4096, cp: 0, cc: 0.0028, unfilt: true },
|
||||
'perplexity/pplx-7b-online': { name: 'Perplexity: PPLX 7B Online', cw: 4096, cp: 0, cc: 0.00028, unfilt: true },
|
||||
'perplexity/pplx-7b-chat': { name: 'Perplexity: PPLX 7B Chat', cw: 8192, cp: 0.00007, cc: 0.00028, unfilt: true },
|
||||
'perplexity/pplx-70b-chat': { name: 'Perplexity: PPLX 70B Chat', cw: 4096, cp: 0.0007, cc: 0.0028, unfilt: true },
|
||||
'meta-llama/llama-2-70b-chat': { name: 'Meta: Llama v2 70B Chat', cw: 4096, cp: 0.0007, cc: 0.00095, unfilt: true },
|
||||
'nousresearch/nous-hermes-llama2-70b': { name: 'Nous: Hermes 70B', cw: 4096, cp: 0.0009, cc: 0.0009, unfilt: true },
|
||||
'nousresearch/nous-capybara-34b': { name: 'Nous: Capybara 34B', cw: 32000, cp: 0.0007, cc: 0.0028, unfilt: true },
|
||||
  'jondurbin/airoboros-l2-70b': { name: 'Airoboros 70B', cw: 4096, cp: 0.0007, cc: 0.00095, unfilt: true },
  'migtissera/synthia-70b': { name: 'Synthia 70B', cw: 8192, cp: 0.00375, cc: 0.00375, unfilt: true },
  'open-orca/mistral-7b-openorca': { name: 'Mistral OpenOrca 7B', cw: 8192, cp: 0.0001425006, cc: 0.0001425006, unfilt: true },
  'teknium/openhermes-2-mistral-7b': { name: 'OpenHermes 2 Mistral 7B', cw: 4096, cp: 0.0002, cc: 0.0002, unfilt: true },
  'teknium/openhermes-2.5-mistral-7b': { name: 'OpenHermes 2.5 Mistral 7B', cw: 4096, cp: 0.0002, cc: 0.0002, unfilt: true },
  'pygmalionai/mythalion-13b': { name: 'Pygmalion: Mythalion 13B', cw: 8192, cp: 0.001125, cc: 0.001125, unfilt: true },
  'undi95/remm-slerp-l2-13b': { name: 'ReMM SLERP 13B', cw: 6144, cp: 0.001125, cc: 0.001125, unfilt: true },
  'xwin-lm/xwin-lm-70b': { name: 'Xwin 70B', cw: 8192, cp: 0.00375, cc: 0.00375, unfilt: true },
  'gryphe/mythomax-l2-13b-8k': { name: 'MythoMax 13B 8k', cw: 8192, cp: 0.001125, cc: 0.001125, unfilt: true },
  'undi95/toppy-m-7b': { name: 'Toppy M 7B', cw: 32768, cp: 0.000375, cc: 0.000375, unfilt: true },
  'alpindale/goliath-120b': { name: 'Goliath 120B', cw: 6144, cp: 0.009375, cc: 0.009375, unfilt: true },
  'lizpreciatior/lzlv-70b-fp16-hf': { name: 'lzlv 70B', cw: 4096, cp: 0.0007, cc: 0.00095, unfilt: true },
  'neversleep/noromaid-20b': { name: 'Noromaid 20B', cw: 8192, cp: 0.00225, cc: 0.00225, unfilt: true },
  '01-ai/yi-34b-chat': { name: 'Yi 34B Chat', cw: 4096, cp: 0.0008, cc: 0.0008, unfilt: true },
  '01-ai/yi-34b': { name: 'Yi 34B (base)', cw: 4096, cp: 0.0008, cc: 0.0008, unfilt: true },
  '01-ai/yi-6b': { name: 'Yi 6B (base)', cw: 4096, cp: 0.00014, cc: 0.00014, unfilt: true },
  'togethercomputer/stripedhyena-nous-7b': { name: 'StripedHyena Nous 7B', cw: 32768, cp: 0.0002, cc: 0.0002, unfilt: true },
  'togethercomputer/stripedhyena-hessian-7b': { name: 'StripedHyena Hessian 7B (base)', cw: 32768, cp: 0.0002, cc: 0.0002, unfilt: true },
  'mistralai/mixtral-8x7b': { name: 'Mistral: Mixtral 8x7B (base) (beta)', cw: 32768, cp: 0.0006, cc: 0.0006, unfilt: true },
  'anthropic/claude-2': { name: 'Anthropic: Claude v2.1', cw: 200000, cp: 0.008, cc: 0.024, unfilt: false },
  'anthropic/claude-2.0': { name: 'Anthropic: Claude v2.0', cw: 100000, cp: 0.008, cc: 0.024, unfilt: false },
  'anthropic/claude-instant-v1': { name: 'Anthropic: Claude Instant v1', cw: 100000, cp: 0.00163, cc: 0.00551, unfilt: false },
  'mancer/weaver': { name: 'Mancer: Weaver (alpha)', cw: 8000, cp: 0.0045, cc: 0.0045, unfilt: true },
  'mancer/weaver': { name: 'Mancer: Weaver (alpha)', cw: 8000, cp: 0.003375, cc: 0.003375, unfilt: true },
  'gryphe/mythomax-l2-13b': { name: 'MythoMax 13B', cw: 4096, cp: 0.0006, cc: 0.0006, unfilt: true },
  // Old models (maintained for reference)
  'openai/gpt-3.5-turbo-0301': { name: 'OpenAI: GPT-3.5 Turbo (older v0301)', cw: 4095, cp: 0.001, cc: 0.002, old: true },
  'openai/gpt-4-0314': { name: 'OpenAI: GPT-4 (older v0314)', cw: 8191, cp: 0.03, cc: 0.06, old: true },
  'openai/gpt-4-32k-0314': { name: 'OpenAI: GPT-4 32k (older v0314)', cw: 32767, cp: 0.06, cc: 0.12, old: true },
@@ -301,7 +383,12 @@ const orModelMap: { [id: string]: { name: string; cw: number; cp?: number; cc?:
  'anthropic/claude-instant-1.0': { name: 'Anthropic: Claude Instant (older v1)', cw: 9000, cp: 0.00163, cc: 0.00551, old: true },
};

const orModelFamilyOrder = ['mistralai/', 'huggingfaceh4/', 'undi95/', 'openchat/', 'anthropic/', 'google/', 'openai/', 'meta-llama/', 'phind/', 'openrouter/'];
const orModelFamilyOrder = [
  // great models (picked by hand, they're free)
  'mistralai/mistral-7b-instruct', 'nousresearch/nous-capybara-7b',
  // great orgs
  'huggingfaceh4/', 'openchat/', 'anthropic/', 'google/', 'mistralai/', 'openai/', 'meta-llama/', 'phind/',
];

export function openRouterModelFamilySortFn(a: { id: string }, b: { id: string }): number {
  const aPrefixIndex = orModelFamilyOrder.findIndex(prefix => a.id.startsWith(prefix));
@@ -321,10 +408,10 @@ export function openRouterModelToModelDescription(modelId: string, created: numb
  const orModel = orModelMap[modelId] ?? null;
  let label = orModel?.name || modelId.replace('/', ' · ');
  if (orModel?.cp === 0 && orModel?.cc === 0)
    label += ' - 🎁 Free';
    label += ' · 🎁'; // Free? Discounted?

  // if (!orModel)
  //   console.log('openRouterModelToModelDescription: unknown model id:', modelId);
  if (SERVER_DEBUG_WIRE && !orModel)
    console.log(' - openRouterModelToModelDescription: non-mapped model id:', modelId);

  // context: use the known size if available, otherwise fall back to the (undocumented) provided length, or fall back again to 4096
  const contextWindow = orModel?.cw || context_length || 4096;
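The family ordering above sorts models whose id starts with an earlier prefix in the priority list first. A standalone sketch of that comparator (simplified: the tie-breaker shown here, alphabetical by id, is an assumption, not necessarily what the real `openRouterModelFamilySortFn` does):

```typescript
// Family-prefix comparator sketch: ids matching an earlier prefix sort first,
// ids matching no prefix (-1) sort to the end.
const familyOrder = ['mistralai/mistral-7b-instruct', 'huggingfaceh4/', 'anthropic/', 'openai/'];

function familySortFn(a: { id: string }, b: { id: string }): number {
  const aIdx = familyOrder.findIndex(prefix => a.id.startsWith(prefix));
  const bIdx = familyOrder.findIndex(prefix => b.id.startsWith(prefix));
  // unknown families rank after all known ones
  const aRank = aIdx === -1 ? familyOrder.length : aIdx;
  const bRank = bIdx === -1 ? familyOrder.length : bIdx;
  if (aRank !== bRank) return aRank - bRank;
  return a.id.localeCompare(b.id); // assumed tie-breaker
}

const sorted = [{ id: 'openai/gpt-4' }, { id: 'anthropic/claude-2' }, { id: 'zzz/model' }]
  .sort(familySortFn)
  .map(m => m.id);
```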
+25
-4
@@ -8,13 +8,13 @@ import { fetchJsonOrTRPCError } from '~/server/api/trpc.serverutils';
import { Brand } from '~/common/app.config';

import type { OpenAIWire } from './openai.wiretypes';
import { listModelsOutputSchema, ModelDescriptionSchema } from '../server.schemas';
import { localAIModelToModelDescription, oobaboogaModelToModelDescription, openAIModelToModelDescription, openRouterModelFamilySortFn, openRouterModelToModelDescription } from './models.data';
import { listModelsOutputSchema, ModelDescriptionSchema } from '../llm.server.types';
import { localAIModelToModelDescription, mistralModelsSort, mistralModelToModelDescription, oobaboogaModelToModelDescription, openAIModelToModelDescription, openRouterModelFamilySortFn, openRouterModelToModelDescription } from './models.data';


// Input Schemas

const openAIDialects = z.enum(['azure', 'localai', 'oobabooga', 'openai', 'openrouter']);
const openAIDialects = z.enum(['azure', 'localai', 'mistral', 'oobabooga', 'openai', 'openrouter']);

export const openAIAccessSchema = z.object({
  dialect: openAIDialects,
@@ -186,12 +186,18 @@ export const llmOpenAIRouter = createTRPCRouter({
        .map((model): ModelDescriptionSchema => openAIModelToModelDescription(model.id, model.created));
      break;

    case 'mistral':
      models = openAIModels
        .map(mistralModelToModelDescription)
        .sort(mistralModelsSort);
      break;

    case 'openrouter':
      models = openAIModels
        .sort(openRouterModelFamilySortFn)
        .map(model => openRouterModelToModelDescription(model.id, model.created, (model as any)?.['context_length']));
      break;

  }

  return { models };
@@ -267,9 +273,10 @@ async function openaiPOST<TOut extends object, TPostBody extends object>(access:
}


const DEFAULT_HELICONE_OPENAI_HOST = 'oai.hconeai.com';
const DEFAULT_MISTRAL_HOST = 'https://api.mistral.ai';
const DEFAULT_OPENAI_HOST = 'api.openai.com';
const DEFAULT_OPENROUTER_HOST = 'https://openrouter.ai/api';
const DEFAULT_HELICONE_OPENAI_HOST = 'oai.hconeai.com';

export function fixupHost(host: string, apiPath: string): string {
  if (!host.startsWith('http'))
@@ -361,6 +368,20 @@ export function openAIAccess(access: OpenAIAccessSchema, modelRefId: string | nu
      };

    case 'mistral':
      // https://docs.mistral.ai/platform/client
      const mistralKey = access.oaiKey || env.MISTRAL_API_KEY || '';
      const mistralHost = fixupHost(access.oaiHost || DEFAULT_MISTRAL_HOST, apiPath);
      return {
        headers: {
          'Content-Type': 'application/json',
          'Accept': 'application/json',
          'Authorization': `Bearer ${mistralKey}`,
        },
        url: mistralHost + apiPath,
      };

    case 'openrouter':
      const orKey = access.oaiKey || env.OPENROUTER_API_KEY || '';
      const orHost = fixupHost(access.oaiHost || DEFAULT_OPENROUTER_HOST, apiPath);
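Each dialect case above resolves a key (user setup, then environment, then empty) and a host (user setup, then the dialect default), normalized through `fixupHost`. A hedged sketch of what such a host-normalization helper might do (this mirrors the visible intent, defaulting to https, not the exact body of `fixupHost`):

```typescript
// Host-normalization sketch: default bare hosts to https, and drop a trailing
// slash when the API path will supply its own leading slash.
// (Assumption: illustrative behavior, not the verbatim fixupHost implementation.)
function fixupHostSketch(host: string, apiPath: string): string {
  if (!host.startsWith('http'))
    host = 'https://' + host;
  if (host.endsWith('/') && apiPath.startsWith('/'))
    host = host.slice(0, -1);
  return host;
}

// e.g. the Mistral default host has no scheme issues, but a bare host gets one
const url = fixupHostSketch('api.mistral.ai', '/v1/models') + '/v1/models';
```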
@@ -2,7 +2,8 @@ import { create } from 'zustand';
import { shallow } from 'zustand/shallow';
import { persist } from 'zustand/middleware';

import { ModelVendorId } from './vendors/IModelVendor';
import type { ModelVendorId } from './vendors/vendors.registry';
import type { SourceSetupOpenRouter } from './vendors/openrouter/openrouter.vendor';


/**
@@ -15,6 +16,7 @@ export interface DLLM<TSourceSetup = unknown, TLLMOptions = unknown> {
  updated?: number | 0;
  description: string;
  tags: string[]; // UNUSED for now
  // modelcaps: DModelCapability[];
  contextTokens: number;
  maxOutputTokens: number;
  hidden: boolean;
@@ -29,6 +31,17 @@ export interface DLLM<TSourceSetup = unknown, TLLMOptions = unknown> {

export type DLLMId = string;

// export type DModelCapability =
//   | 'input-text'
//   | 'input-image-data'
//   | 'input-multipart'
//   | 'output-text'
//   | 'output-function'
//   | 'output-image-data'
//   | 'if-chat'
//   | 'if-fast-chat'
//   ;

// Model interfaces (chat, and function calls) - here as a preview, will be used more broadly in the future
export const LLM_IF_OAI_Chat = 'oai-chat';
export const LLM_IF_OAI_Vision = 'oai-vision';
@@ -76,6 +89,9 @@ interface ModelsActions {
  setChatLLMId: (id: DLLMId | null) => void;
  setFastLLMId: (id: DLLMId | null) => void;
  setFuncLLMId: (id: DLLMId | null) => void;

  // special
  setOpenRoutersKey: (key: string) => void;
}

type LlmsStore = ModelsData & ModelsActions;
@@ -162,13 +178,22 @@ export const useModelsStore = create<LlmsStore>()(
      set(state => ({
        sources: state.sources.map((source: DModelSource): DModelSource =>
          source.id === id
            ? {
              ...source,
              setup: { ...source.setup, ...partialSetup },
            } : source,
            ? { ...source, setup: { ...source.setup, ...partialSetup } }
            : source,
        ),
      })),

    setOpenRoutersKey: (key: string) =>
      set(state => {
        const openRouterSource = state.sources.find(source => source.vId === 'openrouter');
        if (!openRouterSource) return state;
        return {
          sources: state.sources.map(source => source.id === openRouterSource.id
            ? { ...source, setup: { ...source.setup, oaiKey: key satisfies SourceSetupOpenRouter['oaiKey'] } }
            : source),
        };
      }),

    }),
    {
      name: 'app-models',
@@ -256,24 +281,3 @@ export function useChatLLM() {
  }, shallow);
}

/**
 * Source-specific read/write - great time saver
 */
export function useSourceSetup<TSourceSetup, TAccess>(sourceId: DModelSourceId, getAccess: (partialSetup?: Partial<TSourceSetup>) => TAccess) {
  // invalidate when the setup changes
  const { updateSourceSetup, ...rest } = useModelsStore(state => {
    const source: DModelSource<TSourceSetup> | null = state.sources.find(source => source.id === sourceId) ?? null;
    const sourceLLMs = source ? state.llms.filter(llm => llm._source === source) : [];
    return {
      source,
      sourceLLMs,
      sourceHasLLMs: !!sourceLLMs.length,
      access: getAccess(source?.setup),
      updateSourceSetup: state.updateSourceSetup,
    };
  }, shallow);

  // convenience function for this source
  const updateSetup = (partialSetup: Partial<TSourceSetup>) => updateSourceSetup<TSourceSetup>(sourceId, partialSetup);
  return { ...rest, updateSetup };
}
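Both `updateSourceSetup` and `setOpenRoutersKey` in the store follow the same immutable-update pattern: map over the sources array, replace only the matching source, and shallow-merge the partial setup into its existing setup. Extracted as a plain function (names here are illustrative, not the store's API):

```typescript
// Immutable per-source setup merge, as used by the zustand store actions:
// untouched sources keep their identity; the matching one gets a merged setup.
interface Source { id: string; setup: Record<string, unknown> }

function mergeSourceSetup(sources: Source[], id: string, partialSetup: Record<string, unknown>): Source[] {
  return sources.map(source =>
    source.id === id
      ? { ...source, setup: { ...source.setup, ...partialSetup } }
      : source,
  );
}

const next = mergeSourceSetup(
  [{ id: 'a', setup: { oaiKey: '' } }, { id: 'b', setup: {} }],
  'a',
  { oaiKey: 'sk-test' },
);
const updatedKey = next[0].setup.oaiKey as string;
const untouchedCount = Object.keys(next[1].setup).length;
```

Because only the changed objects are re-created, zustand's shallow comparison lets unrelated subscribers skip re-rendering.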
@@ -1,34 +0,0 @@
import type { DLLMId } from '../store-llms';
import type { OpenAIWire } from './server/openai/openai.wiretypes';
import { findVendorForLlmOrThrow } from '../vendors/vendor.registry';


export interface VChatMessageIn {
  role: 'assistant' | 'system' | 'user'; // | 'function';
  content: string;
  //name?: string; // when role: 'function'
}

export type VChatFunctionIn = OpenAIWire.ChatCompletion.RequestFunctionDef;

export interface VChatMessageOut {
  role: 'assistant' | 'system' | 'user';
  content: string;
  finish_reason: 'stop' | 'length' | null;
}

export interface VChatMessageOrFunctionCallOut extends VChatMessageOut {
  function_name: string;
  function_arguments: object | null;
}


export async function callChatGenerate(llmId: DLLMId, messages: VChatMessageIn[], maxTokens?: number): Promise<VChatMessageOut> {
  const { llm, vendor } = findVendorForLlmOrThrow(llmId);
  return await vendor.callChatGenerate(llm, messages, maxTokens);
}

export async function callChatGenerateWithFunctions(llmId: DLLMId, messages: VChatMessageIn[], functions: VChatFunctionIn[], forceFunctionName: string | null, maxTokens?: number): Promise<VChatMessageOrFunctionCallOut> {
  const { llm, vendor } = findVendorForLlmOrThrow(llmId);
  return await vendor.callChatGenerateWF(llm, messages, functions, forceFunctionName, maxTokens);
}
@@ -1,16 +0,0 @@
import { z } from 'zod';

export const wireOllamaGenerationSchema = z.object({
  model: z.string(),
  // created_at: z.string(), // commented because unused
  response: z.string(),
  done: z.boolean(),

  // only on the last message
  // context: z.array(z.number()),
  // total_duration: z.number(),
  // load_duration: z.number(),
  // eval_duration: z.number(),
  // prompt_eval_count: z.number(),
  // eval_count: z.number(),
});
+37
-12
@@ -1,18 +1,19 @@
import type React from 'react';
import type { TRPCClientErrorBase } from '@trpc/client';

import type { DLLM, DModelSourceId } from '../store-llms';
import { VChatFunctionIn, VChatMessageIn, VChatMessageOrFunctionCallOut, VChatMessageOut } from '../transports/chatGenerate';
import type { DLLM, DLLMId, DModelSourceId } from '../store-llms';
import type { ModelDescriptionSchema } from '../server/llm.server.types';
import type { ModelVendorId } from './vendors.registry';
import type { VChatFunctionIn, VChatMessageIn, VChatMessageOrFunctionCallOut, VChatMessageOut } from '~/modules/llms/llm.client';


export type ModelVendorId = 'anthropic' | 'azure' | 'localai' | 'ollama' | 'oobabooga' | 'openai' | 'openrouter';


export interface IModelVendor<TSourceSetup = unknown, TLLMOptions = unknown, TAccess = unknown, TDLLM = DLLM<TSourceSetup, TLLMOptions>> {
export interface IModelVendor<TSourceSetup = unknown, TAccess = unknown, TLLMOptions = unknown, TDLLM = DLLM<TSourceSetup, TLLMOptions>> {
  readonly id: ModelVendorId;
  readonly name: string;
  readonly rank: number;
  readonly location: 'local' | 'cloud';
  readonly instanceLimit: number;
  readonly hasFreeModels?: boolean;
  readonly hasBackendCap?: () => boolean;

  // components
@@ -20,12 +21,36 @@ export interface IModelVendor<TSourceSetup = unknown, TLLMOptions = unknown, TAc
  readonly SourceSetupComponent: React.ComponentType<{ sourceId: DModelSourceId }>;
  readonly LLMOptionsComponent: React.ComponentType<{ llm: TDLLM }>;

  // functions
  readonly initializeSetup?: () => TSourceSetup;
  /// abstraction interface ///

  getAccess(setup?: Partial<TSourceSetup>): TAccess;
  initializeSetup?(): TSourceSetup;

  callChatGenerate(llm: TDLLM, messages: VChatMessageIn[], maxTokens?: number): Promise<VChatMessageOut>;
  validateSetup?(setup: TSourceSetup): boolean;

  callChatGenerateWF(llm: TDLLM, messages: VChatMessageIn[], functions: null | VChatFunctionIn[], forceFunctionName: null | string, maxTokens?: number): Promise<VChatMessageOrFunctionCallOut>;
}
  getTransportAccess(setup?: Partial<TSourceSetup>): TAccess;

  rpcUpdateModelsQuery: (
    access: TAccess,
    enabled: boolean,
    onSuccess: (data: { models: ModelDescriptionSchema[] }) => void,
  ) => { isFetching: boolean, refetch: () => void, isError: boolean, error: TRPCClientErrorBase<any> | null };

  rpcChatGenerateOrThrow: (
    access: TAccess,
    llmOptions: TLLMOptions,
    messages: VChatMessageIn[],
    functions: VChatFunctionIn[] | null, forceFunctionName: string | null,
    maxTokens?: number,
  ) => Promise<VChatMessageOut | VChatMessageOrFunctionCallOut>;

  streamingChatGenerateOrThrow: (
    access: TAccess,
    llmId: DLLMId,
    llmOptions: TLLMOptions,
    messages: VChatMessageIn[],
    functions: VChatFunctionIn[] | null, forceFunctionName: string | null,
    abortSignal: AbortSignal,
    onUpdate: (update: Partial<{ text: string, typing: boolean, originLLM: string }>, done: boolean) => void,
  ) => Promise<void>;

}

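The refactored `IModelVendor` moves the transport into the vendor itself: each vendor builds its own access object from the per-source setup, and callers dispatch through a registry keyed by vendor id. A minimal self-contained sketch of that shape (the names `MiniVendor` and `registry` are illustrative, not the actual registry API):

```typescript
// Minimal vendor-abstraction sketch: a vendor maps its partial setup into a
// transport access object; a registry keyed by vendor id dispatches calls.
interface MiniVendor<TSetup, TAccess> {
  id: string;
  getTransportAccess(setup?: Partial<TSetup>): TAccess;
}

const miniAnthropicVendor: MiniVendor<{ anthropicKey: string }, { dialect: string; key: string }> = {
  id: 'anthropic',
  // missing setup fields fall back to safe defaults, as in the real vendors
  getTransportAccess: (setup) => ({ dialect: 'anthropic', key: setup?.anthropicKey || '' }),
};

const registry: Record<string, MiniVendor<any, any>> = {
  [miniAnthropicVendor.id]: miniAnthropicVendor,
};

const access = registry['anthropic'].getTransportAccess({ anthropicKey: 'k1' });
const emptyAccess = registry['anthropic'].getTransportAccess();
```

Keeping `getTransportAccess` pure means the same vendor object can serve both the React hooks (`rpcUpdateModelsQuery`) and the non-React call paths.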
+6
-12
@@ -7,11 +7,11 @@ import { FormTextField } from '~/common/components/forms/FormTextField';
import { InlineError } from '~/common/components/InlineError';
import { Link } from '~/common/components/Link';
import { SetupFormRefetchButton } from '~/common/components/forms/SetupFormRefetchButton';
import { apiQuery } from '~/common/util/trpc.client';
import { useToggleableBoolean } from '~/common/util/useToggleableBoolean';

import { DModelSourceId, useModelsStore, useSourceSetup } from '../../store-llms';
import { modelDescriptionToDLLM } from '../openai/OpenAISourceSetup';
import { DModelSourceId } from '../../store-llms';
import { useLlmUpdateModels } from '../useLlmUpdateModels';
import { useSourceSetup } from '../useSourceSetup';

import { isValidAnthropicApiKey, ModelVendorAnthropic } from './anthropic.vendor';

@@ -23,7 +23,7 @@ export function AnthropicSourceSetup(props: { sourceId: DModelSourceId }) {

  // external state
  const { source, sourceHasLLMs, access, updateSetup } =
    useSourceSetup(props.sourceId, ModelVendorAnthropic.getAccess);
    useSourceSetup(props.sourceId, ModelVendorAnthropic);

  // derived state
  const { anthropicKey, anthropicHost, heliconeKey } = access;
@@ -34,14 +34,8 @@ export function AnthropicSourceSetup(props: { sourceId: DModelSourceId }) {
  const shallFetchSucceed = anthropicKey ? keyValid : (!needsUserKey || !!anthropicHost);

  // fetch models
  const { isFetching, refetch, isError, error } = apiQuery.llmAnthropic.listModels.useQuery({ access }, {
    enabled: !sourceHasLLMs && shallFetchSucceed,
    onSuccess: models => source && useModelsStore.getState().setLLMs(
      models.models.map(model => modelDescriptionToDLLM(model, source)),
      props.sourceId,
    ),
    staleTime: Infinity,
  });
  const { isFetching, refetch, isError, error } =
    useLlmUpdateModels(ModelVendorAnthropic, access, !sourceHasLLMs && shallFetchSucceed, source);

  return <>

+43
-37
@@ -1,11 +1,12 @@
import { backendCaps } from '~/modules/backend/state-backend';

import { AnthropicIcon } from '~/common/components/icons/AnthropicIcon';
import { apiAsync } from '~/common/util/trpc.client';
import { apiAsync, apiQuery } from '~/common/util/trpc.client';

import type { AnthropicAccessSchema } from '../../server/anthropic/anthropic.router';
import type { IModelVendor } from '../IModelVendor';
import type { AnthropicAccessSchema } from '../../transports/server/anthropic/anthropic.router';
import type { VChatMessageIn, VChatMessageOrFunctionCallOut, VChatMessageOut } from '../../transports/chatGenerate';
import type { VChatMessageOut } from '../../llm.client';
import { unifiedStreamingClient } from '../unifiedStreamingClient';

import { LLMOptionsOpenAI } from '../openai/openai.vendor';
import { OpenAILLMOptions } from '../openai/OpenAILLMOptions';
@@ -14,7 +15,7 @@ import { AnthropicSourceSetup } from './AnthropicSourceSetup';


// special symbols
export const isValidAnthropicApiKey = (apiKey?: string) => !!apiKey && (apiKey.startsWith('sk-') ? apiKey.length > 40 : apiKey.length >= 40);
export const isValidAnthropicApiKey = (apiKey?: string) => !!apiKey && (apiKey.startsWith('sk-') ? apiKey.length >= 39 : apiKey.length >= 40);

export interface SourceSetupAnthropic {
  anthropicKey: string;
@@ -22,7 +23,7 @@ export interface SourceSetupAnthropic {
  heliconeKey: string;
}

export const ModelVendorAnthropic: IModelVendor<SourceSetupAnthropic, LLMOptionsOpenAI, AnthropicAccessSchema> = {
export const ModelVendorAnthropic: IModelVendor<SourceSetupAnthropic, AnthropicAccessSchema, LLMOptionsOpenAI> = {
  id: 'anthropic',
  name: 'Anthropic',
  rank: 13,
@@ -36,43 +37,48 @@ export const ModelVendorAnthropic: IModelVendor<SourceSetupAnthropic, LLMOptions
  LLMOptionsComponent: OpenAILLMOptions,

  // functions
  getAccess: (partialSetup): AnthropicAccessSchema => ({
  getTransportAccess: (partialSetup): AnthropicAccessSchema => ({
    dialect: 'anthropic',
    anthropicKey: partialSetup?.anthropicKey || '',
    anthropicHost: partialSetup?.anthropicHost || null,
    heliconeKey: partialSetup?.heliconeKey || null,
  }),
  callChatGenerate(llm, messages: VChatMessageIn[], maxTokens?: number): Promise<VChatMessageOut> {
    return anthropicCallChatGenerate(this.getAccess(llm._source.setup), llm.options, messages, /*null, null,*/ maxTokens);


  // List Models
  rpcUpdateModelsQuery: (access, enabled, onSuccess) => {
    return apiQuery.llmAnthropic.listModels.useQuery({ access }, {
      enabled: enabled,
      onSuccess: onSuccess,
      refetchOnWindowFocus: false,
      staleTime: Infinity,
    });
  },
  callChatGenerateWF(): Promise<VChatMessageOrFunctionCallOut> {
    throw new Error('Anthropic does not support "Functions" yet');

  // Chat Generate (non-streaming) with Functions
  rpcChatGenerateOrThrow: async (access, llmOptions, messages, functions, forceFunctionName, maxTokens) => {
    if (functions?.length || forceFunctionName)
      throw new Error('Anthropic does not support functions');

    const { llmRef, llmTemperature = 0.5, llmResponseTokens } = llmOptions;
    try {
      return await apiAsync.llmAnthropic.chatGenerate.mutate({
        access,
        model: {
          id: llmRef!,
          temperature: llmTemperature,
          maxTokens: maxTokens || llmResponseTokens || 1024,
        },
        history: messages,
      }) as VChatMessageOut;
    } catch (error: any) {
      const errorMessage = error?.message || error?.toString() || 'Anthropic Chat Generate Error';
      console.error(`anthropic.rpcChatGenerateOrThrow: ${errorMessage}`);
      throw new Error(errorMessage);
    }
  },

  // Chat Generate (streaming) with Functions
  streamingChatGenerateOrThrow: unifiedStreamingClient,

};


/**
 * This function either returns the LLM message, or function calls, or throws a descriptive error string
 */
async function anthropicCallChatGenerate<TOut = VChatMessageOut>(
  access: AnthropicAccessSchema, llmOptions: Partial<LLMOptionsOpenAI>, messages: VChatMessageIn[],
  // functions: VChatFunctionIn[] | null, forceFunctionName: string | null,
  maxTokens?: number,
): Promise<TOut> {
  const { llmRef, llmTemperature = 0.5, llmResponseTokens } = llmOptions;
  try {
    return await apiAsync.llmAnthropic.chatGenerate.mutate({
      access,
      model: {
        id: llmRef!,
        temperature: llmTemperature,
        maxTokens: maxTokens || llmResponseTokens || 1024,
      },
      history: messages,
    }) as TOut;
  } catch (error: any) {
    const errorMessage = error?.message || error?.toString() || 'Anthropic Chat Generate Error';
    console.error(`anthropicCallChatGenerate: ${errorMessage}`);
    throw new Error(errorMessage);
  }
}
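The diff relaxes the key check: `sk-`-prefixed Anthropic keys now pass at length 39 or more (previously strictly more than 40), while un-prefixed keys still require length 40. The new validator can be exercised directly, exactly as it appears in the diff:

```typescript
// The relaxed validator from the diff: 'sk-' keys need length >= 39,
// any other non-empty key needs length >= 40.
const isValidAnthropicApiKey = (apiKey?: string) =>
  !!apiKey && (apiKey.startsWith('sk-') ? apiKey.length >= 39 : apiKey.length >= 40);

const okShortSk = isValidAnthropicApiKey('sk-' + 'a'.repeat(36));  // total length 39 -> valid
const badShortSk = isValidAnthropicApiKey('sk-' + 'a'.repeat(35)); // total length 38 -> invalid
const emptyKey = isValidAnthropicApiKey('');                       // falsy -> invalid
```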
+6
-12
@@ -5,11 +5,11 @@ import { FormTextField } from '~/common/components/forms/FormTextField';
import { InlineError } from '~/common/components/InlineError';
import { Link } from '~/common/components/Link';
import { SetupFormRefetchButton } from '~/common/components/forms/SetupFormRefetchButton';
import { apiQuery } from '~/common/util/trpc.client';
import { asValidURL } from '~/common/util/urlUtils';

import { DModelSourceId, useModelsStore, useSourceSetup } from '../../store-llms';
import { modelDescriptionToDLLM } from '../openai/OpenAISourceSetup';
import { DModelSourceId } from '../../store-llms';
import { useLlmUpdateModels } from '../useLlmUpdateModels';
import { useSourceSetup } from '../useSourceSetup';

import { isValidAzureApiKey, ModelVendorAzure } from './azure.vendor';

@@ -18,7 +18,7 @@ export function AzureSourceSetup(props: { sourceId: DModelSourceId }) {

  // external state
  const { source, sourceHasLLMs, access, updateSetup } =
    useSourceSetup(props.sourceId, ModelVendorAzure.getAccess);
    useSourceSetup(props.sourceId, ModelVendorAzure);

  // derived state
  const { oaiKey: azureKey, oaiHost: azureEndpoint } = access;
@@ -31,14 +31,8 @@ export function AzureSourceSetup(props: { sourceId: DModelSourceId }) {
  const shallFetchSucceed = azureKey ? keyValid : !needsUserKey;

  // fetch models
  const { isFetching, refetch, isError, error } = apiQuery.llmOpenAI.listModels.useQuery({ access }, {
    enabled: !sourceHasLLMs && shallFetchSucceed,
    onSuccess: models => source && useModelsStore.getState().setLLMs(
      models.models.map(model => modelDescriptionToDLLM(model, source)),
      props.sourceId,
    ),
    staleTime: Infinity,
  });
  const { isFetching, refetch, isError, error } =
    useLlmUpdateModels(ModelVendorAzure, access, !sourceHasLLMs && shallFetchSucceed, source);

  return <>

+9
-11
@@ -3,10 +3,9 @@ import { backendCaps } from '~/modules/backend/state-backend';
import { AzureIcon } from '~/common/components/icons/AzureIcon';

import type { IModelVendor } from '../IModelVendor';
import type { OpenAIAccessSchema } from '../../transports/server/openai/openai.router';
import type { VChatFunctionIn, VChatMessageIn, VChatMessageOrFunctionCallOut, VChatMessageOut } from '../../transports/chatGenerate';
import type { OpenAIAccessSchema } from '../../server/openai/openai.router';

import { LLMOptionsOpenAI, openAICallChatGenerate } from '../openai/openai.vendor';
import { LLMOptionsOpenAI, ModelVendorOpenAI } from '../openai/openai.vendor';
import { OpenAILLMOptions } from '../openai/OpenAILLMOptions';

import { AzureSourceSetup } from './AzureSourceSetup';
@@ -36,7 +35,7 @@ export interface SourceSetupAzure {
 *
 * Work in progress...
 */
export const ModelVendorAzure: IModelVendor<SourceSetupAzure, LLMOptionsOpenAI, OpenAIAccessSchema> = {
export const ModelVendorAzure: IModelVendor<SourceSetupAzure, OpenAIAccessSchema, LLMOptionsOpenAI> = {
  id: 'azure',
  name: 'Azure',
  rank: 14,
@@ -50,7 +49,7 @@ export const ModelVendorAzure: IModelVendor<SourceSetupAzure, LLMOptionsOpenAI,
  LLMOptionsComponent: OpenAILLMOptions,

  // functions
  getAccess: (partialSetup): OpenAIAccessSchema => ({
  getTransportAccess: (partialSetup): OpenAIAccessSchema => ({
    dialect: 'azure',
    oaiKey: partialSetup?.azureKey || '',
    oaiOrg: '',
@@ -58,10 +57,9 @@ export const ModelVendorAzure: IModelVendor<SourceSetupAzure, LLMOptionsOpenAI,
    heliKey: '',
    moderationCheck: false,
  }),
  callChatGenerate(llm, messages: VChatMessageIn[], maxTokens?: number): Promise<VChatMessageOut> {
    return openAICallChatGenerate(this.getAccess(llm._source.setup), llm.options, messages, null, null, maxTokens);
  },
  callChatGenerateWF(llm, messages: VChatMessageIn[], functions: VChatFunctionIn[] | null, forceFunctionName: string | null, maxTokens?: number): Promise<VChatMessageOrFunctionCallOut> {
    return openAICallChatGenerate(this.getAccess(llm._source.setup), llm.options, messages, functions, forceFunctionName, maxTokens);
  },

  // OpenAI transport ('azure' dialect in 'access')
  rpcUpdateModelsQuery: ModelVendorOpenAI.rpcUpdateModelsQuery,
  rpcChatGenerateOrThrow: ModelVendorOpenAI.rpcChatGenerateOrThrow,
  streamingChatGenerateOrThrow: ModelVendorOpenAI.streamingChatGenerateOrThrow,
};
@@ -0,0 +1,96 @@
import * as React from 'react';

import { FormControl, FormHelperText, Option, Select } from '@mui/joy';
import HealthAndSafetyIcon from '@mui/icons-material/HealthAndSafety';

import { FormInputKey } from '~/common/components/forms/FormInputKey';
import { FormLabelStart } from '~/common/components/forms/FormLabelStart';
import { InlineError } from '~/common/components/InlineError';
import { Link } from '~/common/components/Link';
import { SetupFormRefetchButton } from '~/common/components/forms/SetupFormRefetchButton';

import type { DModelSourceId } from '../../store-llms';
import type { GeminiBlockSafetyLevel } from '../../server/gemini/gemini.wiretypes';
import { useLlmUpdateModels } from '../useLlmUpdateModels';
import { useSourceSetup } from '../useSourceSetup';

import { ModelVendorGemini } from './gemini.vendor';


const GEMINI_API_KEY_LINK = 'https://makersuite.google.com/app/apikey';

const SAFETY_OPTIONS: { value: GeminiBlockSafetyLevel, label: string }[] = [
  { value: 'HARM_BLOCK_THRESHOLD_UNSPECIFIED', label: 'Default' },
  { value: 'BLOCK_LOW_AND_ABOVE', label: 'Low and above' },
  { value: 'BLOCK_MEDIUM_AND_ABOVE', label: 'Medium and above' },
  { value: 'BLOCK_ONLY_HIGH', label: 'Only high' },
  { value: 'BLOCK_NONE', label: 'None' },
];


export function GeminiSourceSetup(props: { sourceId: DModelSourceId }) {

  // external state
  const { source, sourceSetupValid, access, updateSetup } =
    useSourceSetup(props.sourceId, ModelVendorGemini);

  // derived state
  const { geminiKey, minSafetyLevel } = access;

  const needsUserKey = !ModelVendorGemini.hasBackendCap?.();
  const shallFetchSucceed = !needsUserKey || (!!geminiKey && sourceSetupValid);
  const showKeyError = !!geminiKey && !sourceSetupValid;

  // fetch models
  const { isFetching, refetch, isError, error } =
    useLlmUpdateModels(ModelVendorGemini, access, shallFetchSucceed, source);

  return <>

    <FormInputKey
      id='gemini-key' label='Gemini API Key'
      rightLabel={<>{needsUserKey
        ? !geminiKey && <Link level='body-sm' href={GEMINI_API_KEY_LINK} target='_blank'>request Key</Link>
        : '✔️ already set in server'}
      </>}
      value={geminiKey} onChange={value => updateSetup({ geminiKey: value.trim() })}
      required={needsUserKey} isError={showKeyError}
      placeholder='...'
    />

    <FormControl orientation='horizontal' sx={{ justifyContent: 'space-between', alignItems: 'center' }}>
      <FormLabelStart title='Safety Settings'
                      description='Threshold' />
      <Select
        variant='outlined'
        value={minSafetyLevel} onChange={(_event, value) => value && updateSetup({ minSafetyLevel: value })}
        startDecorator={<HealthAndSafetyIcon sx={{ display: { xs: 'none', sm: 'inherit' } }} />}
        // indicator={<KeyboardArrowDownIcon />}
        slotProps={{
          root: { sx: { width: '100%' } },
          indicator: { sx: { opacity: 0.5 } },
          button: { sx: { whiteSpace: 'inherit' } },
        }}
      >
        {SAFETY_OPTIONS.map(option => (
          <Option key={'gemini-safety-' + option.value} value={option.value}>{option.label}</Option>
        ))}
      </Select>
    </FormControl>

    <FormHelperText sx={{ display: 'block' }}>
      Gemini has <Link href='https://ai.google.dev/docs/safety_setting_gemini' target='_blank' noLinkStyle>
      adjustable safety settings</Link> on four categories: Harassment, Hate speech,
      Sexually explicit, and Dangerous content, in addition to non-adjustable built-in filters.
      By default, the model will block content with <em>medium and above</em> probability
      of being unsafe.
    </FormHelperText>

    <SetupFormRefetchButton
      refetch={refetch} disabled={!shallFetchSucceed || isFetching} error={isError}
    />

    {isError && <InlineError error={error} />}

  </>;
}
@@ -0,0 +1,97 @@
+import GoogleIcon from '@mui/icons-material/Google';
+
+import { backendCaps } from '~/modules/backend/state-backend';
+
+import { apiAsync, apiQuery } from '~/common/util/trpc.client';
+
+import type { GeminiAccessSchema } from '../../server/gemini/gemini.router';
+import type { GeminiBlockSafetyLevel } from '../../server/gemini/gemini.wiretypes';
+import type { IModelVendor } from '../IModelVendor';
+import type { VChatMessageOut } from '../../llm.client';
+import { unifiedStreamingClient } from '../unifiedStreamingClient';
+
+import { OpenAILLMOptions } from '../openai/OpenAILLMOptions';
+
+import { GeminiSourceSetup } from './GeminiSourceSetup';
+
+
+export interface SourceSetupGemini {
+  geminiKey: string;
+  minSafetyLevel: GeminiBlockSafetyLevel;
+}
+
+export interface LLMOptionsGemini {
+  llmRef: string;
+  stopSequences: string[];  // up to 5 sequences that will stop generation (optional)
+  candidateCount: number;   // 1...8 number of generated responses to return (optional)
+  maxOutputTokens: number;  // if unset, this will default to outputTokenLimit (optional)
+  temperature: number;      // 0...1 Controls the randomness of the output. (optional)
+  topP: number;             // 0...1 The maximum cumulative probability of tokens to consider when sampling (optional)
+  topK: number;             // 1...100 The maximum number of tokens to consider when sampling (optional)
+}
+
+
+export const ModelVendorGemini: IModelVendor<SourceSetupGemini, GeminiAccessSchema, LLMOptionsGemini> = {
+  id: 'googleai',
+  name: 'Gemini',
+  rank: 11,
+  location: 'cloud',
+  instanceLimit: 1,
+  hasBackendCap: () => backendCaps().hasLlmGemini,
+
+  // components
+  Icon: GoogleIcon,
+  SourceSetupComponent: GeminiSourceSetup,
+  LLMOptionsComponent: OpenAILLMOptions,
+
+  // functions
+  initializeSetup: () => ({
+    geminiKey: '',
+    minSafetyLevel: 'HARM_BLOCK_THRESHOLD_UNSPECIFIED',
+  }),
+  validateSetup: (setup) => {
+    return setup.geminiKey?.length > 0;
+  },
+  getTransportAccess: (partialSetup): GeminiAccessSchema => ({
+    dialect: 'gemini',
+    geminiKey: partialSetup?.geminiKey || '',
+    minSafetyLevel: partialSetup?.minSafetyLevel || 'HARM_BLOCK_THRESHOLD_UNSPECIFIED',
+  }),
+
+  // List Models
+  rpcUpdateModelsQuery: (access, enabled, onSuccess) => {
+    return apiQuery.llmGemini.listModels.useQuery({ access }, {
+      enabled: enabled,
+      onSuccess: onSuccess,
+      refetchOnWindowFocus: false,
+      staleTime: Infinity,
+    });
+  },
+
+  // Chat Generate (non-streaming) with Functions
+  rpcChatGenerateOrThrow: async (access, llmOptions, messages, functions, forceFunctionName, maxTokens) => {
+    if (functions?.length || forceFunctionName)
+      throw new Error('Gemini does not support functions');
+
+    const { llmRef, temperature = 0.5, maxOutputTokens } = llmOptions;
+    try {
+      return await apiAsync.llmGemini.chatGenerate.mutate({
+        access,
+        model: {
+          id: llmRef!,
+          temperature: temperature,
+          maxTokens: maxTokens || maxOutputTokens || 1024,
+        },
+        history: messages,
+      }) as VChatMessageOut;
+    } catch (error: any) {
+      const errorMessage = error?.message || error?.toString() || 'Gemini Chat Generate Error';
+      console.error(`gemini.rpcChatGenerateOrThrow: ${errorMessage}`);
+      throw new Error(errorMessage);
+    }
+  },
+
+  // Chat Generate (streaming) with Functions
+  streamingChatGenerateOrThrow: unifiedStreamingClient,
+
+};
+6 -12
@@ -7,10 +7,10 @@ import { FormInputKey } from '~/common/components/forms/FormInputKey';
 import { InlineError } from '~/common/components/InlineError';
 import { Link } from '~/common/components/Link';
 import { SetupFormRefetchButton } from '~/common/components/forms/SetupFormRefetchButton';
-import { apiQuery } from '~/common/util/trpc.client';
 
-import { DModelSourceId, useModelsStore, useSourceSetup } from '../../store-llms';
-import { modelDescriptionToDLLM } from '../openai/OpenAISourceSetup';
+import { DModelSourceId } from '../../store-llms';
+import { useLlmUpdateModels } from '../useLlmUpdateModels';
+import { useSourceSetup } from '../useSourceSetup';
 
 import { ModelVendorLocalAI } from './localai.vendor';
 
@@ -19,7 +19,7 @@ export function LocalAISourceSetup(props: { sourceId: DModelSourceId }) {
 
   // external state
   const { source, access, updateSetup } =
-    useSourceSetup(props.sourceId, ModelVendorLocalAI.getAccess);
+    useSourceSetup(props.sourceId, ModelVendorLocalAI);
 
   // derived state
   const { oaiHost } = access;
@@ -30,14 +30,8 @@ export function LocalAISourceSetup(props: { sourceId: DModelSourceId }) {
   const shallFetchSucceed = isValidHost;
 
   // fetch models - the OpenAI way
-  const { isFetching, refetch, isError, error } = apiQuery.llmOpenAI.listModels.useQuery({ access }, {
-    enabled: false, // !sourceHasLLMs && shallFetchSucceed,
-    onSuccess: models => source && useModelsStore.getState().setLLMs(
-      models.models.map(model => modelDescriptionToDLLM(model, source)),
-      props.sourceId,
-    ),
-    staleTime: Infinity,
-  });
+  const { isFetching, refetch, isError, error } =
+    useLlmUpdateModels(ModelVendorLocalAI, access, false /* !sourceHasLLMs && shallFetchSucceed */, source);
 
   return <>
 
+10 -12
@@ -1,10 +1,9 @@
 import DevicesIcon from '@mui/icons-material/Devices';
 
 import type { IModelVendor } from '../IModelVendor';
-import type { OpenAIAccessSchema } from '../../transports/server/openai/openai.router';
-import type { VChatFunctionIn, VChatMessageIn, VChatMessageOrFunctionCallOut, VChatMessageOut } from '../../transports/chatGenerate';
+import type { OpenAIAccessSchema } from '../../server/openai/openai.router';
 
-import { LLMOptionsOpenAI, openAICallChatGenerate } from '../openai/openai.vendor';
+import { LLMOptionsOpenAI, ModelVendorOpenAI } from '../openai/openai.vendor';
 import { OpenAILLMOptions } from '../openai/OpenAILLMOptions';
 
 import { LocalAISourceSetup } from './LocalAISourceSetup';
 
@@ -14,7 +13,7 @@ export interface SourceSetupLocalAI {
   oaiHost: string; // use OpenAI-compatible non-default hosts (full origin path)
 }
 
-export const ModelVendorLocalAI: IModelVendor<SourceSetupLocalAI, LLMOptionsOpenAI, OpenAIAccessSchema> = {
+export const ModelVendorLocalAI: IModelVendor<SourceSetupLocalAI, OpenAIAccessSchema, LLMOptionsOpenAI> = {
   id: 'localai',
   name: 'LocalAI',
   rank: 20,
@@ -30,7 +29,7 @@ export const ModelVendorLocalAI: IModelVendor<SourceSetupLocalAI, LLMOptionsOpen
   initializeSetup: () => ({
     oaiHost: 'http://localhost:8080',
   }),
-  getAccess: (partialSetup) => ({
+  getTransportAccess: (partialSetup) => ({
     dialect: 'localai',
     oaiKey: '',
    oaiOrg: '',
@@ -38,10 +37,9 @@ export const ModelVendorLocalAI: IModelVendor<SourceSetupLocalAI, LLMOptionsOpen
     heliKey: '',
     moderationCheck: false,
   }),
-  callChatGenerate(llm, messages: VChatMessageIn[], maxTokens?: number): Promise<VChatMessageOut> {
-    return openAICallChatGenerate(this.getAccess(llm._source.setup), llm.options, messages, null, null, maxTokens);
-  },
-  callChatGenerateWF(llm, messages: VChatMessageIn[], functions: VChatFunctionIn[] | null, forceFunctionName: string | null, maxTokens?: number): Promise<VChatMessageOrFunctionCallOut> {
-    return openAICallChatGenerate(this.getAccess(llm._source.setup), llm.options, messages, functions, forceFunctionName, maxTokens);
-  },
-};
+
+  // OpenAI transport ('localai' dialect in 'access')
+  rpcUpdateModelsQuery: ModelVendorOpenAI.rpcUpdateModelsQuery,
+  rpcChatGenerateOrThrow: ModelVendorOpenAI.rpcChatGenerateOrThrow,
+  streamingChatGenerateOrThrow: ModelVendorOpenAI.streamingChatGenerateOrThrow,
+};
@@ -0,0 +1,55 @@
+import * as React from 'react';
+
+import { FormInputKey } from '~/common/components/forms/FormInputKey';
+import { InlineError } from '~/common/components/InlineError';
+import { Link } from '~/common/components/Link';
+import { SetupFormRefetchButton } from '~/common/components/forms/SetupFormRefetchButton';
+
+import { DModelSourceId } from '../../store-llms';
+import { useLlmUpdateModels } from '../useLlmUpdateModels';
+import { useSourceSetup } from '../useSourceSetup';
+
+import { ModelVendorMistral } from './mistral.vendor';
+
+
+const MISTRAL_REG_LINK = 'https://console.mistral.ai/';
+
+
+export function MistralSourceSetup(props: { sourceId: DModelSourceId }) {
+
+  // external state
+  const { source, sourceSetupValid, access, updateSetup } =
+    useSourceSetup(props.sourceId, ModelVendorMistral);
+
+  // derived state
+  const { oaiKey: mistralKey } = access;
+
+  const needsUserKey = !ModelVendorMistral.hasBackendCap?.();
+  const shallFetchSucceed = !needsUserKey || (!!mistralKey && sourceSetupValid);
+  const showKeyError = !!mistralKey && !sourceSetupValid;
+
+  // fetch models
+  const { isFetching, refetch, isError, error } =
+    useLlmUpdateModels(ModelVendorMistral, access, shallFetchSucceed, source);
+
+  return <>
+
+    <FormInputKey
+      id='mistral-key' label='Mistral Key'
+      rightLabel={<>{needsUserKey
+        ? !mistralKey && <Link level='body-sm' href={MISTRAL_REG_LINK} target='_blank'>request Key</Link>
+        : '✔️ already set in server'}
+      </>}
+      value={mistralKey} onChange={value => updateSetup({ oaiKey: value })}
+      required={needsUserKey} isError={showKeyError}
+      placeholder='...'
+    />
+
+    <SetupFormRefetchButton
+      refetch={refetch} disabled={/*!shallFetchSucceed ||*/ isFetching} error={isError}
+    />
+
+    {isError && <InlineError error={error} />}
+
+  </>;
+}
@@ -0,0 +1,55 @@
+import { backendCaps } from '~/modules/backend/state-backend';
+
+import { MistralIcon } from '~/common/components/icons/MistralIcon';
+
+import type { IModelVendor } from '../IModelVendor';
+import type { OpenAIAccessSchema } from '../../server/openai/openai.router';
+
+import { LLMOptionsOpenAI, ModelVendorOpenAI, SourceSetupOpenAI } from '../openai/openai.vendor';
+import { OpenAILLMOptions } from '../openai/OpenAILLMOptions';
+
+import { MistralSourceSetup } from './MistralSourceSetup';
+
+
+// special symbols
+
+export type SourceSetupMistral = Pick<SourceSetupOpenAI, 'oaiKey' | 'oaiHost'>;
+
+
+/** Implementation Notes for the Mistral vendor
+ */
+export const ModelVendorMistral: IModelVendor<SourceSetupMistral, OpenAIAccessSchema, LLMOptionsOpenAI> = {
+  id: 'mistral',
+  name: 'Mistral',
+  rank: 15,
+  location: 'cloud',
+  instanceLimit: 1,
+  hasBackendCap: () => backendCaps().hasLlmMistral,
+
+  // components
+  Icon: MistralIcon,
+  SourceSetupComponent: MistralSourceSetup,
+  LLMOptionsComponent: OpenAILLMOptions,
+
+  // functions
+  initializeSetup: () => ({
+    oaiHost: 'https://api.mistral.ai/',
+    oaiKey: '',
+  }),
+  validateSetup: (setup) => {
+    return setup.oaiKey?.length >= 32;
+  },
+  getTransportAccess: (partialSetup): OpenAIAccessSchema => ({
+    dialect: 'mistral',
+    oaiKey: partialSetup?.oaiKey || '',
+    oaiOrg: '',
+    oaiHost: partialSetup?.oaiHost || '',
+    heliKey: '',
+    moderationCheck: false,
+  }),
+
+  // OpenAI transport ('mistral' dialect in 'access')
+  rpcUpdateModelsQuery: ModelVendorOpenAI.rpcUpdateModelsQuery,
+  rpcChatGenerateOrThrow: ModelVendorOpenAI.rpcChatGenerateOrThrow,
+  streamingChatGenerateOrThrow: ModelVendorOpenAI.streamingChatGenerateOrThrow,
+};
+56 -23
@@ -1,24 +1,29 @@
 import * as React from 'react';
 
-import { Box, Button, Chip, FormControl, Input, Option, Select, Stack, Typography } from '@mui/joy';
+import { Box, Button, Chip, FormControl, IconButton, Input, Option, Select, Stack, Typography } from '@mui/joy';
+import LaunchIcon from '@mui/icons-material/Launch';
+import FormatListNumberedRtlIcon from '@mui/icons-material/FormatListNumberedRtl';
 
 import { FormLabelStart } from '~/common/components/forms/FormLabelStart';
 import { GoodModal } from '~/common/components/GoodModal';
+import { GoodTooltip } from '~/common/components/GoodTooltip';
+import { InlineError } from '~/common/components/InlineError';
 import { Link } from '~/common/components/Link';
 import { apiQuery } from '~/common/util/trpc.client';
 import { settingsGap } from '~/common/app.theme';
 
-import type { OllamaAccessSchema } from '../../transports/server/ollama/ollama.router';
-import { InlineError } from '~/common/components/InlineError';
+import type { OllamaAccessSchema } from '../../server/ollama/ollama.router';
 
 
-export function OllamaAdmin(props: { access: OllamaAccessSchema, onClose: () => void }) {
+export function OllamaAdministration(props: { access: OllamaAccessSchema, onClose: () => void }) {
 
   // state
+  const [sortByPulls, setSortByPulls] = React.useState<boolean>(false);
   const [modelName, setModelName] = React.useState<string | null>('llama2');
   const [modelTag, setModelTag] = React.useState<string>('');
 
   // external state
-  const { data: pullable } = apiQuery.llmOllama.adminListPullable.useQuery({ access: props.access }, {
+  const { data: pullableData } = apiQuery.llmOllama.adminListPullable.useQuery({ access: props.access }, {
     staleTime: 1000 * 60,
     refetchOnWindowFocus: false,
   });
@@ -26,7 +31,11 @@ export function OllamaAdmin(props: { access: OllamaAccessSchema, onClose: () =>
   const { isLoading: isDeleting, status: deleteStatus, error: deleteError, mutate: deleteMutate, reset: deleteReset } = apiQuery.llmOllama.adminDelete.useMutation();
 
   // derived state
-  const pullModelDescription = pullable?.pullable.find(p => p.id === modelName)?.description ?? null;
+  let pullable = pullableData?.pullable || [];
+  if (sortByPulls)
+    pullable = pullable.toSorted((a, b) => b.pulls - a.pulls);
+  const pullModelDescription = pullable.find(p => p.id === modelName)?.description ?? null;
+
 
   const handleModelPull = () => {
     deleteReset();
@@ -38,6 +47,7 @@ export function OllamaAdmin(props: { access: OllamaAccessSchema, onClose: () =>
     modelName && deleteMutate({ access: props.access, name: modelName + (modelTag ? ':' + modelTag : '') });
   };
 
+
   return (
     <GoodModal title='Ollama Administration' dividers open onClose={props.onClose}>
 
@@ -47,25 +57,48 @@ export function OllamaAdmin(props: { access: OllamaAccessSchema, onClose: () =>
         However we provide a way to pull models from the Ollama host, for convenience.
       </Typography>
 
-      <Box sx={{ display: 'flex', gap: 1 }}>
-        <FormControl sx={{ flexGrow: 1 }}>
+      <Box sx={{ display: 'flex', flexFlow: 'row wrap', gap: 1 }}>
+        <FormControl sx={{ flexGrow: 1, flexBasis: 0.55 }}>
           <FormLabelStart title='Name' />
-          <Select value={modelName || ''} onChange={(_event: any, value: string | null) => setModelName(value)}>
-            {pullable?.pullable.map(p =>
-              <Option key={p.id} value={p.id}>
-                {p.isNew === true && <Chip size='sm' variant='outlined'>New</Chip>} {p.label}
-              </Option>,
-            )}
-          </Select>
+          <Box sx={{ display: 'flex', gap: 1 }}>
+            <Select
+              value={modelName || ''}
+              onChange={(_event: any, value: string | null) => setModelName(value)}
+              sx={{ flexGrow: 1 }}
+            >
+              {pullable.map(p =>
+                <Option key={p.id} value={p.id}>
+                  {p.isNew === true && <Chip size='sm' variant='solid'>NEW</Chip>} {p.label}{sortByPulls && ` (${p.pulls.toLocaleString()})`}
+                </Option>,
+              )}
+            </Select>
+            <GoodTooltip title='Sort by Downloads'>
+              <IconButton
+                variant={sortByPulls ? 'solid' : 'outlined'}
+                onClick={() => setSortByPulls(!sortByPulls)}
+              >
+                <FormatListNumberedRtlIcon />
+              </IconButton>
+            </GoodTooltip>
+          </Box>
         </FormControl>
-        <FormControl sx={{ flexGrow: 1 }}>
+        <FormControl sx={{ flexGrow: 1, flexBasis: 0.45 }}>
           <FormLabelStart title='Tag' />
-          <Input
-            variant='outlined' placeholder='latest'
-            value={modelTag || ''} onChange={event => setModelTag(event.target.value)}
-            sx={{ minWidth: 100 }}
-            slotProps={{ input: { size: 10 } }} // halve the min width
-          />
+          <Box sx={{ display: 'flex', gap: 1 }}>
+            <Input
+              variant='outlined' placeholder='latest'
+              value={modelTag || ''} onChange={event => setModelTag(event.target.value)}
+              sx={{ minWidth: 80, flexGrow: 1 }}
+              slotProps={{ input: { size: 10 } }} // halve the min width
+            />
+            {!!modelName && (
+              <IconButton
+                component={Link} href={`https://ollama.ai/library/${modelName}`} target='_blank'
+              >
+                <LaunchIcon />
+              </IconButton>
+            )}
+          </Box>
         </FormControl>
       </Box>
 
@@ -85,7 +118,7 @@ export function OllamaAdmin(props: { access: OllamaAccessSchema, onClose: () =>
         {pullModelDescription}
       </Typography>
 
-      <Box sx={{ display: 'flex', gap: 1 }}>
+      <Box sx={{ display: 'flex', flexWrap: 1, gap: 1, alignItems: 'start' }}>
         <Button
           variant='outlined'
           color={deleteStatus === 'error' ? 'danger' : deleteStatus === 'success' ? 'success' : 'primary'}
+9 -14
@@ -6,13 +6,14 @@ import { FormTextField } from '~/common/components/forms/FormTextField';
 import { InlineError } from '~/common/components/InlineError';
 import { Link } from '~/common/components/Link';
 import { SetupFormRefetchButton } from '~/common/components/forms/SetupFormRefetchButton';
-import { apiQuery } from '~/common/util/trpc.client';
 import { asValidURL } from '~/common/util/urlUtils';
 
-import { DModelSourceId, useModelsStore, useSourceSetup } from '../../store-llms';
+import { DModelSourceId } from '../../store-llms';
+import { useLlmUpdateModels } from '../useLlmUpdateModels';
+import { useSourceSetup } from '../useSourceSetup';
 
 import { ModelVendorOllama } from './ollama.vendor';
-import { OllamaAdmin } from './OllamaAdmin';
-import { modelDescriptionToDLLM } from '../openai/OpenAISourceSetup';
+import { OllamaAdministration } from './OllamaAdministration';
 
 
 export function OllamaSourceSetup(props: { sourceId: DModelSourceId }) {
@@ -22,7 +23,7 @@ export function OllamaSourceSetup(props: { sourceId: DModelSourceId }) {
 
   // external state
   const { source, access, updateSetup } =
-    useSourceSetup(props.sourceId, ModelVendorOllama.getAccess);
+    useSourceSetup(props.sourceId, ModelVendorOllama);
 
   // derived state
   const { ollamaHost } = access;
@@ -32,14 +33,8 @@ export function OllamaSourceSetup(props: { sourceId: DModelSourceId }) {
   const shallFetchSucceed = !hostError;
 
   // fetch models
-  const { isFetching, refetch, isError, error } = apiQuery.llmOllama.listModels.useQuery({ access }, {
-    enabled: false, // !sourceHasLLMs && shallFetchSucceed,
-    onSuccess: models => source && useModelsStore.getState().setLLMs(
-      models.models.map(model => modelDescriptionToDLLM(model, source)),
-      props.sourceId,
-    ),
-    staleTime: Infinity,
-  });
+  const { isFetching, refetch, isError, error } =
+    useLlmUpdateModels(ModelVendorOllama, access, false /* !sourceHasLLMs && shallFetchSucceed */, source);
 
   return <>
 
@@ -63,7 +58,7 @@ export function OllamaSourceSetup(props: { sourceId: DModelSourceId }) {
 
     {isError && <InlineError error={error} />}
 
-    {adminOpen && <OllamaAdmin access={access} onClose={() => setAdminOpen(false)} />}
+    {adminOpen && <OllamaAdministration access={access} onClose={() => setAdminOpen(false)} />}
 
   </>;
 }
-36
@@ -1,13 +1,14 @@
|
||||
import { backendCaps } from '~/modules/backend/state-backend';
|
||||
|
||||
import { OllamaIcon } from '~/common/components/icons/OllamaIcon';
|
||||
import { apiAsync } from '~/common/util/trpc.client';
|
||||
import { apiAsync, apiQuery } from '~/common/util/trpc.client';
|
||||
|
||||
import type { IModelVendor } from '../IModelVendor';
|
||||
import type { VChatMessageIn, VChatMessageOrFunctionCallOut, VChatMessageOut } from '../../transports/chatGenerate';
|
||||
import type { OllamaAccessSchema } from '../../transports/server/ollama/ollama.router';
|
||||
import type { OllamaAccessSchema } from '../../server/ollama/ollama.router';
|
||||
import type { VChatMessageOut } from '../../llm.client';
|
||||
import { unifiedStreamingClient } from '../unifiedStreamingClient';
|
||||
|
||||
import { LLMOptionsOpenAI } from '../openai/openai.vendor';
|
||||
import type { LLMOptionsOpenAI } from '../openai/openai.vendor';
|
||||
import { OpenAILLMOptions } from '../openai/OpenAILLMOptions';
|
||||
|
||||
import { OllamaSourceSetup } from './OllamaSourceSetup';
|
||||
@@ -18,7 +19,7 @@ export interface SourceSetupOllama {
|
||||
}
|
||||
|
||||
|
||||
export const ModelVendorOllama: IModelVendor<SourceSetupOllama, LLMOptionsOpenAI, OllamaAccessSchema> = {
|
||||
export const ModelVendorOllama: IModelVendor<SourceSetupOllama, OllamaAccessSchema, LLMOptionsOpenAI> = {
|
||||
id: 'ollama',
|
||||
name: 'Ollama',
|
||||
rank: 22,
|
||||
@@ -32,40 +33,45 @@ export const ModelVendorOllama: IModelVendor<SourceSetupOllama, LLMOptionsOpenAI
|
||||
LLMOptionsComponent: OpenAILLMOptions,
|
||||
|
||||
// functions
|
||||
getAccess: (partialSetup): OllamaAccessSchema => ({
|
||||
getTransportAccess: (partialSetup): OllamaAccessSchema => ({
|
||||
dialect: 'ollama',
|
||||
ollamaHost: partialSetup?.ollamaHost || '',
|
||||
}),
|
||||
callChatGenerate(llm, messages: VChatMessageIn[], maxTokens?: number): Promise<VChatMessageOut> {
|
||||
return ollamaCallChatGenerate(this.getAccess(llm._source.setup), llm.options, messages, maxTokens);
|
||||
|
||||
// List Models
|
||||
rpcUpdateModelsQuery: (access, enabled, onSuccess) => {
|
||||
return apiQuery.llmOllama.listModels.useQuery({ access }, {
|
||||
enabled: enabled,
|
||||
onSuccess: onSuccess,
|
||||
refetchOnWindowFocus: false,
|
||||
staleTime: Infinity,
|
||||
});
|
||||
},
|
||||
callChatGenerateWF(): Promise<VChatMessageOrFunctionCallOut> {
|
||||
throw new Error('Ollama does not support "Functions" yet');
|
||||
|
||||
// Chat Generate (non-streaming) with Functions
|
||||
rpcChatGenerateOrThrow: async (access, llmOptions, messages, functions, forceFunctionName, maxTokens) => {
|
||||
if (functions?.length || forceFunctionName)
|
||||
throw new Error('Ollama does not support functions');
|
||||
|
||||
const { llmRef, llmTemperature = 0.5, llmResponseTokens } = llmOptions;
|
||||
try {
|
||||
return await apiAsync.llmOllama.chatGenerate.mutate({
|
||||
access,
|
||||
model: {
|
||||
id: llmRef!,
|
||||
temperature: llmTemperature,
|
||||
maxTokens: maxTokens || llmResponseTokens || 1024,
|
||||
},
|
||||
history: messages,
|
||||
}) as VChatMessageOut;
|
||||
} catch (error: any) {
|
||||
const errorMessage = error?.message || error?.toString() || 'Ollama Chat Generate Error';
|
||||
console.error(`ollama.rpcChatGenerateOrThrow: ${errorMessage}`);
|
||||
throw new Error(errorMessage);
|
||||
}
|
||||
},
|
||||
|
||||
// Chat Generate (streaming) with Functions
|
||||
streamingChatGenerateOrThrow: unifiedStreamingClient,
|
||||
|
||||
};
|
||||
|
||||
|
||||
/**
|
||||
* This function either returns the LLM message, or throws a descriptive error string
|
||||
*/
|
||||
async function ollamaCallChatGenerate<TOut = VChatMessageOut>(
|
||||
access: OllamaAccessSchema, llmOptions: Partial<LLMOptionsOpenAI>, messages: VChatMessageIn[],
|
||||
maxTokens?: number,
|
||||
): Promise<TOut> {
|
||||
const { llmRef, llmTemperature = 0.5, llmResponseTokens } = llmOptions;
|
||||
try {
|
||||
return await apiAsync.llmOllama.chatGenerate.mutate({
|
||||
access,
|
||||
model: {
|
||||
id: llmRef!,
|
||||
temperature: llmTemperature,
|
||||
maxTokens: maxTokens || llmResponseTokens || 1024,
|
||||
},
|
||||
history: messages,
|
||||
}) as TOut;
|
||||
} catch (error: any) {
|
||||
const errorMessage = error?.message || error?.toString() || 'Ollama Chat Generate Error';
|
||||
console.error(`ollamaCallChatGenerate: ${errorMessage}`);
|
||||
throw new Error(errorMessage);
|
||||
}
|
||||
}
|
||||
|
||||
+6 -12
@@ -6,10 +6,10 @@ import { FormTextField } from '~/common/components/forms/FormTextField';
 import { InlineError } from '~/common/components/InlineError';
 import { Link } from '~/common/components/Link';
 import { SetupFormRefetchButton } from '~/common/components/forms/SetupFormRefetchButton';
-import { apiQuery } from '~/common/util/trpc.client';
 
-import { DModelSourceId, useModelsStore, useSourceSetup } from '../../store-llms';
-import { modelDescriptionToDLLM } from '../openai/OpenAISourceSetup';
+import { DModelSourceId } from '../../store-llms';
+import { useLlmUpdateModels } from '../useLlmUpdateModels';
+import { useSourceSetup } from '../useSourceSetup';
 
 import { ModelVendorOoobabooga } from './oobabooga.vendor';
 
@@ -18,20 +18,14 @@ export function OobaboogaSourceSetup(props: { sourceId: DModelSourceId }) {
 
   // external state
   const { source, sourceHasLLMs, access, updateSetup } =
-    useSourceSetup(props.sourceId, ModelVendorOoobabooga.getAccess);
+    useSourceSetup(props.sourceId, ModelVendorOoobabooga);
 
   // derived state
   const { oaiHost } = access;
 
   // fetch models
-  const { isFetching, refetch, isError, error } = apiQuery.llmOpenAI.listModels.useQuery({ access }, {
-    enabled: false, // !hasModels && !!asValidURL(normSetup.oaiHost),
-    onSuccess: models => source && useModelsStore.getState().setLLMs(
-      models.models.map(model => modelDescriptionToDLLM(model, source)),
-      props.sourceId,
-    ),
-    staleTime: Infinity,
-  });
+  const { isFetching, refetch, isError, error } =
+    useLlmUpdateModels(ModelVendorOoobabooga, access, false /* !hasModels && !!asValidURL(normSetup.oaiHost) */, source);
 
   return <>
 
+9 -11
@@ -1,10 +1,9 @@
 import { OobaboogaIcon } from '~/common/components/icons/OobaboogaIcon';
 
 import type { IModelVendor } from '../IModelVendor';
-import type { OpenAIAccessSchema } from '../../transports/server/openai/openai.router';
-import type { VChatFunctionIn, VChatMessageIn, VChatMessageOrFunctionCallOut, VChatMessageOut } from '../../transports/chatGenerate';
+import type { OpenAIAccessSchema } from '../../server/openai/openai.router';
 
-import { LLMOptionsOpenAI, openAICallChatGenerate } from '../openai/openai.vendor';
+import { LLMOptionsOpenAI, ModelVendorOpenAI } from '../openai/openai.vendor';
 import { OpenAILLMOptions } from '../openai/OpenAILLMOptions';
 
 import { OobaboogaSourceSetup } from './OobaboogaSourceSetup';
@@ -14,7 +13,7 @@ export interface SourceSetupOobabooga {
   oaiHost: string; // use OpenAI-compatible non-default hosts (full origin path)
 }
 
-export const ModelVendorOoobabooga: IModelVendor<SourceSetupOobabooga, LLMOptionsOpenAI, OpenAIAccessSchema> = {
+export const ModelVendorOoobabooga: IModelVendor<SourceSetupOobabooga, OpenAIAccessSchema, LLMOptionsOpenAI> = {
   id: 'oobabooga',
   name: 'Oobabooga',
   rank: 25,
@@ -30,7 +29,7 @@ export const ModelVendorOoobabooga: IModelVendor<SourceSetupOobabooga, LLMOption
   initializeSetup: (): SourceSetupOobabooga => ({
     oaiHost: 'http://127.0.0.1:5000',
   }),
-  getAccess: (partialSetup): OpenAIAccessSchema => ({
+  getTransportAccess: (partialSetup): OpenAIAccessSchema => ({
     dialect: 'oobabooga',
     oaiKey: '',
     oaiOrg: '',
@@ -38,10 +37,9 @@ export const ModelVendorOoobabooga: IModelVendor<SourceSetupOobabooga, LLMOption
     heliKey: '',
     moderationCheck: false,
   }),
-  callChatGenerate(llm, messages: VChatMessageIn[], maxTokens?: number): Promise<VChatMessageOut> {
-    return openAICallChatGenerate(this.getAccess(llm._source.setup), llm.options, messages, null, null, maxTokens);
-  },
-  callChatGenerateWF(llm, messages: VChatMessageIn[], functions: VChatFunctionIn[] | null, forceFunctionName: string | null, maxTokens?: number): Promise<VChatMessageOrFunctionCallOut> {
-    return openAICallChatGenerate(this.getAccess(llm._source.setup), llm.options, messages, functions, forceFunctionName, maxTokens);
-  },
+
+  // OpenAI transport (oobabooga dialect in 'access')
+  rpcUpdateModelsQuery: ModelVendorOpenAI.rpcUpdateModelsQuery,
+  rpcChatGenerateOrThrow: ModelVendorOpenAI.rpcChatGenerateOrThrow,
+  streamingChatGenerateOrThrow: ModelVendorOpenAI.streamingChatGenerateOrThrow,
 };
+7 -41
@@ -9,13 +9,13 @@ import { FormTextField } from '~/common/components/forms/FormTextField';
 import { InlineError } from '~/common/components/InlineError';
 import { Link } from '~/common/components/Link';
 import { SetupFormRefetchButton } from '~/common/components/forms/SetupFormRefetchButton';
-import { apiQuery } from '~/common/util/trpc.client';
 import { useToggleableBoolean } from '~/common/util/useToggleableBoolean';
 
-import type { ModelDescriptionSchema } from '../../transports/server/server.schemas';
-import { DLLM, DModelSource, DModelSourceId, useModelsStore, useSourceSetup } from '../../store-llms';
+import { DModelSourceId } from '../../store-llms';
+import { useLlmUpdateModels } from '../useLlmUpdateModels';
+import { useSourceSetup } from '../useSourceSetup';
 
-import { isValidOpenAIApiKey, LLMOptionsOpenAI, ModelVendorOpenAI } from './openai.vendor';
+import { isValidOpenAIApiKey, ModelVendorOpenAI } from './openai.vendor';
 
 
 // avoid repeating it all over
@@ -29,7 +29,7 @@ export function OpenAISourceSetup(props: { sourceId: DModelSourceId }) {
 
   // external state
   const { source, sourceHasLLMs, access, updateSetup } =
-    useSourceSetup(props.sourceId, ModelVendorOpenAI.getAccess);
+    useSourceSetup(props.sourceId, ModelVendorOpenAI);
 
   // derived state
   const { oaiKey, oaiOrg, oaiHost, heliKey, moderationCheck } = access;
@@ -40,15 +40,8 @@ export function OpenAISourceSetup(props: { sourceId: DModelSourceId }) {
   const shallFetchSucceed = oaiKey ? keyValid : !needsUserKey;
 
   // fetch models
-  const { isFetching, refetch, isError, error } = apiQuery.llmOpenAI.listModels.useQuery({ access }, {
-    enabled: !sourceHasLLMs && shallFetchSucceed,
-    onSuccess: models => source && useModelsStore.getState().setLLMs(
-      models.models.map(model => modelDescriptionToDLLM(model, source)),
-      props.sourceId,
-    ),
-    staleTime: Infinity,
-  });
-
+  const { isFetching, refetch, isError, error } =
+    useLlmUpdateModels(ModelVendorOpenAI, access, !sourceHasLLMs && shallFetchSucceed, source);
 
   return <>
 
@@ -110,30 +103,3 @@
 
   </>;
 }
-
-
-export function modelDescriptionToDLLM<TSourceSetup>(model: ModelDescriptionSchema, source: DModelSource<TSourceSetup>): DLLM<TSourceSetup, LLMOptionsOpenAI> {
-  const maxOutputTokens = model.maxCompletionTokens || Math.round((model.contextWindow || 4096) / 2);
-  const llmResponseTokens = Math.round(maxOutputTokens / (model.maxCompletionTokens ? 2 : 4));
-  return {
-    id: `${source.id}-${model.id}`,
-
-    label: model.label,
-    created: model.created || 0,
-    updated: model.updated || 0,
-    description: model.description,
-    tags: [], // ['stream', 'chat'],
-    contextTokens: model.contextWindow,
-    maxOutputTokens: maxOutputTokens,
-    hidden: !!model.hidden,
-
-    sId: source.id,
-    _source: source,
-
-    options: {
-      llmRef: model.id,
-      llmTemperature: 0.5,
-      llmResponseTokens: llmResponseTokens,
-    },
-  };
-}
+40
-40
@@ -1,11 +1,12 @@
import { backendCaps } from '~/modules/backend/state-backend';

import { OpenAIIcon } from '~/common/components/icons/OpenAIIcon';
import { apiAsync } from '~/common/util/trpc.client';
import { apiAsync, apiQuery } from '~/common/util/trpc.client';

import type { IModelVendor } from '../IModelVendor';
import type { OpenAIAccessSchema } from '../../transports/server/openai/openai.router';
import type { VChatFunctionIn, VChatMessageIn, VChatMessageOrFunctionCallOut, VChatMessageOut } from '../../transports/chatGenerate';
import type { OpenAIAccessSchema } from '../../server/openai/openai.router';
import type { VChatMessageOrFunctionCallOut } from '../../llm.client';
import { unifiedStreamingClient } from '../unifiedStreamingClient';

import { OpenAILLMOptions } from './OpenAILLMOptions';
import { OpenAISourceSetup } from './OpenAISourceSetup';
@@ -28,7 +29,7 @@ export interface LLMOptionsOpenAI {
llmResponseTokens: number;
}

export const ModelVendorOpenAI: IModelVendor<SourceSetupOpenAI, LLMOptionsOpenAI, OpenAIAccessSchema> = {
export const ModelVendorOpenAI: IModelVendor<SourceSetupOpenAI, OpenAIAccessSchema, LLMOptionsOpenAI> = {
id: 'openai',
name: 'OpenAI',
rank: 10,
@@ -42,7 +43,7 @@ export const ModelVendorOpenAI: IModelVendor<SourceSetupOpenAI, LLMOptionsOpenAI
LLMOptionsComponent: OpenAILLMOptions,

// functions
getAccess: (partialSetup): OpenAIAccessSchema => ({
getTransportAccess: (partialSetup): OpenAIAccessSchema => ({
dialect: 'openai',
oaiKey: '',
oaiOrg: '',
@@ -51,41 +52,40 @@ export const ModelVendorOpenAI: IModelVendor<SourceSetupOpenAI, LLMOptionsOpenAI
moderationCheck: false,
...partialSetup,
}),
callChatGenerate(llm, messages: VChatMessageIn[], maxTokens?: number): Promise<VChatMessageOut> {
const access = this.getAccess(llm._source.setup);
return openAICallChatGenerate(access, llm.options, messages, null, null, maxTokens);

// List Models
rpcUpdateModelsQuery: (access, enabled, onSuccess) => {
return apiQuery.llmOpenAI.listModels.useQuery({ access }, {
enabled: enabled,
onSuccess: onSuccess,
refetchOnWindowFocus: false,
staleTime: Infinity,
});
},
callChatGenerateWF(llm, messages: VChatMessageIn[], functions: VChatFunctionIn[] | null, forceFunctionName: string | null, maxTokens?: number): Promise<VChatMessageOrFunctionCallOut> {
const access = this.getAccess(llm._source.setup);
return openAICallChatGenerate(access, llm.options, messages, functions, forceFunctionName, maxTokens);

// Chat Generate (non-streaming) with Functions
rpcChatGenerateOrThrow: async (access, llmOptions, messages, functions, forceFunctionName, maxTokens) => {
const { llmRef, llmTemperature = 0.5, llmResponseTokens } = llmOptions;
try {
return await apiAsync.llmOpenAI.chatGenerateWithFunctions.mutate({
access,
model: {
id: llmRef!,
temperature: llmTemperature,
maxTokens: maxTokens || llmResponseTokens || 1024,
},
functions: functions ?? undefined,
forceFunctionName: forceFunctionName ?? undefined,
history: messages,
}) as VChatMessageOrFunctionCallOut;
} catch (error: any) {
const errorMessage = error?.message || error?.toString() || 'OpenAI Chat Generate Error';
console.error(`openai.rpcChatGenerateOrThrow: ${errorMessage}`);
throw new Error(errorMessage);
}
},

// Chat Generate (streaming) with Functions
streamingChatGenerateOrThrow: unifiedStreamingClient,

};


/**
* This function either returns the LLM message, or function calls, or throws a descriptive error string
*/
export async function openAICallChatGenerate<TOut = VChatMessageOut | VChatMessageOrFunctionCallOut>(
access: OpenAIAccessSchema, llmOptions: Partial<LLMOptionsOpenAI>, messages: VChatMessageIn[],
functions: VChatFunctionIn[] | null, forceFunctionName: string | null,
maxTokens?: number,
): Promise<TOut> {
const { llmRef, llmTemperature = 0.5, llmResponseTokens } = llmOptions;
try {
return await apiAsync.llmOpenAI.chatGenerateWithFunctions.mutate({
access,
model: {
id: llmRef!,
temperature: llmTemperature,
maxTokens: maxTokens || llmResponseTokens || 1024,
},
functions: functions ?? undefined,
forceFunctionName: forceFunctionName ?? undefined,
history: messages,
}) as TOut;
} catch (error: any) {
const errorMessage = error?.message || error?.toString() || 'OpenAI Chat Generate Error';
console.error(`openAICallChatGenerate: ${errorMessage}`);
throw new Error(errorMessage);
}
}
+40
-21
@@ -1,15 +1,16 @@
import * as React from 'react';

import { Typography } from '@mui/joy';
import { Button, Typography } from '@mui/joy';

import { FormInputKey } from '~/common/components/forms/FormInputKey';
import { InlineError } from '~/common/components/InlineError';
import { Link } from '~/common/components/Link';
import { SetupFormRefetchButton } from '~/common/components/forms/SetupFormRefetchButton';
import { apiQuery } from '~/common/util/trpc.client';
import { getCallbackUrl } from '~/common/app.routes';

import { DModelSourceId, useModelsStore, useSourceSetup } from '../../store-llms';
import { modelDescriptionToDLLM } from '../openai/OpenAISourceSetup';
import { DModelSourceId } from '../../store-llms';
import { useLlmUpdateModels } from '../useLlmUpdateModels';
import { useSourceSetup } from '../useSourceSetup';

import { isValidOpenRouterKey, ModelVendorOpenRouter } from './openrouter.vendor';

@@ -18,7 +19,7 @@ export function OpenRouterSourceSetup(props: { sourceId: DModelSourceId }) {

// external state
const { source, sourceHasLLMs, access, updateSetup } =
useSourceSetup(props.sourceId, ModelVendorOpenRouter.getAccess);
useSourceSetup(props.sourceId, ModelVendorOpenRouter);

// derived state
const { oaiKey } = access;
@@ -29,31 +30,33 @@ export function OpenRouterSourceSetup(props: { sourceId: DModelSourceId }) {
const shallFetchSucceed = oaiKey ? keyValid : !needsUserKey;

// fetch models
const { isFetching, refetch, isError, error } = apiQuery.llmOpenAI.listModels.useQuery({ access }, {
enabled: !sourceHasLLMs && shallFetchSucceed,
onSuccess: models => source && useModelsStore.getState().setLLMs(
models.models.map(model => modelDescriptionToDLLM(model, source)),
props.sourceId,
),
staleTime: Infinity,
});
const { isFetching, refetch, isError, error } =
useLlmUpdateModels(ModelVendorOpenRouter, access, !sourceHasLLMs && shallFetchSucceed, source);


const handleOpenRouterLogin = () => {
// replace the current page with the OAuth page
const callbackUrl = getCallbackUrl('openrouter');
const oauthUrl = 'https://openrouter.ai/auth?callback_url=' + encodeURIComponent(callbackUrl);
window.open(oauthUrl, '_self');
// ...bye / see you soon at the callback location...
};


return <>

{/*<Box sx={{ display: 'flex', gap: 1, alignItems: 'center' }}>*/}
{/*<OpenRouterIcon />*/}
<Typography level='body-sm'>
<Link href='https://openrouter.ai/keys' target='_blank'>OpenRouter</Link> is an independent, premium service
<Link href='https://openrouter.ai/keys' target='_blank'>OpenRouter</Link> is an independent service
granting access to <Link href='https://openrouter.ai/docs#models' target='_blank'>exclusive models</Link> such
as GPT-4 32k, Claude, and more, typically unavailable to the public. <Link
href='https://github.com/enricoros/big-agi/blob/main/docs/config-openrouter.md'>Configuration & documentation</Link>.
as GPT-4 32k, Claude, and more. <Link
href='https://github.com/enricoros/big-agi/blob/main/docs/config-openrouter.md' target='_blank'>
Configuration & documentation</Link>.
</Typography>
{/*</Box>*/}

<FormInputKey
id='openrouter-key' label='OpenRouter API Key'
rightLabel={<>{needsUserKey
? !oaiKey && <Link level='body-sm' href='https://openrouter.ai/keys' target='_blank'>create key</Link>
? !oaiKey && <Link level='body-sm' href='https://openrouter.ai/keys' target='_blank'>your keys</Link>
: '✔️ already set in server'
} {oaiKey && keyValid && <Link level='body-sm' href='https://openrouter.ai/activity' target='_blank'>check usage</Link>}
</>}
@@ -62,7 +65,23 @@ export function OpenRouterSourceSetup(props: { sourceId: DModelSourceId }) {
placeholder='sk-or-...'
/>

<SetupFormRefetchButton refetch={refetch} disabled={!shallFetchSucceed || isFetching} error={isError} />
<Typography level='body-sm'>
🎁 A selection of <Link href='https://openrouter.ai/docs#models' target='_blank'>OpenRouter models</Link> are
made available without charge. You can get an API key by using the Login button below.
</Typography>

<SetupFormRefetchButton
refetch={refetch} disabled={!shallFetchSucceed || isFetching} error={isError}
leftButton={
<Button
color='neutral' variant={(needsUserKey && !keyValid) ? 'solid' : 'outlined'}
onClick={handleOpenRouterLogin}
endDecorator={(needsUserKey && !keyValid) ? '🎁' : undefined}
>
OpenRouter Login
</Button>
}
/>

{isError && <InlineError error={error} />}

+10
-11
@@ -3,10 +3,9 @@ import { backendCaps } from '~/modules/backend/state-backend';
import { OpenRouterIcon } from '~/common/components/icons/OpenRouterIcon';

import type { IModelVendor } from '../IModelVendor';
import type { OpenAIAccessSchema } from '../../transports/server/openai/openai.router';
import type { VChatFunctionIn, VChatMessageIn, VChatMessageOrFunctionCallOut, VChatMessageOut } from '../../transports/chatGenerate';
import type { OpenAIAccessSchema } from '../../server/openai/openai.router';

import { LLMOptionsOpenAI, openAICallChatGenerate } from '../openai/openai.vendor';
import { LLMOptionsOpenAI, ModelVendorOpenAI } from '../openai/openai.vendor';
import { OpenAILLMOptions } from '../openai/OpenAILLMOptions';

import { OpenRouterSourceSetup } from './OpenRouterSourceSetup';
@@ -32,12 +31,13 @@ export interface SourceSetupOpenRouter {
* [x] decide whether to do UI work to improve the appearance - prioritized models
* [x] works!
*/
export const ModelVendorOpenRouter: IModelVendor<SourceSetupOpenRouter, LLMOptionsOpenAI, OpenAIAccessSchema> = {
export const ModelVendorOpenRouter: IModelVendor<SourceSetupOpenRouter, OpenAIAccessSchema, LLMOptionsOpenAI> = {
id: 'openrouter',
name: 'OpenRouter',
rank: 12,
location: 'cloud',
instanceLimit: 1,
hasFreeModels: true,
hasBackendCap: () => backendCaps().hasLlmOpenRouter,

// components
@@ -50,7 +50,7 @@ export const ModelVendorOpenRouter: IModelVendor<SourceSetupOpenRouter, LLMOptio
oaiHost: 'https://openrouter.ai/api',
oaiKey: '',
}),
getAccess: (partialSetup): OpenAIAccessSchema => ({
getTransportAccess: (partialSetup): OpenAIAccessSchema => ({
dialect: 'openrouter',
oaiKey: partialSetup?.oaiKey || '',
oaiOrg: '',
@@ -58,10 +58,9 @@ export const ModelVendorOpenRouter: IModelVendor<SourceSetupOpenRouter, LLMOptio
heliKey: '',
moderationCheck: false,
}),
callChatGenerate(llm, messages: VChatMessageIn[], maxTokens?: number): Promise<VChatMessageOut> {
return openAICallChatGenerate(this.getAccess(llm._source.setup), llm.options, messages, null, null, maxTokens);
},
callChatGenerateWF(llm, messages: VChatMessageIn[], functions: VChatFunctionIn[] | null, forceFunctionName: string | null, maxTokens?: number): Promise<VChatMessageOrFunctionCallOut> {
return openAICallChatGenerate(this.getAccess(llm._source.setup), llm.options, messages, functions, forceFunctionName, maxTokens);
},

// OpenAI transport ('openrouter' dialect in 'access')
rpcUpdateModelsQuery: ModelVendorOpenAI.rpcUpdateModelsQuery,
rpcChatGenerateOrThrow: ModelVendorOpenAI.rpcChatGenerateOrThrow,
streamingChatGenerateOrThrow: ModelVendorOpenAI.streamingChatGenerateOrThrow,
};
+13
-27
@@ -1,11 +1,10 @@
import { apiAsync } from '~/common/util/trpc.client';

import type { DLLM, DLLMId } from '../store-llms';
import { findVendorForLlmOrThrow } from '../vendors/vendor.registry';
import type { ChatStreamingFirstOutputPacketSchema, ChatStreamingInputSchema } from '../server/llm.server.streaming';
import type { DLLMId } from '../store-llms';
import type { VChatFunctionIn, VChatMessageIn } from '../llm.client';

import type { ChatStreamFirstPacketSchema, ChatStreamInputSchema } from './server/openai/openai.streaming';
import type { OpenAIWire } from './server/openai/openai.wiretypes';
import type { VChatMessageIn } from './chatGenerate';
import type { OpenAIWire } from '../server/openai/openai.wiretypes';


/**
@@ -15,27 +14,14 @@ import type { VChatMessageIn } from './chatGenerate';
* Vendor-specific implementation is on our server backend (API) code. This function tries to be
* as generic as possible.
*
* @param llmId LLM to use
* @param messages the history of messages to send to the API endpoint
* @param abortSignal used to initiate a client-side abort of the fetch request to the API endpoint
* @param onUpdate callback when a piece of a message (text, model name, typing..) is received
* NOTE: onUpdate is callback when a piece of a message (text, model name, typing..) is received
*/
export async function streamChat(
export async function unifiedStreamingClient<TSourceSetup = unknown, TLLMOptions = unknown>(
access: ChatStreamingInputSchema['access'],
llmId: DLLMId,
llmOptions: TLLMOptions,
messages: VChatMessageIn[],
abortSignal: AbortSignal,
onUpdate: (update: Partial<{ text: string, typing: boolean, originLLM: string }>, done: boolean) => void,
): Promise<void> {
const { llm, vendor } = findVendorForLlmOrThrow(llmId);
const access = vendor.getAccess(llm._source.setup) as ChatStreamInputSchema['access'];
return await vendorStreamChat(access, llm, messages, abortSignal, onUpdate);
}


async function vendorStreamChat<TSourceSetup = unknown, TLLMOptions = unknown>(
access: ChatStreamInputSchema['access'],
llm: DLLM<TSourceSetup, TLLMOptions>,
messages: VChatMessageIn[],
functions: VChatFunctionIn[] | null, forceFunctionName: string | null,
abortSignal: AbortSignal,
onUpdate: (update: Partial<{ text: string, typing: boolean, originLLM: string }>, done: boolean) => void,
) {
@@ -79,12 +65,12 @@ async function vendorStreamChat<TSourceSetup = unknown, TLLMOptions = unknown>(
}

// model params (llm)
const { llmRef, llmTemperature, llmResponseTokens } = (llm.options as any) || {};
const { llmRef, llmTemperature, llmResponseTokens } = (llmOptions as any) || {};
if (!llmRef || llmTemperature === undefined || llmResponseTokens === undefined)
throw new Error(`Error in configuration for model ${llm.id}: ${JSON.stringify(llm.options)}`);
throw new Error(`Error in configuration for model ${llmId}: ${JSON.stringify(llmOptions)}`);

// prepare the input, similarly to the tRPC openAI.chatGenerate
const input: ChatStreamInputSchema = {
const input: ChatStreamingInputSchema = {
access,
model: {
id: llmRef,
@@ -131,7 +117,7 @@ async function vendorStreamChat<TSourceSetup = unknown, TLLMOptions = unknown>(
incrementalText = incrementalText.substring(endOfJson + 1);
parsedFirstPacket = true;
try {
const parsed: ChatStreamFirstPacketSchema = JSON.parse(json);
const parsed: ChatStreamingFirstOutputPacketSchema = JSON.parse(json);
onUpdate({ originLLM: parsed.model }, false);
} catch (e) {
// error parsing JSON, ignore
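The first-packet handling changed above relies on the stream prefixing one small JSON object (carrying the model name) before the raw text begins. A standalone sketch of splitting off that prefix, with a hypothetical helper name not present in the codebase:

```typescript
// Hypothetical sketch: split a leading JSON header packet from a streamed buffer,
// in the spirit of the first-packet { model } handling shown in the diff above
function splitFirstPacket(buffer: string): { model?: string; rest: string } {
  const endOfJson = buffer.indexOf('}');
  if (!buffer.startsWith('{') || endOfJson < 0)
    return { rest: buffer }; // no header packet present: everything is text
  try {
    const parsed = JSON.parse(buffer.substring(0, endOfJson + 1));
    return { model: parsed.model, rest: buffer.substring(endOfJson + 1) };
  } catch {
    return { rest: buffer }; // malformed header: treat the whole buffer as text
  }
}

console.log(splitFirstPacket('{"model":"gpt-4"}Hello')); // { model: 'gpt-4', rest: 'Hello' }
```

Note this only works because the real header is a flat object; a nested object would need a proper incremental parser rather than `indexOf('}')`.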
+47
@@ -0,0 +1,47 @@
import type { IModelVendor } from './IModelVendor';
import type { ModelDescriptionSchema } from '../server/llm.server.types';
import { DLLM, DModelSource, useModelsStore } from '../store-llms';


/**
* Hook that fetches the list of models from the vendor and updates the store,
* while returning the fetch state.
*/
export function useLlmUpdateModels<TSourceSetup, TAccess, TLLMOptions>(vendor: IModelVendor<TSourceSetup, TAccess, TLLMOptions>, access: TAccess, enabled: boolean, source: DModelSource<TSourceSetup>) {
return vendor.rpcUpdateModelsQuery(access, enabled, data => source && updateModelsFn(data, source));
}


function updateModelsFn<TSourceSetup>(data: { models: ModelDescriptionSchema[] }, source: DModelSource<TSourceSetup>) {
useModelsStore.getState().setLLMs(
data.models.map(model => modelDescriptionToDLLMOpenAIOptions(model, source)),
source.id,
);
}

function modelDescriptionToDLLMOpenAIOptions<TSourceSetup, TLLMOptions>(model: ModelDescriptionSchema, source: DModelSource<TSourceSetup>): DLLM<TSourceSetup, TLLMOptions> {
const maxOutputTokens = model.maxCompletionTokens || Math.round((model.contextWindow || 4096) / 2);
const llmResponseTokens = Math.round(maxOutputTokens / (model.maxCompletionTokens ? 2 : 4));
return {
id: `${source.id}-${model.id}`,

label: model.label,
created: model.created || 0,
updated: model.updated || 0,
description: model.description,
tags: [], // ['stream', 'chat'],
contextTokens: model.contextWindow,
maxOutputTokens: maxOutputTokens,
hidden: !!model.hidden,

sId: source.id,
_source: source,

options: {
llmRef: model.id,
// @ts-ignore FIXME: large assumption that this is LLMOptionsOpenAI object
llmTemperature: 0.5,
llmResponseTokens: llmResponseTokens,
},
};
}
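The token defaults in `modelDescriptionToDLLMOpenAIOptions` above follow a simple rule: without a vendor-declared `maxCompletionTokens`, half the context window becomes the output ceiling, and a quarter of that ceiling becomes the default response budget. A standalone sketch of just that arithmetic (the function name here is illustrative, not part of the codebase):

```typescript
// Illustrative sketch of the token-budget defaulting used above (hypothetical helper name)
function defaultTokenBudgets(contextWindow?: number, maxCompletionTokens?: number) {
  // output ceiling: the declared completion limit, else half the context window (4096 fallback)
  const maxOutputTokens = maxCompletionTokens || Math.round((contextWindow || 4096) / 2);
  // response budget: half of a declared limit, else a quarter of the derived ceiling
  const llmResponseTokens = Math.round(maxOutputTokens / (maxCompletionTokens ? 2 : 4));
  return { maxOutputTokens, llmResponseTokens };
}

console.log(defaultTokenBudgets(8192));       // { maxOutputTokens: 4096, llmResponseTokens: 1024 }
console.log(defaultTokenBudgets(8192, 2048)); // { maxOutputTokens: 2048, llmResponseTokens: 1024 }
```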
+35
@@ -0,0 +1,35 @@
import { shallow } from 'zustand/shallow';

import type { IModelVendor } from './IModelVendor';
import { DModelSource, DModelSourceId, useModelsStore } from '../store-llms';


/**
* Source-specific read/write - great time saver
*/
export function useSourceSetup<TSourceSetup, TAccess, TLLMOptions>(sourceId: DModelSourceId, vendor: IModelVendor<TSourceSetup, TAccess, TLLMOptions>) {

// invalidates only when the setup changes
const { updateSourceSetup, ...rest } = useModelsStore(state => {

// find the source (or null)
const source: DModelSource<TSourceSetup> | null = state.sources.find(source => source.id === sourceId) as DModelSource<TSourceSetup> ?? null;

// (safe) source-derived properties
const sourceSetupValid = (source?.setup && vendor?.validateSetup) ? vendor.validateSetup(source.setup as TSourceSetup) : false;
const sourceLLMs = source ? state.llms.filter(llm => llm._source === source) : [];
const access = vendor.getTransportAccess(source?.setup);

return {
source,
access,
sourceHasLLMs: !!sourceLLMs.length,
sourceSetupValid,
updateSourceSetup: state.updateSourceSetup,
};
}, shallow);

// convenience function for this source
const updateSetup = (partialSetup: Partial<TSourceSetup>) => updateSourceSetup<TSourceSetup>(sourceId, partialSetup);
return { ...rest, updateSetup };
}
+25
-8
@@ -1,24 +1,39 @@
import { ModelVendorAnthropic } from './anthropic/anthropic.vendor';
import { ModelVendorAzure } from './azure/azure.vendor';
import { ModelVendorGemini } from './gemini/gemini.vendor';
import { ModelVendorLocalAI } from './localai/localai.vendor';
import { ModelVendorMistral } from './mistral/mistral.vendor';
import { ModelVendorOllama } from './ollama/ollama.vendor';
import { ModelVendorOoobabooga } from './oobabooga/oobabooga.vendor';
import { ModelVendorOpenAI } from './openai/openai.vendor';
import { ModelVendorOpenRouter } from './openrouter/openrouter.vendor';

import type { IModelVendor } from './IModelVendor';
import { DLLMId, DModelSource, DModelSourceId, findLLMOrThrow } from '../store-llms';
import { IModelVendor, ModelVendorId } from './IModelVendor';

/** Vendor Instances Registry **/
export type ModelVendorId =
| 'anthropic'
| 'azure'
| 'googleai'
| 'localai'
| 'mistral'
| 'ollama'
| 'oobabooga'
| 'openai'
| 'openrouter';

/** Global: Vendor Instances Registry **/
const MODEL_VENDOR_REGISTRY: Record<ModelVendorId, IModelVendor> = {
anthropic: ModelVendorAnthropic,
azure: ModelVendorAzure,
googleai: ModelVendorGemini,
localai: ModelVendorLocalAI,
mistral: ModelVendorMistral,
ollama: ModelVendorOllama,
oobabooga: ModelVendorOoobabooga,
openai: ModelVendorOpenAI,
openrouter: ModelVendorOpenRouter,
};
} as Record<string, IModelVendor>;

const MODEL_VENDOR_DEFAULT: ModelVendorId = 'openai';

@@ -29,13 +44,15 @@ export function findAllVendors(): IModelVendor[] {
return modelVendors;
}

export function findVendorById(vendorId?: ModelVendorId): IModelVendor | null {
return vendorId ? (MODEL_VENDOR_REGISTRY[vendorId] ?? null) : null;
export function findVendorById<TSourceSetup = unknown, TAccess = unknown, TLLMOptions = unknown>(
vendorId?: ModelVendorId,
): IModelVendor<TSourceSetup, TAccess, TLLMOptions> | null {
return vendorId ? (MODEL_VENDOR_REGISTRY[vendorId] as IModelVendor<TSourceSetup, TAccess, TLLMOptions>) ?? null : null;
}

export function findVendorForLlmOrThrow(llmId: DLLMId) {
const llm = findLLMOrThrow(llmId);
const vendor = findVendorById(llm?._source.vId);
export function findVendorForLlmOrThrow<TSourceSetup = unknown, TAccess = unknown, TLLMOptions = unknown>(llmId: DLLMId) {
const llm = findLLMOrThrow<TSourceSetup, TLLMOptions>(llmId);
const vendor = findVendorById<TSourceSetup, TAccess, TLLMOptions>(llm?._source.vId);
if (!vendor) throw new Error(`callChat: Vendor not found for LLM ${llmId}`);
return { llm, vendor };
}
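The registry above is a string-keyed map of vendor singletons with null-safe lookup. The lookup pattern can be sketched in isolation (the types and vendor entries here are placeholders, not the real `IModelVendor` objects):

```typescript
// Minimal sketch of the vendor-registry lookup pattern (placeholder types and data)
type VendorId = 'openai' | 'openrouter';

interface Vendor { id: VendorId; rank: number; }

const REGISTRY: Record<VendorId, Vendor> = {
  openai: { id: 'openai', rank: 10 },
  openrouter: { id: 'openrouter', rank: 12 },
};

// mirrors findVendorById: a missing or unknown id yields null rather than throwing
function lookupVendor(vendorId?: VendorId): Vendor | null {
  return vendorId ? (REGISTRY[vendorId] ?? null) : null;
}

console.log(lookupVendor('openrouter')?.rank); // 12
console.log(lookupVendor());                   // null
```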
@@ -3,9 +3,10 @@ import { createTRPCRouter } from './trpc.server';
import { backendRouter } from '~/modules/backend/backend.router';
import { elevenlabsRouter } from '~/modules/elevenlabs/elevenlabs.router';
import { googleSearchRouter } from '~/modules/google/search.router';
import { llmAnthropicRouter } from '~/modules/llms/transports/server/anthropic/anthropic.router';
import { llmOllamaRouter } from '~/modules/llms/transports/server/ollama/ollama.router';
import { llmOpenAIRouter } from '~/modules/llms/transports/server/openai/openai.router';
import { llmAnthropicRouter } from '~/modules/llms/server/anthropic/anthropic.router';
import { llmGeminiRouter } from '~/modules/llms/server/gemini/gemini.router';
import { llmOllamaRouter } from '~/modules/llms/server/ollama/ollama.router';
import { llmOpenAIRouter } from '~/modules/llms/server/openai/openai.router';
import { prodiaRouter } from '~/modules/prodia/prodia.router';
import { ytPersonaRouter } from '../../apps/personas/ytpersona.router';

@@ -17,6 +18,7 @@ export const appRouterEdge = createTRPCRouter({
elevenlabs: elevenlabsRouter,
googleSearch: googleSearchRouter,
llmAnthropic: llmAnthropicRouter,
llmGemini: llmGeminiRouter,
llmOllama: llmOllamaRouter,
llmOpenAI: llmOpenAIRouter,
prodia: prodiaRouter,
+11
-2
@@ -5,8 +5,8 @@ export const env = createEnv({
server: {

// Backend Postgres, for optional storage via Prisma
POSTGRES_PRISMA_URL: z.string().url().optional(),
POSTGRES_URL_NON_POOLING: z.string().url().optional(),
POSTGRES_PRISMA_URL: z.string().optional(),
POSTGRES_URL_NON_POOLING: z.string().optional(),

// LLM: OpenAI
OPENAI_API_KEY: z.string().optional(),
@@ -21,6 +21,12 @@ export const env = createEnv({
ANTHROPIC_API_KEY: z.string().optional(),
ANTHROPIC_API_HOST: z.string().url().optional(),

// LLM: Google AI's Gemini
GEMINI_API_KEY: z.string().optional(),

// LLM: Mistral
MISTRAL_API_KEY: z.string().optional(),

// LLM: Ollama
OLLAMA_API_HOST: z.string().url().optional(),

@@ -59,6 +65,9 @@ export const env = createEnv({
throw new Error('Invalid environment variable');
},

// matches user expectations - see https://github.com/enricoros/big-AGI/issues/279
emptyStringAsUndefined: true,

// with Next.js >= 13.4.4 we'd only need to destructure client variables
experimental__runtimeEnv: {},
});