Mirror of https://github.com/enricoros/big-AGI.git (synced 2026-05-10 21:50:14 -07:00)

Compare commits · 56 commits
| SHA1 |
|---|
| 0fc83cf6f5 |
| 2949feccd5 |
| d6f1c2da81 |
| fabb433fde |
| b57445eb14 |
| 5f8f4aba78 |
| d693cdaeba |
| 39fbcfd97b |
| 7694bc3d52 |
| 7f21b2ac3d |
| fdb66da1a7 |
| 6b62a6733b |
| 5d62056807 |
| efff7126af |
| 45046c70ed |
| 7b5b852793 |
| 9952b757b8 |
| b08ecc9012 |
| bc5a38fa89 |
| bee49a4b1c |
| 0ece1ce58c |
| fd897b55b2 |
| dd41a402d0 |
| 3f9defd18c |
| 49c77f5a10 |
| 6b2bfa6060 |
| 8e3f247bfb |
| 201e3a7252 |
| 044ed4df79 |
| 0df7297cca |
| 453a3e5751 |
| 34c1c425b9 |
| e0a010189f |
| 7a07f10ed1 |
| 33cb2b84b2 |
| 3adec85e1f |
| 18cfe5e296 |
| 566ba366b4 |
| 7ed653b315 |
| cb333c33d7 |
| 22ba37074b |
| 84d7b7644a |
| 71445dafc8 |
| 66a5ad7f00 |
| 09f80adfaa |
| 9febd97065 |
| 5219f9928d |
| aec9f4665f |
| db48465204 |
| c2c858730a |
| 402bde9a81 |
| ba1c0ba0d9 |
| 084d77cd78 |
| 30c17a9b73 |
| 2442463da3 |
| 84a3e8cfdb |
@@ -65,7 +65,11 @@ I need the following from you:

### GitHub release

-Now paste the former release (or 1.5.0 which was accurate and great), including the new contributors and
+```markdown
+Please create the 1.2.3 Release Notes for GitHub. The following were the Release Notes for 1.1.0. Use a truthful and honest tone, understanding that people's time and attention span is short. Today is 2023-12-20.
+```
+
+Now paste-attachment the former release notes (or 1.5.0 which was accurate and great), including the new contributors and
some stats (# of commits, etc.), and roll it for the new release.

### Discord announcement
@@ -1,7 +1,7 @@

# BIG-AGI 🧠✨

Welcome to big-AGI 👋, the GPT application for professionals that need function, form,
-simplicity, and speed. Powered by the latest models from 7 vendors and
+simplicity, and speed. Powered by the latest models from 8 vendors and
open-source model servers, `big-AGI` offers best-in-class Voice and Chat with AI Personas,
visualizations, coding, drawing, calling, and quite more -- all in a polished UX.

@@ -11,7 +11,7 @@ Pros use big-AGI. 🚀 Developers love big-AGI. 🤖

Or fork & run on Vercel

-[](https://vercel.com/new/clone?repository-url=https%3A%2F%2Fgithub.com%2Fenricoros%2Fbig-agi&env=OPENAI_API_KEY,OPENAI_API_HOST&envDescription=OpenAI%20KEY%20for%20your%20deployment.%20Set%20HOST%20only%20if%20non-default.)
+[](https://vercel.com/new/clone?repository-url=https%3A%2F%2Fgithub.com%2Fenricoros%2Fbig-agi&env=OPENAI_API_KEY&envDescription=Backend%20API%20keys%2C%20optional%20and%20may%20be%20overridden%20by%20the%20UI.&envLink=https%3A%2F%2Fgithub.com%2Fenricoros%2Fbig-AGI%2Fblob%2Fmain%2Fdocs%2Fenvironment-variables.md&project-name=big-agi)

## 👉 [roadmap](https://github.com/users/enricoros/projects/4/views/2)

@@ -21,7 +21,19 @@ shows the current developments and future ideas.

- Got a suggestion? [_Add your roadmap ideas_](https://github.com/enricoros/big-agi/issues/new?&template=roadmap-request.md)
- Want to contribute? [_Pick up a task!_](https://github.com/users/enricoros/projects/4/views/4) - _easy_ to _pro_

-### What's New in 1.7.3 · Dec 13, 2023 · Attachment Theory 🌟
+### What's New in 1.8.0 · Dec 20, 2023 · To The Moon And Back · 🚀🌕🔙
+
+- **Google Gemini Support**: Use the newest Google models. [#275](https://github.com/enricoros/big-agi/issues/275)
+- **Mistral Platform**: Mixtral and future models support. [#273](https://github.com/enricoros/big-agi/issues/273)
+- **Diagram Instructions**. Thanks to @joriskalz! [#280](https://github.com/enricoros/big-agi/pull/280)
+- Ollama Chats: Enhanced chatting experience. [#270](https://github.com/enricoros/big-agi/issues/270)
+- Mac Shortcuts Fix: Improved UX on Mac
+- **Single-Tab Mode**: Data integrity with single window. [#268](https://github.com/enricoros/big-agi/issues/268)
+- **Updated Models**: Latest Ollama (v0.1.17) and OpenRouter models
+- Official Downloads: Easy access to the latest big-AGI on [big-AGI.com](https://big-agi.com)
+- For developers: [troubleshot networking](https://github.com/enricoros/big-AGI/issues/276#issuecomment-1858591483), fixed Vercel deployment, cleaned up the LLMs/Streaming framework

### What's New in 1.7.0 · Dec 11, 2023

- **Attachments System Overhaul**: Drag, paste, link, snap, text, images, PDFs and more. [#251](https://github.com/enricoros/big-agi/issues/251)
- **Desktop Webcam Capture**: Image capture now available as Labs feature. [#253](https://github.com/enricoros/big-agi/issues/253)

@@ -31,9 +43,6 @@ shows the current developments and future ideas.

- Optimized Voice Input and Performance
- Latest Ollama and Oobabooga models
- For developers: **Password Protection**: HTTP Basic Auth. [Learn How](https://github.com/enricoros/big-agi/blob/main/docs/deploy-authentication.md)
-- [1.7.1]: Improved Ollama chats. [#270](https://github.com/enricoros/big-agi/issues/270)
-- [1.7.2]: OpenRouter login & free models 🎁
-- [1.7.3]: Mistral Platform support. [#273](https://github.com/enricoros/big-agi/issues/273)

### What's New in 1.6.0 - Nov 28, 2023

@@ -148,7 +157,7 @@ Please refer to the [Cloudflare deployment documentation](docs/deploy-cloudflare

Create your GitHub fork, create a Vercel project over that fork, and deploy it. Or press the button below for convenience.

-[](https://vercel.com/new/clone?repository-url=https%3A%2F%2Fgithub.com%2Fenricoros%2Fbig-agi&env=OPENAI_API_KEY,OPENAI_API_HOST&envDescription=OpenAI%20KEY%20for%20your%20deployment.%20Set%20HOST%20only%20if%20non-default.)
+[](https://vercel.com/new/clone?repository-url=https%3A%2F%2Fgithub.com%2Fenricoros%2Fbig-agi&env=OPENAI_API_KEY&envDescription=Backend%20API%20keys%2C%20optional%20and%20may%20be%20overridden%20by%20the%20UI.&envLink=https%3A%2F%2Fgithub.com%2Fenricoros%2Fbig-AGI%2Fblob%2Fmain%2Fdocs%2Fenvironment-variables.md&project-name=big-agi)

## Integrations:
@@ -1,2 +1,2 @@
export const runtime = 'edge';
-export { openaiStreamingRelayHandler as POST } from '~/modules/llms/transports/server/openai/openai.streaming';
+export { llmStreamingRelayHandler as POST } from '~/modules/llms/server/llm.server.streaming';

(+1 -1)

@@ -6,7 +6,7 @@ version: '3.9'

services:
  big-agi:
-    image: ghcr.io/enricoros/big-agi:main
+    image: ghcr.io/enricoros/big-agi:latest
    ports:
      - "3000:3000"
    env_file:

(+15 -6)
@@ -5,12 +5,24 @@ by release.

- For the live roadmap, please see [the GitHub project](https://github.com/users/enricoros/projects/4/views/2)

-### 1.8.0 - Dec 2023
+### 1.9.0 - Dec 2023

- work in progress: [big-AGI open roadmap](https://github.com/users/enricoros/projects/4/views/2), [help here](https://github.com/users/enricoros/projects/4/views/4)
-- milestone: [1.8.0](https://github.com/enricoros/big-agi/milestone/8)
+- milestone: [1.9.0](https://github.com/enricoros/big-agi/milestone/9)

-### What's New in 1.7.3 · Dec 13, 2023 · Attachment Theory 🌟
+### What's New in 1.8.0 · Dec 20, 2023 · To The Moon And Back · 🚀🌕🔙
+
+- **Google Gemini Support**: Use the newest Google models. [#275](https://github.com/enricoros/big-agi/issues/275)
+- **Mistral Platform**: Mixtral and future models support. [#273](https://github.com/enricoros/big-agi/issues/273)
+- **Diagram Instructions**. Thanks to @joriskalz! [#280](https://github.com/enricoros/big-agi/pull/280)
+- Ollama Chats: Enhanced chatting experience. [#270](https://github.com/enricoros/big-agi/issues/270)
+- Mac Shortcuts Fix: Improved UX on Mac
+- **Single-Tab Mode**: Data integrity with single window. [#268](https://github.com/enricoros/big-agi/issues/268)
+- **Updated Models**: Latest Ollama (v0.1.17) and OpenRouter models
+- Official Downloads: Easy access to the latest big-AGI on [big-AGI.com](https://big-agi.com)
+- For developers: [troubleshot networking](https://github.com/enricoros/big-AGI/issues/276#issuecomment-1858591483), fixed Vercel deployment, cleaned up the LLMs/Streaming framework

### What's New in 1.7.0 · Dec 11, 2023 · Attachment Theory

- **Attachments System Overhaul**: Drag, paste, link, snap, text, images, PDFs and more. [#251](https://github.com/enricoros/big-agi/issues/251)
- **Desktop Webcam Capture**: Image capture now available as Labs feature. [#253](https://github.com/enricoros/big-agi/issues/253)

@@ -20,9 +32,6 @@ by release.

- Optimized Voice Input and Performance
- Latest Ollama and Oobabooga models
- For developers: **Password Protection**: HTTP Basic Auth. [Learn How](https://github.com/enricoros/big-agi/blob/main/docs/deploy-authentication.md)
-- [1.7.1]: Improved Ollama chats. [#270](https://github.com/enricoros/big-agi/issues/270)
-- [1.7.2]: OpenRouter login & free models 🎁
-- [1.7.3]: Mistral Platform support. [#273](https://github.com/enricoros/big-agi/issues/273)

### What's New in 1.6.0 - Nov 28, 2023 · Surf's Up

@@ -30,5 +30,5 @@ For instance with [Use luna-ai-llama2 with docker compose](https://localai.io/ba

> NOTE: LocalAI does not list details about the models. Every model is assumed to be
> capable of chatting, and with a context window of 4096 tokens.
-> Please update the [src/modules/llms/transports/server/openai/models.data.ts](../src/modules/llms/transports/server/openai/models.data.ts)
+> Please update the [src/modules/llms/transports/server/openai/models.data.ts](../src/modules/llms/server/openai/models.data.ts)
> file with the mapping information between LocalAI model IDs and names/descriptions/tokens, etc.
(+24 -12)

@@ -5,13 +5,15 @@ This guide helps you connect [Ollama](https://ollama.ai) [models](https://ollama
experience. The integration brings the popular big-AGI features to Ollama, including: voice chats,
editing tools, models switching, personas, and more.

-_Last updated Dec 11, 2023_
+_Last updated Dec 16, 2023_



## Quick Integration Guide

1. **Ensure Ollama API Server is Running**: Follow the official instructions to get Ollama up and running on your machine
+   - For detailed instructions on setting up the Ollama API server, please refer to the
+     [Ollama download page](https://ollama.ai/download) and [instructions for linux](https://github.com/jmorganca/ollama/blob/main/docs/linux.md).
2. **Add Ollama as a Model Source**: In `big-AGI`, navigate to the **Models** section, select **Add a model source**, and choose **Ollama**
3. **Enter Ollama Host URL**: Provide the Ollama Host URL where the API server is accessible (e.g., `http://localhost:11434`)
4. **Refresh Model List**: Once connected, refresh the list of available models to include the Ollama models

@@ -20,21 +22,29 @@ _Last updated Dec 11, 2023_

you'll have to press the 'Pull' button again, until a green message appears.
5. **Chat with Ollama models**: select an Ollama model and begin chatting with AI personas
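Note for step 5: at least one model must already be present on the Ollama server for it to appear in the list. A minimal sketch of preparing a server from the shell (assuming a local Ollama install; `llama2` is simply an example tag from the Ollama library):

```bash
# start the Ollama API server, if it is not already running (default port 11434)
ollama serve &

# download a model; it appears in big-AGI after refreshing the model list
ollama pull llama2

# sanity check: prints "Ollama is running"
curl http://localhost:11434
```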
-### Ollama: installation and Setup
+**Visual Configuration Guide**:

-For detailed instructions on setting up the Ollama API server, please refer to the
-[Ollama download page](https://ollama.ai/download) and [instructions for linux](https://github.com/jmorganca/ollama/blob/main/docs/linux.md).
+* After adding the `Ollama` model vendor, entering the IP address of an Ollama server, and refreshing models:<br/>
+  <img src="pixels/config-ollama-1-models.png" alt="config-local-ollama-1-models.png" width="320">

-### Visual Guide
+* The `Ollama` admin panel, with the `Pull` button highlighted, after pulling the "Yi" model:<br/>
+  <img src="pixels/config-ollama-2-admin-pull.png" alt="config-local-ollama-2-admin-pull.png" width="320">

-* After adding the `Ollama` model vendor, entering the IP address of an Ollama server, and refreshing models:
-  <img src="pixels/config-ollama-1-models.png" alt="config-local-ollama-1-models.png" style="max-width: 320px;">
+* You can now switch model/persona dynamically and text/voice chat with the models:<br/>
+  <img src="pixels/config-ollama-3-chat.png" alt="config-local-ollama-3-chat.png" width="320">

-* The `Ollama` admin panel, with the `Pull` button highlighted, after pulling the "Yi" model:
-  <img src="pixels/config-ollama-2-admin-pull.png" alt="config-local-ollama-2-admin-pull.png" style="max-width: 320px;">
+<br/>

-* You can now switch model/persona dynamically and text/voice chat with the models:
-  <img src="pixels/config-ollama-3-chat.png" alt="config-local-ollama-3-chat.png" style="max-width: 320px;">
+### ⚠️ Network Troubleshooting

+If you get errors about the server having trouble connecting with Ollama, please see
+[this message](https://github.com/enricoros/big-AGI/issues/276#issuecomment-1858591483) on Issue #276.

+And in brief, make sure the Ollama endpoint is accessible from the servers where you run big-AGI (which could
+be localhost or cloud servers).
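A quick way to confirm that accessibility is to query the Ollama API from the machine that serves big-AGI (a sketch; replace `localhost` with the Ollama host's address if it runs elsewhere):

```bash
# lists the models the endpoint exposes; if this fails, big-AGI's server
# cannot reach Ollama either (binding, firewall, or wrong host/port)
curl http://localhost:11434/api/tags
```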


<br/>

### Advanced: Model parameters

@@ -73,6 +83,8 @@ Then, edit the nginx configuration file `/etc/nginx/sites-enabled/default` and a

Reach out to our community if you need help with this.

+<br/>
+
### Community and Support

Join our community to share your experiences, get help, and discuss best practices:

@@ -83,4 +95,4 @@ Join our community to share your experiences, get help, and discuss best practic

---

`big-AGI` is committed to providing a powerful, intuitive, and privacy-respecting AI experience.
-We are excited for you to explore the possibilities with Ollama models. Happy creating!
+We are excited for you to explore the possibilities with Ollama models. Happy creating!
(+37 -20)

@@ -21,33 +21,23 @@ Docker ensures faster development cycles, easier collaboration, and seamless env
```
4. Browse to [http://localhost:3000](http://localhost:3000)

-## Documentation
+<br/>

-The big-AGI repository includes a Dockerfile and a GitHub Actions workflow for building and publishing a
-Docker image of the application.
+## Run Official Containers 📦

-### Dockerfile
+`big-AGI` is pre-built from source code and published as a Docker image on the GitHub Container Registry (ghcr).
+The build process is transparent, and happens via GitHub Actions, as described in the
+[`.github/workflows/docker-image.yml`](../.github/workflows/docker-image.yml) file.

-The [`Dockerfile`](../Dockerfile) describes how to create a Docker image. It establishes a Node.js environment,
-installs dependencies, and creates a production-ready version of the application as a local container.
+### Official Images: [ghcr.io/enricoros/big-agi](https://github.com/enricoros/big-agi/pkgs/container/big-agi)

-### Official container images
+#### Run using *docker* 🚀

-The [`.github/workflows/docker-image.yml`](../.github/workflows/docker-image.yml) file automates the
-building and publishing of the Docker images to the GitHub Container Registry (ghcr) when changes are
-pushed to the `main` branch.

-Official pre-built containers: [ghcr.io/enricoros/big-agi](https://github.com/enricoros/big-agi/pkgs/container/big-agi)

-Run official pre-built containers:

```bash
-docker run -d -p 3000:3000 ghcr.io/enricoros/big-agi
+docker run -d -p 3000:3000 ghcr.io/enricoros/big-agi:latest
```

-### Run official containers

-In addition, the repository also includes a `docker-compose.yaml` file, configured to run the pre-built
-'ghcr image'. This file is used to define the `big-agi` service, the ports to expose, and the command to run.
+#### Run using *docker-compose* 🚀

If you have Docker Compose installed, you can run the Docker container with `docker-compose up`
to pull the Docker image (if it hasn't been pulled already) and start a Docker container. If you want to

@@ -57,4 +47,31 @@ update the image to the latest version, you can run `docker-compose pull` before

```bash
docker-compose up -d
```
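Spelled out, the update cycle described above is just two commands, run from the directory containing the repository's `docker-compose.yaml`:

```bash
docker-compose pull   # fetch the newest ghcr.io/enricoros/big-agi:latest image
docker-compose up -d  # recreate the container on the updated image
```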

-Leverage Docker's capabilities for a reliable and efficient big-AGI deployment.
+### Make Local Services Visible to Docker 🌐

+To make local services running on your host machine accessible to a Docker container, such as a
+[Browseless](./config-browse.md) service or a local API, you can follow this simplified guide:

+| Operating System | Steps to Make Local Services Visible to Docker |
+|:---|:---|
+| Windows and macOS | Use the special DNS name `host.docker.internal` to refer to the host machine from within the Docker container. No additional network configuration is required. Access local services using `host.docker.internal:<PORT>`. |
+| Linux | Two options: *A*. Use <ins>--network="host"</ins> (`docker run --network="host" -d big-agi`) when running the Docker container to merge the container within the host network stack; however, this reduces container isolation. Alternatively: *B*. Connect to local services <ins>using the host's IP address</ins> directly, as host.docker.internal is not available by default on Linux. |
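On Linux, a third commonly used route (assuming Docker 20.10 or newer; not covered by the table above) is to map `host.docker.internal` to the host gateway explicitly, which keeps container isolation intact:

```bash
# makes host.docker.internal resolve inside the container, without --network="host"
docker run --add-host=host.docker.internal:host-gateway \
  -d -p 3000:3000 ghcr.io/enricoros/big-agi:latest
```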

+<br/>

+### More Information

+The [`Dockerfile`](../Dockerfile) describes how to create a Docker image. It establishes a Node.js environment,
+installs dependencies, and creates a production-ready version of the application as a local container.

+The [`docker-compose.yaml`](../docker-compose.yaml) file is configured to run the
+official image (big-agi:latest). This file is used to define the `big-agi` service, to expose
+port 3000 on the host, and launch big-AGI within the container (startup command).

+The [`.github/workflows/docker-image.yml`](../.github/workflows/docker-image.yml) file is used
+to build the Official Docker images and publish them to the GitHub Container Registry (ghcr).
+The build process is transparent and happens via GitHub Actions.

+<br/>

+Leverage Docker's capabilities for a reliable and efficient big-AGI deployment!
@@ -12,7 +12,7 @@ version: '3.9'

services:
  big-agi:
-    image: ghcr.io/enricoros/big-agi:main
+    image: ghcr.io/enricoros/big-agi:latest
    ports:
      - "3000:3000"
    env_file:
@@ -24,6 +24,7 @@ AZURE_OPENAI_API_ENDPOINT=
AZURE_OPENAI_API_KEY=
ANTHROPIC_API_KEY=
ANTHROPIC_API_HOST=
+GEMINI_API_KEY=
MISTRAL_API_KEY=
OLLAMA_API_HOST=
OPENROUTER_API_KEY=

@@ -46,7 +47,7 @@ PUPPETEER_WSS_ENDPOINT=

# Backend Analytics
BACKEND_ANALYTICS=

-# Backend HTTP Basic Authentication
+# Backend HTTP Basic Authentication (see `deploy-authentication.md` for turning on authentication)
HTTP_BASIC_AUTH_USERNAME=
HTTP_BASIC_AUTH_PASSWORD=
```
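In a containerized deployment, these server-side variables are typically handed to the official image at startup; a sketch, assuming the values above are saved in a local `.env` file:

```bash
# the backend reads its configuration from the environment at boot
docker run --env-file .env -d -p 3000:3000 ghcr.io/enricoros/big-agi:latest
```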

@@ -80,6 +81,7 @@ requiring the user to enter an API key

| `AZURE_OPENAI_API_KEY` | Azure OpenAI API key, see [config-azure-openai.md](config-azure-openai.md) | Optional, but if set `AZURE_OPENAI_API_ENDPOINT` must also be set |
| `ANTHROPIC_API_KEY` | The API key for Anthropic | Optional |
| `ANTHROPIC_API_HOST` | Changes the backend host for the Anthropic vendor, to enable platforms such as [config-aws-bedrock.md](config-aws-bedrock.md) | Optional |
+| `GEMINI_API_KEY` | The API key for Google AI's Gemini | Optional |
| `MISTRAL_API_KEY` | The API key for Mistral | Optional |
| `OLLAMA_API_HOST` | Changes the backend host for the Ollama vendor. See [config-ollama.md](config-ollama.md) | |
| `OPENROUTER_API_KEY` | The API key for OpenRouter | Optional |

@@ -115,10 +117,7 @@ Enable the app to Talk, Draw, and Google things up.

| `PUPPETEER_WSS_ENDPOINT` | Puppeteer WebSocket endpoint - used for browsing, etc. |
| **Backend** | |
| `BACKEND_ANALYTICS` | Semicolon-separated list of analytics flags (see backend.analytics.ts). Flags: `domain` logs the responding domain. |
-| `HTTP_BASIC_AUTH_USERNAME` | Username for HTTP Basic Authentication. See the [Authentication](deploy-authentication.md) guide. |
+| `HTTP_BASIC_AUTH_USERNAME` | See the [Authentication](deploy-authentication.md) guide. Username for HTTP Basic Authentication. |
| `HTTP_BASIC_AUTH_PASSWORD` | Password for HTTP Basic Authentication. |
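Once both variables are set, every request to the deployment must carry matching credentials; a sketch of the effect, with `myuser`, `mypass`, and the host name as placeholder values:

```bash
# without credentials: 401 Unauthorized
curl -i https://my-big-agi.example.com/

# with matching HTTP Basic Auth credentials: 200 OK
curl -i -u myuser:mypass https://my-big-agi.example.com/
```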

---
Binary file not shown. (After: 79 KiB)

Generated file: (+72 -68)
@@ -1,12 +1,12 @@
{
  "name": "big-agi",
-  "version": "1.7.3",
+  "version": "1.8.0",
  "lockfileVersion": 3,
  "requires": true,
  "packages": {
    "": {
      "name": "big-agi",
-      "version": "1.7.3",
+      "version": "1.8.0",
      "hasInstallScript": true,
      "dependencies": {
        "@dqbd/tiktoken": "^1.0.7",

@@ -14,13 +14,13 @@
        "@emotion/react": "^11.11.1",
        "@emotion/server": "^11.11.0",
        "@emotion/styled": "^11.11.0",
-        "@mui/icons-material": "^5.14.19",
-        "@mui/joy": "^5.0.0-beta.17",
+        "@mui/icons-material": "^5.15.0",
+        "@mui/joy": "^5.0.0-beta.18",
        "@next/bundle-analyzer": "^14.0.4",
        "@prisma/client": "^5.7.0",
        "@sanity/diff-match-patch": "^3.1.1",
        "@t3-oss/env-nextjs": "^0.7.1",
-        "@tanstack/react-query": "^4.36.1",
+        "@tanstack/react-query": "~4.36.1",
        "@trpc/client": "^10.44.1",
        "@trpc/next": "^10.44.1",
        "@trpc/react-query": "^10.44.1",

@@ -43,14 +43,14 @@
        "tesseract.js": "^5.0.3",
        "uuid": "^9.0.1",
        "zod": "^3.22.4",
-        "zustand": "~4.3.9"
+        "zustand": "^4.4.7"
      },
      "devDependencies": {
        "@cloudflare/puppeteer": "^0.0.5",
        "@types/node": "^20.10.4",
        "@types/plantuml-encoder": "^1.4.2",
        "@types/prismjs": "^1.26.3",
-        "@types/react": "^18.2.43",
+        "@types/react": "^18.2.45",
        "@types/react-dom": "^18.2.17",
        "@types/react-katex": "^3.0.4",
        "@types/react-timeago": "^4.1.6",

@@ -596,14 +596,14 @@
      }
    },
    "node_modules/@mui/base": {
-      "version": "5.0.0-beta.26",
-      "resolved": "https://registry.npmjs.org/@mui/base/-/base-5.0.0-beta.26.tgz",
-      "integrity": "sha512-gPMRKC84VRw+tjqYoyBzyrBUqHQucMXdlBpYazHa5rCXrb91fYEQk5SqQ2U5kjxx9QxZxTBvWAmZ6DblIgaGhQ==",
+      "version": "5.0.0-beta.27",
+      "resolved": "https://registry.npmjs.org/@mui/base/-/base-5.0.0-beta.27.tgz",
+      "integrity": "sha512-duL37qxihT1N0pW/gyXVezP7SttLkF+cLAs/y6g6ubEFmVadjbnZ45SeF12/vAiKzqwf5M0uFH1cczIPXFZygA==",
      "dependencies": {
-        "@babel/runtime": "^7.23.4",
+        "@babel/runtime": "^7.23.5",
        "@floating-ui/react-dom": "^2.0.4",
-        "@mui/types": "^7.2.10",
-        "@mui/utils": "^5.14.20",
+        "@mui/types": "^7.2.11",
+        "@mui/utils": "^5.15.0",
        "@popperjs/core": "^2.11.8",
        "clsx": "^2.0.0",
        "prop-types": "^15.8.1"

@@ -627,20 +627,20 @@
      }
    },
    "node_modules/@mui/core-downloads-tracker": {
-      "version": "5.14.20",
-      "resolved": "https://registry.npmjs.org/@mui/core-downloads-tracker/-/core-downloads-tracker-5.14.20.tgz",
-      "integrity": "sha512-fXoGe8VOrIYajqALysFuyal1q1YmBARqJ3tmnWYDVl0scu8f6h6tZQbS2K8BY28QwkWNGyv4WRfuUkzN5HR3Ow==",
+      "version": "5.15.0",
+      "resolved": "https://registry.npmjs.org/@mui/core-downloads-tracker/-/core-downloads-tracker-5.15.0.tgz",
+      "integrity": "sha512-NpGtlHwuyLfJtdrlERXb8qRqd279O0VnuGaZAor1ehdNhUJOD1bSxHDeXKZkbqNpvi50hasFj7lsbTpluworTQ==",
      "funding": {
        "type": "opencollective",
        "url": "https://opencollective.com/mui-org"
      }
    },
    "node_modules/@mui/icons-material": {
-      "version": "5.14.19",
-      "resolved": "https://registry.npmjs.org/@mui/icons-material/-/icons-material-5.14.19.tgz",
-      "integrity": "sha512-yjP8nluXxZGe3Y7pS+yxBV+hWZSsSBampCxkZwaw+1l+feL+rfP74vbEFbMrX/Kil9I/Y1tWfy5bs/eNvwNpWw==",
+      "version": "5.15.0",
+      "resolved": "https://registry.npmjs.org/@mui/icons-material/-/icons-material-5.15.0.tgz",
+      "integrity": "sha512-zHY6fOkaK7VfhWeyxO8MjO3IAjEYpYMXuqUhX7TkUZJ9+TSH/9dn4ClG4K2j6hdgBU5Yrq2Z/89Bo6BHHp7AdQ==",
      "dependencies": {
-        "@babel/runtime": "^7.23.4"
+        "@babel/runtime": "^7.23.5"
      },
      "engines": {
        "node": ">=12.0.0"

@@ -661,16 +661,16 @@
      }
    },
    "node_modules/@mui/joy": {
-      "version": "5.0.0-beta.17",
-      "resolved": "https://registry.npmjs.org/@mui/joy/-/joy-5.0.0-beta.17.tgz",
-      "integrity": "sha512-KQMfQe7P98jRYWcjTxLRnjAlWre0YGvZstpE+xNJyOn6aTnMomnAskMIG0s2+k5PcluyxTEZZKZZ0Usl3M5D6g==",
+      "version": "5.0.0-beta.18",
+      "resolved": "https://registry.npmjs.org/@mui/joy/-/joy-5.0.0-beta.18.tgz",
+      "integrity": "sha512-TxEo7kqEnbjB5S8cyFrytWjzhxW12UxkEJOT0QM8WpwaBN3Ie1okFuo2bnFW94vYFZperW97/H/08cqqS/2JPA==",
      "dependencies": {
-        "@babel/runtime": "^7.23.4",
-        "@mui/base": "5.0.0-beta.26",
-        "@mui/core-downloads-tracker": "^5.14.20",
-        "@mui/system": "^5.14.20",
-        "@mui/types": "^7.2.10",
-        "@mui/utils": "^5.14.20",
+        "@babel/runtime": "^7.23.5",
+        "@mui/base": "5.0.0-beta.27",
+        "@mui/core-downloads-tracker": "^5.15.0",
+        "@mui/system": "^5.15.0",
+        "@mui/types": "^7.2.11",
+        "@mui/utils": "^5.15.0",
        "clsx": "^2.0.0",
        "prop-types": "^15.8.1"
      },

@@ -701,17 +701,17 @@
      }
    },
    "node_modules/@mui/material": {
-      "version": "5.14.20",
-      "resolved": "https://registry.npmjs.org/@mui/material/-/material-5.14.20.tgz",
-      "integrity": "sha512-SUcPZnN6e0h1AtrDktEl76Dsyo/7pyEUQ+SAVe9XhHg/iliA0b4Vo+Eg4HbNkELsMbpDsUF4WHp7rgflPG7qYQ==",
+      "version": "5.15.0",
+      "resolved": "https://registry.npmjs.org/@mui/material/-/material-5.15.0.tgz",
+      "integrity": "sha512-60CDI/hQNwJv9a3vEZtFG7zz0USdQhVwpBd3fZqrzhuXSdiMdYMaZcCXeX/KMuNq0ZxQEAZd74Pv+gOb408QVA==",
      "peer": true,
      "dependencies": {
-        "@babel/runtime": "^7.23.4",
-        "@mui/base": "5.0.0-beta.26",
-        "@mui/core-downloads-tracker": "^5.14.20",
-        "@mui/system": "^5.14.20",
-        "@mui/types": "^7.2.10",
-        "@mui/utils": "^5.14.20",
+        "@babel/runtime": "^7.23.5",
+        "@mui/base": "5.0.0-beta.27",
+        "@mui/core-downloads-tracker": "^5.15.0",
+        "@mui/system": "^5.15.0",
+        "@mui/types": "^7.2.11",
+        "@mui/utils": "^5.15.0",
        "@types/react-transition-group": "^4.4.9",
        "clsx": "^2.0.0",
        "csstype": "^3.1.2",

@@ -746,12 +746,12 @@
      }
    },
    "node_modules/@mui/private-theming": {
-      "version": "5.14.20",
-      "resolved": "https://registry.npmjs.org/@mui/private-theming/-/private-theming-5.14.20.tgz",
-      "integrity": "sha512-WV560e1vhs2IHCh0pgUaWHznrcrVoW9+cDCahU1VTkuwPokWVvb71ccWQ1f8Y3tRBPPcNkU2dChkkRJChLmQlQ==",
+      "version": "5.15.0",
+      "resolved": "https://registry.npmjs.org/@mui/private-theming/-/private-theming-5.15.0.tgz",
+      "integrity": "sha512-7WxtIhXxNek0JjtsYy+ut2LtFSLpsUW5JSDehQO+jF7itJ8ehy7Bd9bSt2yIllbwGjCFowLfYpPk2Ykgvqm1tA==",
      "dependencies": {
-        "@babel/runtime": "^7.23.4",
-        "@mui/utils": "^5.14.20",
+        "@babel/runtime": "^7.23.5",
+        "@mui/utils": "^5.15.0",
        "prop-types": "^15.8.1"
      },
      "engines": {

@@ -772,11 +772,11 @@
      }
    },
    "node_modules/@mui/styled-engine": {
-      "version": "5.14.20",
-      "resolved": "https://registry.npmjs.org/@mui/styled-engine/-/styled-engine-5.14.20.tgz",
-      "integrity": "sha512-Vs4nGptd9wRslo9zeRkuWcZeIEp+oYbODy+fiZKqqr4CH1Gfi9fdP0Q1tGYk8OiJ2EPB/tZSAyOy62Hyp/iP7g==",
+      "version": "5.15.0",
+      "resolved": "https://registry.npmjs.org/@mui/styled-engine/-/styled-engine-5.15.0.tgz",
+      "integrity": "sha512-6NysIsHkuUS2lF+Lzv1jiK3UjBJk854/vKVcJQVGKlPiqNEVZJNlwaSpsaU5xYXxWEZYfbVFSAomLOS/LV/ovQ==",
      "dependencies": {
-        "@babel/runtime": "^7.23.4",
+        "@babel/runtime": "^7.23.5",
        "@emotion/cache": "^11.11.0",
        "csstype": "^3.1.2",
        "prop-types": "^15.8.1"

@@ -803,15 +803,15 @@
      }
    },
    "node_modules/@mui/system": {
-      "version": "5.14.20",
-      "resolved": "https://registry.npmjs.org/@mui/system/-/system-5.14.20.tgz",
-      "integrity": "sha512-jKOGtK4VfYZG5kdaryUHss4X6hzcfh0AihT8gmnkfqRtWP7xjY+vPaUhhuSeibE5sqA5wCtdY75z6ep9pxFnIg==",
+      "version": "5.15.0",
+      "resolved": "https://registry.npmjs.org/@mui/system/-/system-5.15.0.tgz",
+      "integrity": "sha512-8TPjfTlYBNB7/zBJRL4QOD9kImwdZObbiYNh0+hxvhXr2koezGx8USwPXj8y/JynbzGCkIybkUztCdWlMZe6OQ==",
      "dependencies": {
-        "@babel/runtime": "^7.23.4",
-        "@mui/private-theming": "^5.14.20",
-        "@mui/styled-engine": "^5.14.19",
-        "@mui/types": "^7.2.10",
-        "@mui/utils": "^5.14.20",
+        "@babel/runtime": "^7.23.5",
+        "@mui/private-theming": "^5.15.0",
+        "@mui/styled-engine": "^5.15.0",
+        "@mui/types": "^7.2.11",
+        "@mui/utils": "^5.15.0",
        "clsx": "^2.0.0",
        "csstype": "^3.1.2",
        "prop-types": "^15.8.1"

@@ -842,9 +842,9 @@
      }
    },
    "node_modules/@mui/types": {
-      "version": "7.2.10",
-      "resolved": "https://registry.npmjs.org/@mui/types/-/types-7.2.10.tgz",
-      "integrity": "sha512-wX1vbDC+lzF7FlhT6A3ffRZgEoKWPF8VqRoTu4lZwouFX2t90KyCMsgepMw5DxLak1BSp/KP86CmtZttikb/gQ==",
+      "version": "7.2.11",
+      "resolved": "https://registry.npmjs.org/@mui/types/-/types-7.2.11.tgz",
+      "integrity": "sha512-KWe/QTEsFFlFSH+qRYf3zoFEj3z67s+qAuSnMMg+gFwbxG7P96Hm6g300inQL1Wy///gSRb8juX7Wafvp93m3w==",
      "peerDependencies": {
        "@types/react": "^17.0.0 || ^18.0.0"
      },

@@ -855,11 +855,11 @@
      }
    },
    "node_modules/@mui/utils": {
-      "version": "5.14.20",
-      "resolved": "https://registry.npmjs.org/@mui/utils/-/utils-5.14.20.tgz",
-      "integrity": "sha512-Y6yL5MoFmtQml20DZnaaK1znrCEwG6/vRSzW8PKOTrzhyqKIql0FazZRUR7sA5EPASgiyKZfq0FPwISRXm5NdA==",
+      "version": "5.15.0",
+      "resolved": "https://registry.npmjs.org/@mui/utils/-/utils-5.15.0.tgz",
+      "integrity": "sha512-XSmTKStpKYamewxyJ256+srwEnsT3/6eNo6G7+WC1tj2Iq9GfUJ/6yUoB7YXjOD2jTZ3XobToZm4pVz1LBt6GA==",
      "dependencies": {
-        "@babel/runtime": "^7.23.4",
+        "@babel/runtime": "^7.23.5",
        "@types/prop-types": "^15.7.11",
        "prop-types": "^15.8.1",
        "react-is": "^18.2.0"

@@ -1377,9 +1377,9 @@
      "integrity": "sha512-ga8y9v9uyeiLdpKddhxYQkxNDrfvuPrlFb0N1qnZZByvcElJaXthF1UhvCh9TLWJBEHeNtdnbysW7Y6Uq8CVng=="
    },
    "node_modules/@types/react": {
-      "version": "18.2.43",
-      "resolved": "https://registry.npmjs.org/@types/react/-/react-18.2.43.tgz",
-      "integrity": "sha512-nvOV01ZdBdd/KW6FahSbcNplt2jCJfyWdTos61RYHV+FVv5L/g9AOX1bmbVcWcLFL8+KHQfh1zVIQrud6ihyQA==",
+      "version": "18.2.45",
+      "resolved": "https://registry.npmjs.org/@types/react/-/react-18.2.45.tgz",
+      "integrity": "sha512-TtAxCNrlrBp8GoeEp1npd5g+d/OejJHFxS3OWmrPBMFaVQMSN0OFySozJio5BHxTuTeug00AVXVAjfDSfk+lUg==",
      "dependencies": {
        "@types/prop-types": "*",
        "@types/scheduler": "*",

@@ -7263,9 +7263,9 @@
      }
    },
    "node_modules/zustand": {
-      "version": "4.3.9",
-      "resolved": "https://registry.npmjs.org/zustand/-/zustand-4.3.9.tgz",
-      "integrity": "sha512-Tat5r8jOMG1Vcsj8uldMyqYKC5IZvQif8zetmLHs9WoZlntTHmIoNM8TpLRY31ExncuUvUOXehd0kvahkuHjDw==",
+      "version": "4.4.7",
+      "resolved": "https://registry.npmjs.org/zustand/-/zustand-4.4.7.tgz",
+      "integrity": "sha512-QFJWJMdlETcI69paJwhSMJz7PPWjVP8Sjhclxmxmxv/RYI7ZOvR5BHX+ktH0we9gTWQMxcne8q1OY8xxz604gw==",
      "dependencies": {
        "use-sync-external-store": "1.2.0"
      },

@@ -7273,10 +7273,14 @@
        "node": ">=12.7.0"
      },
      "peerDependencies": {
        "@types/react": ">=16.8",
+        "immer": ">=9.0",
        "react": ">=16.8"
      },
      "peerDependenciesMeta": {
        "@types/react": {
          "optional": true
        },
+        "immer": {
+          "optional": true
+        },
(+6 -6)

@@ -1,6 +1,6 @@
{
  "name": "big-agi",
-  "version": "1.7.3",
+  "version": "1.8.0",
  "private": true,
  "scripts": {
    "dev": "next dev",

@@ -18,13 +18,13 @@
    "@emotion/react": "^11.11.1",
    "@emotion/server": "^11.11.0",
    "@emotion/styled": "^11.11.0",
-    "@mui/icons-material": "^5.14.19",
-    "@mui/joy": "^5.0.0-beta.17",
+    "@mui/icons-material": "^5.15.0",
+    "@mui/joy": "^5.0.0-beta.18",
    "@next/bundle-analyzer": "^14.0.4",
    "@prisma/client": "^5.7.0",
    "@sanity/diff-match-patch": "^3.1.1",
    "@t3-oss/env-nextjs": "^0.7.1",
-    "@tanstack/react-query": "^4.36.1",
+    "@tanstack/react-query": "~4.36.1",
    "@trpc/client": "^10.44.1",
    "@trpc/next": "^10.44.1",
    "@trpc/react-query": "^10.44.1",

@@ -47,14 +47,14 @@
    "tesseract.js": "^5.0.3",
    "uuid": "^9.0.1",
    "zod": "^3.22.4",
-    "zustand": "~4.3.9"
+    "zustand": "^4.4.7"
  },
  "devDependencies": {
    "@cloudflare/puppeteer": "^0.0.5",
    "@types/node": "^20.10.4",
    "@types/plantuml-encoder": "^1.4.2",
    "@types/prismjs": "^1.26.3",
-    "@types/react": "^18.2.43",
+    "@types/react": "^18.2.45",
    "@types/react-dom": "^18.2.17",
    "@types/react-katex": "^3.0.4",
    "@types/react-timeago": "^4.1.6",
(+10 -7)

@@ -11,6 +11,7 @@ import '~/common/styles/CodePrism.css';
import '~/common/styles/GithubMarkdown.css';

import { ProviderBackend } from '~/common/state/ProviderBackend';
+import { ProviderSingleTab } from '~/common/state/ProviderSingleTab';
import { ProviderSnacks } from '~/common/state/ProviderSnacks';
import { ProviderTRPCQueryClient } from '~/common/state/ProviderTRPCQueryClient';
import { ProviderTheming } from '~/common/state/ProviderTheming';

@@ -25,13 +26,15 @@ const MyApp = ({ Component, emotionCache, pageProps }: MyAppProps) =>
      </Head>

      <ProviderTheming emotionCache={emotionCache}>
-        <ProviderTRPCQueryClient>
-          <ProviderSnacks>
-            <ProviderBackend>
-              <Component {...pageProps} />
-            </ProviderBackend>
-          </ProviderSnacks>
-        </ProviderTRPCQueryClient>
+        <ProviderSingleTab>
+          <ProviderTRPCQueryClient>
+            <ProviderSnacks>
+              <ProviderBackend>
+                <Component {...pageProps} />
+              </ProviderBackend>
+            </ProviderSnacks>
+          </ProviderTRPCQueryClient>
+        </ProviderSingleTab>
      </ProviderTheming>

      <VercelAnalytics debug={false} />
@@ -15,8 +15,7 @@ import { useChatLLMDropdown } from '../chat/components/applayout/useLLMDropdown'

import { EXPERIMENTAL_speakTextStream } from '~/modules/elevenlabs/elevenlabs.client';
import { SystemPurposeId, SystemPurposes } from '../../data';
-import { VChatMessageIn } from '~/modules/llms/transports/chatGenerate';
-import { streamChat } from '~/modules/llms/transports/streamChat';
+import { llmStreamingChatGenerate, VChatMessageIn } from '~/modules/llms/llm.client';
import { useElevenLabsVoiceDropdown } from '~/modules/elevenlabs/useElevenLabsVoiceDropdown';

import { Link } from '~/common/components/Link';

@@ -216,7 +215,7 @@ export function CallUI(props: {
    responseAbortController.current = new AbortController();
    let finalText = '';
    let error: any | null = null;
-    streamChat(chatLLMId, callPrompt, responseAbortController.current.signal, (updatedMessage: Partial<DMessage>) => {
+    llmStreamingChatGenerate(chatLLMId, callPrompt, null, null, responseAbortController.current.signal, (updatedMessage: Partial<DMessage>) => {
      const text = updatedMessage.text?.trim();
      if (text) {
        finalText = text;
@@ -3,7 +3,7 @@ import * as React from 'react';
import { Chip, ColorPaletteProp, VariantProp } from '@mui/joy';
import { SxProps } from '@mui/joy/styles/types';

-import { VChatMessageIn } from '~/modules/llms/transports/chatGenerate';
+import type { VChatMessageIn } from '~/modules/llms/llm.client';


export function CallMessage(props: {

@@ -167,6 +167,8 @@ function explainErrorInMessage(text: string, isAssistant: boolean, modelId?: str
      make sure the usage is under <Link noLinkStyle href='https://platform.openai.com/account/billing/limits' target='_blank'>the limits</Link>.
    </>;
  }
+  // else
+  //   errorMessage = <>{text || 'Unknown error'}</>;

  return { errorMessage, isAssistantError };
}
@@ -2,8 +2,8 @@ import { DLLMId } from '~/modules/llms/store-llms';
import { SystemPurposeId } from '../../../data';
import { autoSuggestions } from '~/modules/aifn/autosuggestions/autoSuggestions';
import { autoTitle } from '~/modules/aifn/autotitle/autoTitle';
+import { llmStreamingChatGenerate } from '~/modules/llms/llm.client';
import { speakText } from '~/modules/elevenlabs/elevenlabs.client';
-import { streamChat } from '~/modules/llms/transports/streamChat';

import { DMessage, useChatStore } from '~/common/state/store-chats';

@@ -63,7 +63,7 @@ async function streamAssistantMessage(
  const messages = history.map(({ role, text }) => ({ role, content: text }));

  try {
-    await streamChat(llmId, messages, abortSignal,
+    await llmStreamingChatGenerate(llmId, messages, null, null, abortSignal,
      (updatedMessage: Partial<DMessage>) => {
        // update the message in the store (and thus schedule a re-render)
        editMessage(updatedMessage);
@@ -78,14 +78,14 @@ export function AppNews() {

      {!!news && <Container disableGutters maxWidth='sm'>
        {news?.map((ni, idx) => {
-          const firstCard = idx === 0;
+          // const firstCard = idx === 0;
          const hasCardAfter = news.length < NewsItems.length;
          const showExpander = hasCardAfter && (idx === news.length - 1);
          const addPadding = false; //!firstCard; // || showExpander;
          return <Card key={'news-' + idx} sx={{ mb: 2, minHeight: 32 }}>
            <CardContent sx={{ position: 'relative', pr: addPadding ? 4 : 0 }}>
-              <Box sx={{ display: 'flex', alignItems: 'center', justifyContent: 'space-between', gap: 1 }}>
-                <GoodTooltip title={ni.versionName || null} placement='top-start'>
+              <Box sx={{ display: 'flex', alignItems: 'center', justifyContent: 'space-between', gap: 0 }}>
+                <GoodTooltip title={ni.versionName ? `${ni.versionName} ${ni.versionMoji || ''}` : null} placement='top-start'>
                  <Typography level='title-sm' component='div' sx={{ flexGrow: 1 }}>
                    {ni.text ? ni.text : ni.versionName ? `${ni.versionCode} · ${ni.versionName}` : `Version ${ni.versionCode}:`}
                  </Typography>
(+28 -12)

@@ -10,10 +10,10 @@ import { platformAwareKeystrokes } from '~/common/components/KeyStroke';

// update this variable every time you want to broadcast a new version to clients
-export const incrementalVersion: number = 8;
+export const incrementalVersion: number = 9;

const B = (props: { href?: string, children: React.ReactNode }) => {
-  const boldText = <Typography color={!!props.href ? 'primary' : 'warning'} sx={{ fontWeight: 600 }}>{props.children}</Typography>;
+  const boldText = <Typography color={!!props.href ? 'primary' : 'neutral'} sx={{ fontWeight: 600 }}>{props.children}</Typography>;
  return props.href ?
    <Link href={props.href + clientUtmSource()} target='_blank' sx={{ /*textDecoration: 'underline'*/ }}>{boldText} <LaunchIcon sx={{ ml: 1 }} /></Link> :
    boldText;

@@ -27,11 +27,12 @@ const RIssues = `${OpenRepo}/issues`;

export const newsCallout =
  <Card>
    <CardContent sx={{ gap: 2 }}>
-      <Typography level='h4'>
+      <Typography level='title-lg'>
        Open Roadmap
      </Typography>
-      <Typography>
-        The roadmap is officially out. For the first time you get a look at what's brewing, up and coming, and get a chance to pick up cool features!
+      <Typography level='body-md'>
+        Take a peek at our roadmap to see what's in the pipeline.
+        Discover upcoming features and let us know what excites you the most!
      </Typography>
      <Grid container spacing={1}>
        <Grid xs={12} sm={7}>

@@ -39,7 +40,7 @@ export const newsCallout =
          fullWidth variant='soft' color='primary' endDecorator={<LaunchIcon />}
          component={Link} href={OpenProject} noLinkStyle target='_blank'
        >
-          Explore the Roadmap
+          Explore
        </Button>
      </Grid>
      <Grid xs={12} sm={5} sx={{ display: 'flex', flexAlign: 'center', justifyContent: 'center' }}>

@@ -67,10 +68,27 @@ export const NewsItems: NewsItem[] = [
    ],
  },*/
  {
-    versionCode: '1.7.3',
+    versionCode: '1.8.0',
+    versionName: 'To The Moon And Back',
+    versionMoji: '🚀🌕🔙❤️',
+    versionDate: new Date('2023-12-20T09:30:00Z'),
+    items: [
+      { text: <><B href={RIssues + '/275'}>Google Gemini</B> models support</> },
+      { text: <><B href={RIssues + '/273'}>Mistral Platform</B> support</> },
+      { text: <><B href={RIssues + '/270'}>Ollama chats</B> perfection</> },
+      { text: <>Custom <B href={RIssues + '/280'}>diagrams instructions</B> (@joriskalz)</> },
+      { text: <><B>Single-Tab</B> mode, enhances data integrity and prevents DB corruption</> },
+      { text: <>Updated Ollama (v0.1.17) and OpenRouter models</> },
+      { text: <>More: fixed ⌘ shortcuts on Mac</> },
+      { text: <><Link href='https://big-agi.com'>Website</Link>: official downloads</> },
+      { text: <>Easier Vercel deployment, documented <Link href='https://github.com/enricoros/big-AGI/issues/276#issuecomment-1858591483'>network troubleshooting</Link></>, dev: true },
+    ],
+  },
+  {
+    versionCode: '1.7.0',
    versionName: 'Attachment Theory',
-    versionDate: new Date('2023-12-11T06:00:00Z'), // new Date().toISOString()
    // versionDate: new Date('2023-12-10T12:00:00Z'), // 1.7.0
    // versionDate: new Date('2023-12-11T06:00:00Z'), // 1.7.3
+    versionDate: new Date('2023-12-10T12:00:00Z'), // 1.7.0
    items: [
      { text: <>Redesigned <B href={RIssues + '/251'}>attachments system</B>: drag, paste, link, snap, images, text, pdfs</> },
      { text: <>Desktop <B href={RIssues + '/253'}>webcam access</B> for direct image capture (Labs option)</> },

@@ -80,9 +98,6 @@ export const NewsItems: NewsItem[] = [
      { text: <>{platformAwareKeystrokes('Ctrl+Shift+O')}: quick access to model options</> },
      { text: <>Optimized voice input and performance</> },
      { text: <>Latest Ollama and Oobabooga models</> },
-      { text: <>1.7.1: Improved <B href={RIssues + '/270'}>Ollama chats</B></> },
-      { text: <>1.7.2: Updated OpenRouter models 🎁</> },
-      { text: <>1.7.3: <B href={RIssues + '/273'}>Mistral Platform</B> support</> },
    ],
  },
  {

@@ -162,6 +177,7 @@ export const NewsItems: NewsItem[] = [

interface NewsItem {
  versionCode: string;
  versionName?: string;
+  versionMoji?: string;
  versionDate?: Date;
  text?: string | React.JSX.Element;
  items?: {
@@ -1,14 +1,13 @@
import * as React from 'react';
import { shallow } from 'zustand/shallow';
-import { useRouter } from 'next/router';

+import { navigateToNews } from '~/common/app.routes';
import { useAppStateStore } from '~/common/state/store-appstate';

import { incrementalVersion } from './news.data';


export function useShowNewsOnUpdate() {
-  const { push: routerPush } = useRouter();
  const { usageCount, lastSeenNewsVersion } = useAppStateStore(state => ({
    usageCount: state.usageCount,
    lastSeenNewsVersion: state.lastSeenNewsVersion,

@@ -17,9 +16,9 @@ export function useShowNewsOnUpdate() {
    const isNewsOutdated = (lastSeenNewsVersion || 0) < incrementalVersion;
    if (isNewsOutdated && usageCount > 2) {
      // Disable for now
-      void routerPush('/news');
+      void navigateToNews();
    }
-  }, [lastSeenNewsVersion, routerPush, usageCount]);
+  }, [lastSeenNewsVersion, usageCount]);
}

export function useMarkNewsAsSeen() {
@@ -1,7 +1,7 @@
import * as React from 'react';

import { DLLMId, useModelsStore } from '~/modules/llms/store-llms';
-import { callChatGenerate, VChatMessageIn } from '~/modules/llms/transports/chatGenerate';
+import { llmChatGenerateOrThrow, VChatMessageIn } from '~/modules/llms/llm.client';


export interface LLMChainStep {

@@ -80,7 +80,7 @@ export function useLLMChain(steps: LLMChainStep[], llmId: DLLMId | undefined, ch
    _chainAbortController.signal.addEventListener('abort', globalToStepListener);

    // LLM call
-    callChatGenerate(llmId, llmChatInput, chain.overrideResponseTokens)
+    llmChatGenerateOrThrow(llmId, llmChatInput, null, null, chain.overrideResponseTokens)
      .then(({ content }) => {
        stepDone = true;
        if (!stepAbortController.signal.aborted)
@@ -7,6 +7,7 @@

import Router from 'next/router';

import type { DConversationId } from '~/common/state/store-chats';
+import { isBrowser } from './util/pwaUtils';


export const ROUTE_INDEX = '/';

@@ -15,7 +16,8 @@ export const ROUTE_APP_LINK_CHAT = '/link/chat/:linkId';
export const ROUTE_APP_NEWS = '/news';
const ROUTE_CALLBACK_OPENROUTER = '/link/callback_openrouter';

export const getIndexLink = () => ROUTE_INDEX;

+// Get Paths

export const getCallbackUrl = (source: 'openrouter') => {
  const callbackUrl = new URL(window.location.href);

@@ -31,10 +33,11 @@ export const getCallbackUrl = (source: 'openrouter') => {

export const getChatLinkRelativePath = (chatLinkId: string) => ROUTE_APP_LINK_CHAT.replace(':linkId', chatLinkId);

-const navigateFn = (path: string) => (replace?: boolean): Promise<boolean> =>
-  Router[replace ? 'replace' : 'push'](path);

/// Simple Navigation

export const navigateToIndex = navigateFn(ROUTE_INDEX);

export const navigateToChat = async (conversationId?: DConversationId) => {
  if (conversationId) {
    await Router.push(

@@ -54,6 +57,15 @@ export const navigateToNews = navigateFn(ROUTE_APP_NEWS);

export const navigateBack = Router.back;

+export const reloadPage = () => isBrowser && window.location.reload();
+
+function navigateFn(path: string) {
+  return (replace?: boolean): Promise<boolean> => Router[replace ? 'replace' : 'push'](path);
+}


/// Launch Apps

export interface AppCallQueryParams {
  conversationId: string;
  personaId: string;
@@ -21,8 +21,13 @@ export const useGlobalShortcut = (shortcutKey: string | false, useCtrl: boolean,
    if (!shortcutKey) return;
    const lcShortcut = shortcutKey.toLowerCase();
    const handleKeyDown = (event: KeyboardEvent) => {
-      if ((useCtrl === event.ctrlKey) && (useShift === event.shiftKey) && (useAlt === event.altKey)
-        && event.key.toLowerCase() === lcShortcut) {
+      const isCtrlOrCmd = (event.ctrlKey && !event.metaKey) || (event.metaKey && !event.ctrlKey);
+      if (
+        (useCtrl === isCtrlOrCmd) &&
+        (useShift === event.shiftKey) &&
+        (useAlt === event.altKey) &&
+        event.key.toLowerCase() === lcShortcut
+      ) {
        event.preventDefault();
        event.stopPropagation();
        callback();

@@ -46,9 +51,10 @@ export const useGlobalShortcuts = (shortcuts: GlobalShortcutItem[]) => {
  React.useEffect(() => {
    const handleKeyDown = (event: KeyboardEvent) => {
      for (const [key, useCtrl, useShift, useAlt, action] of shortcuts) {
+        const isCtrlOrCmd = (event.ctrlKey && !event.metaKey) || (event.metaKey && !event.ctrlKey);
        if (
          key &&
-          (useCtrl === event.ctrlKey) &&
+          (useCtrl === isCtrlOrCmd) &&
          (useShift === event.shiftKey) &&
          (useAlt === event.altKey) &&
          event.key.toLowerCase() === key.toLowerCase()
@@ -0,0 +1,95 @@
import * as React from 'react';

/**
 * The AloneDetector class checks if the current client is the only one present for a given app. It uses
 * BroadcastChannel to talk to other clients. If no other clients reply within a short time, it assumes it's
 * the only one and tells the caller.
 */
class AloneDetector {

  private readonly clientId: string;
  private readonly broadcastChannel: BroadcastChannel;

  private aloneCallback: ((isAlone: boolean) => void) | null;
  private aloneTimerId: number | undefined;

  constructor(channelName: string, onAlone: (isAlone: boolean) => void) {
    this.clientId = Math.random().toString(36).substring(2, 10);
    this.aloneCallback = onAlone;
    this.broadcastChannel = new BroadcastChannel(channelName);
    this.broadcastChannel.onmessage = this.handleIncomingMessage;
  }

  public onUnmount(): void {
    // close channel
    this.broadcastChannel.onmessage = null;
    this.broadcastChannel.close();

    // clear timeout
    if (this.aloneTimerId)
      clearTimeout(this.aloneTimerId);

    this.aloneTimerId = undefined;
    this.aloneCallback = null;
  }

  public checkIfAlone(): void {
    // triggers other clients
    this.broadcastChannel.postMessage({ type: 'CHECK', sender: this.clientId });

    // if no response within 500ms, assume this client is alone
    this.aloneTimerId = window.setTimeout(() => {
      this.aloneTimerId = undefined;
      this.aloneCallback?.(true);
    }, 500);
  }

  private handleIncomingMessage = (event: MessageEvent): void => {
    // ignore self messages
    if (event.data.sender === this.clientId) return;

    switch (event.data.type) {

      case 'CHECK':
        this.broadcastChannel.postMessage({ type: 'ALIVE', sender: this.clientId });
        break;

      case 'ALIVE':
        // received an ALIVE message, tell the client they're not alone
        if (this.aloneTimerId) {
          clearTimeout(this.aloneTimerId);
          this.aloneTimerId = undefined;
        }
        this.aloneCallback?.(false);
        this.aloneCallback = null;
        break;

    }
  };
}


/**
 * React hook that checks whether the current tab is the only one open for a specific channel.
 *
 * @param {string} channelName - The name of the BroadcastChannel to communicate on.
 * @returns {boolean | null} - True if the current tab is alone, false if not, or null before the check completes.
 */
export function useSingleTabEnforcer(channelName: string): boolean | null {
  const [isAlone, setIsAlone] = React.useState<boolean | null>(null);

  React.useEffect(() => {
    const tabManager = new AloneDetector(channelName, setIsAlone);
    tabManager.checkIfAlone();
    return () => {
      tabManager.onUnmount();
    };
  }, [channelName]);

  return isAlone;
}
@@ -3,7 +3,7 @@ import { shallow } from 'zustand/shallow';

import { Box, Container } from '@mui/joy';

-import { ModelsModal } from '../../apps/models-modal/ModelsModal';
+import { ModelsModal } from '~/modules/llms/models-modal/ModelsModal';
import { SettingsModal } from '../../apps/settings-modal/SettingsModal';
import { ShortcutsModal } from '../../apps/settings-modal/ShortcutsModal';
@@ -0,0 +1,42 @@
import * as React from 'react';

import { Button, Sheet, Typography } from '@mui/joy';

import { Brand } from '../app.config';
import { reloadPage } from '../app.routes';
import { useSingleTabEnforcer } from '../components/useSingleTabEnforcer';


export const ProviderSingleTab = (props: { children: React.ReactNode }) => {

  // state
  const isSingleTab = useSingleTabEnforcer('big-agi-tabs');

  // pass-through until we know for sure that other tabs are open
  if (isSingleTab === null || isSingleTab)
    return props.children;

  return (
    <Sheet
      variant='solid'
      invertedColors
      sx={{
        flexGrow: 1,
        display: 'flex', flexDirection: { xs: 'column', md: 'row' }, justifyContent: 'center', alignItems: 'center', gap: 2,
        p: 3,
      }}
    >

      <Typography>
        It looks like {Brand.Title.Base} is already running in another tab or window.
        To continue here, please close the other instance first.
      </Typography>

      <Button onClick={reloadPage}>
        Reload
      </Button>

    </Sheet>
  );
};
@@ -1,4 +1,4 @@
-import { callChatGenerateWithFunctions, VChatFunctionIn } from '~/modules/llms/transports/chatGenerate';
+import { llmChatGenerateOrThrow, VChatFunctionIn } from '~/modules/llms/llm.client';
 import { useModelsStore } from '~/modules/llms/store-llms';

 import { useChatStore } from '~/common/state/store-chats';
@@ -71,7 +71,7 @@ export function autoSuggestions(conversationId: string, assistantMessageId: stri

   // Follow-up: Question
   if (suggestQuestions) {
-    // callChatGenerateWithFunctions(funcLLMId, [
+    // llmChatGenerateOrThrow(funcLLMId, [
     //   { role: 'system', content: systemMessage.text },
     //   { role: 'user', content: userMessage.text },
     //   { role: 'assistant', content: assistantMessageText },
@@ -83,15 +83,18 @@ export function autoSuggestions(conversationId: string, assistantMessageId: stri

   // Follow-up: Auto-Diagrams
   if (suggestDiagrams) {
-    void callChatGenerateWithFunctions(funcLLMId, [
+    void llmChatGenerateOrThrow(funcLLMId, [
       { role: 'system', content: systemMessage.text },
       { role: 'user', content: userMessage.text },
       { role: 'assistant', content: assistantMessageText },
     ], [suggestPlantUMLFn], 'draw_plantuml_diagram',
     ).then(chatResponse => {

+      if (!('function_arguments' in chatResponse))
+        return;
+
       // parse the output PlantUML string, if any
-      const functionArguments = chatResponse?.function_arguments ?? null;
+      const functionArguments = chatResponse.function_arguments ?? null;
       if (functionArguments) {
         const { code, type }: { code: string, type: string } = functionArguments as any;
         if (code && type) {
@@ -105,6 +108,8 @@ export function autoSuggestions(conversationId: string, assistantMessageId: stri
           editMessage(conversationId, assistantMessageId, { text: assistantMessageText }, false);
         }
       }
+    }).catch(err => {
+      console.error('autoSuggestions::diagram:', err);
     });
   }
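A note on the new guard above: `llmChatGenerateOrThrow` (introduced later in this changeset) returns a union of a plain-message result and a function-call result, and the `in` operator is what narrows it. A minimal sketch, using the two interfaces from llm.client:

```typescript
import type { VChatMessageOut, VChatMessageOrFunctionCallOut } from '~/modules/llms/llm.client';

// Only the function-call variant carries `function_name` / `function_arguments`,
// so the `in` check narrows the union before those fields are accessed.
function handle(chatResponse: VChatMessageOut | VChatMessageOrFunctionCallOut): void {
  if (!('function_arguments' in chatResponse))
    return; // plain assistant message, no function call to parse
  // narrowed to VChatMessageOrFunctionCallOut here
  console.log(chatResponse.function_name, chatResponse.function_arguments);
}
```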
@@ -1,4 +1,4 @@
-import { callChatGenerate } from '~/modules/llms/transports/chatGenerate';
+import { llmChatGenerateOrThrow } from '~/modules/llms/llm.client';
 import { useModelsStore } from '~/modules/llms/store-llms';

 import { useChatStore } from '~/common/state/store-chats';
@@ -27,7 +27,7 @@ export function autoTitle(conversationId: string) {
   });

   // LLM
-  void callChatGenerate(fastLLMId, [
+  void llmChatGenerateOrThrow(fastLLMId, [
     { role: 'system', content: `You are an AI conversation titles assistant who specializes in creating expressive yet few-words chat titles.` },
     {
       role: 'user', content:
@@ -39,7 +39,7 @@ export function autoTitle(conversationId: string) {
       historyLines.join('\n') +
       '```\n',
     },
-  ]).then(chatResponse => {
+  ], null, null).then(chatResponse => {

     const title = chatResponse?.content
       ?.trim()
@@ -1,6 +1,6 @@
 import * as React from 'react';

-import { Box, Button, ButtonGroup, CircularProgress, Divider, Grid, IconButton } from '@mui/joy';
+import { Box, Button, ButtonGroup, CircularProgress, Divider, Grid, IconButton, Input, FormControl, FormLabel } from '@mui/joy';
 import AccountTreeIcon from '@mui/icons-material/AccountTree';
 import ExpandLessIcon from '@mui/icons-material/ExpandLess';
 import ExpandMoreIcon from '@mui/icons-material/ExpandMore';
@@ -8,8 +8,9 @@ import ReplayIcon from '@mui/icons-material/Replay';
 import StopOutlinedIcon from '@mui/icons-material/StopOutlined';
 import TelegramIcon from '@mui/icons-material/Telegram';

+import { llmStreamingChatGenerate } from '~/modules/llms/llm.client';
+
 import { ChatMessage } from '../../../apps/chat/components/message/ChatMessage';
-import { streamChat } from '~/modules/llms/transports/streamChat';

 import { GoodModal } from '~/common/components/GoodModal';
 import { InlineError } from '~/common/components/InlineError';
@@ -48,6 +49,7 @@ export function DiagramsModal(props: { config: DiagramConfig, onClose: () => voi
   const [message, setMessage] = React.useState<DMessage | null>(null);
   const [diagramType, diagramComponent] = useFormRadio<DiagramType>('auto', diagramTypes, 'Visualize');
   const [diagramLanguage, languageComponent] = useFormRadio<DiagramLanguage>('plantuml', diagramLanguages, 'Style');
+  const [customInstruction, setCustomInstruction] = React.useState<string>('');
   const [errorMessage, setErrorMessage] = React.useState<string | null>(null);
   const [abortController, setAbortController] = React.useState<AbortController | null>(null);
@@ -81,10 +83,10 @@ export function DiagramsModal(props: { config: DiagramConfig, onClose: () => voi
     const stepAbortController = new AbortController();
     setAbortController(stepAbortController);

-    const diagramPrompt = bigDiagramPrompt(diagramType, diagramLanguage, systemMessage.text, subject);
+    const diagramPrompt = bigDiagramPrompt(diagramType, diagramLanguage, systemMessage.text, subject, customInstruction);

     try {
-      await streamChat(diagramLlm.id, diagramPrompt, stepAbortController.signal,
+      await llmStreamingChatGenerate(diagramLlm.id, diagramPrompt, null, null, stepAbortController.signal,
         (update: Partial<{ text: string, typing: boolean, originLLM: string }>) => {
           assistantMessage = { ...assistantMessage, ...update };
           setMessage(assistantMessage);
@@ -103,7 +105,7 @@ export function DiagramsModal(props: { config: DiagramConfig, onClose: () => voi
       setAbortController(null);
     }

-  }, [abortController, conversationId, diagramLanguage, diagramLlm, diagramType, subject]);
+  }, [abortController, conversationId, diagramLanguage, diagramLlm, diagramType, subject, customInstruction]);


  // [Effect] Auto-abort on unmount
@@ -149,6 +151,12 @@ export function DiagramsModal(props: { config: DiagramConfig, onClose: () => voi
         <Grid xs={12} xl={6}>
           {llmComponent}
         </Grid>
+        <Grid xs={12} md={6}>
+          <FormControl>
+            <FormLabel>Custom Instruction</FormLabel>
+            <Input title="Custom Instruction" placeholder='e.g. visualize as state' value={customInstruction} onChange={(e) => setCustomInstruction(e.target.value)} />
+          </FormControl>
+        </Grid>
       </Grid>
     )}
@@ -1,6 +1,5 @@
-import type { VChatMessageIn } from '~/modules/llms/transports/chatGenerate';
-
 import type { FormRadioOption } from '~/common/components/forms/FormRadioControl';
+import type { VChatMessageIn } from '~/modules/llms/llm.client';


 export type DiagramType = 'auto' | 'mind';
@@ -60,12 +59,15 @@ function plantumlDiagramPrompt(diagramType: DiagramType): { sys: string, usr: st
   }
 }

-export function bigDiagramPrompt(diagramType: DiagramType, diagramLanguage: DiagramLanguage, chatSystemPrompt: string, subject: string): VChatMessageIn[] {
+export function bigDiagramPrompt(diagramType: DiagramType, diagramLanguage: DiagramLanguage, chatSystemPrompt: string, subject: string, customInstruction: string): VChatMessageIn[] {
   const { sys, usr } = diagramLanguage === 'mermaid' ? mermaidDiagramPrompt(diagramType) : plantumlDiagramPrompt(diagramType);
+  if (customInstruction) {
+    customInstruction = 'Also consider the following instructions: ' + customInstruction;
+  }
   return [
     { role: 'system', content: sys },
     { role: 'system', content: chatSystemPrompt },
     { role: 'assistant', content: subject },
-    { role: 'user', content: usr },
+    { role: 'user', content: `${usr} ${customInstruction}` },
   ];
 }
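To make the new parameter concrete, here is roughly what the updated `bigDiagramPrompt` returns for a hypothetical call; the `sys`/`usr` strings come from `plantumlDiagramPrompt` and are elided:

```typescript
// bigDiagramPrompt('auto', 'plantuml', 'You are a helpful assistant.',
//                  'How TCP works', 'visualize as state')
// yields, approximately:
// [
//   { role: 'system',    content: sys },                             // diagram-language system prompt
//   { role: 'system',    content: 'You are a helpful assistant.' },  // chat system prompt
//   { role: 'assistant', content: 'How TCP works' },                 // the subject to visualize
//   { role: 'user',      content: usr + ' Also consider the following instructions: visualize as state' },
// ]
```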
@@ -1,4 +1,4 @@
-import { callChatGenerate } from '~/modules/llms/transports/chatGenerate';
+import { llmChatGenerateOrThrow } from '~/modules/llms/llm.client';
 import { useModelsStore } from '~/modules/llms/store-llms';

@@ -14,10 +14,10 @@ export async function imaginePromptFromText(messageText: string): Promise<string
   const { fastLLMId } = useModelsStore.getState();
   if (!fastLLMId) return null;
   try {
-    const chatResponse = await callChatGenerate(fastLLMId, [
+    const chatResponse = await llmChatGenerateOrThrow(fastLLMId, [
      { role: 'system', content: simpleImagineSystemPrompt },
      { role: 'user', content: 'Write a prompt, based on the following input.\n\n```\n' + messageText.slice(0, 1000) + '\n```\n' },
-    ]);
+    ], null, null);
     return chatResponse.content?.trim() ?? null;
   } catch (error: any) {
     console.error('imaginePromptFromText: fetch request error:', error);
@@ -5,7 +5,7 @@
 import { DLLMId } from '~/modules/llms/store-llms';
 import { callApiSearchGoogle } from '~/modules/google/search.client';
 import { callBrowseFetchPage } from '~/modules/browse/browse.client';
-import { callChatGenerate, VChatMessageIn } from '~/modules/llms/transports/chatGenerate';
+import { llmChatGenerateOrThrow, VChatMessageIn } from '~/modules/llms/llm.client';


 // prompt to implement the ReAct paradigm: https://arxiv.org/abs/2210.03629
@@ -128,7 +128,7 @@ export class Agent {
     S.messages.push({ role: 'user', content: prompt });
     let content: string;
     try {
-      content = (await callChatGenerate(llmId, S.messages, 500)).content;
+      content = (await llmChatGenerateOrThrow(llmId, S.messages, null, null, 500)).content;
     } catch (error: any) {
       content = `Error in callChat: ${error}`;
     }
@@ -1,5 +1,5 @@
|
||||
import { DLLMId, findLLMOrThrow } from '~/modules/llms/store-llms';
|
||||
import { callChatGenerate } from '~/modules/llms/transports/chatGenerate';
|
||||
import { llmChatGenerateOrThrow } from '~/modules/llms/llm.client';
|
||||
|
||||
|
||||
// prompt to be tried when doing recursive summerization.
|
||||
@@ -80,10 +80,10 @@ async function cleanUpContent(chunk: string, llmId: DLLMId, _ignored_was_targetW
|
||||
const autoResponseTokensSize = Math.floor(contextTokens * outputTokenShare);
|
||||
|
||||
try {
|
||||
const chatResponse = await callChatGenerate(llmId, [
|
||||
const chatResponse = await llmChatGenerateOrThrow(llmId, [
|
||||
{ role: 'system', content: cleanupPrompt },
|
||||
{ role: 'user', content: chunk },
|
||||
], autoResponseTokensSize);
|
||||
], null, null, autoResponseTokensSize);
|
||||
return chatResponse?.content ?? '';
|
||||
} catch (error: any) {
|
||||
return '';
|
||||
|
||||
@@ -1,8 +1,7 @@
|
||||
import * as React from 'react';
|
||||
|
||||
import type { DLLMId } from '~/modules/llms/store-llms';
|
||||
import type { VChatMessageIn } from '~/modules/llms/transports/chatGenerate';
|
||||
import { streamChat } from '~/modules/llms/transports/streamChat';
|
||||
import { llmStreamingChatGenerate, VChatMessageIn } from '~/modules/llms/llm.client';
|
||||
|
||||
|
||||
export function useStreamChatText() {
|
||||
@@ -25,7 +24,7 @@ export function useStreamChatText() {
|
||||
|
||||
try {
|
||||
let lastText = '';
|
||||
await streamChat(llmId, prompt, abortControllerRef.current.signal, (update) => {
|
||||
await llmStreamingChatGenerate(llmId, prompt, null, null, abortControllerRef.current.signal, (update) => {
|
||||
if (update.text) {
|
||||
lastText = update.text;
|
||||
setPartialText(lastText);
|
||||
|
||||
@@ -28,6 +28,7 @@ export const backendRouter = createTRPCRouter({
     hasImagingProdia: !!env.PRODIA_API_KEY,
     hasLlmAnthropic: !!env.ANTHROPIC_API_KEY,
     hasLlmAzureOpenAI: !!env.AZURE_OPENAI_API_KEY && !!env.AZURE_OPENAI_API_ENDPOINT,
+    hasLlmGemini: !!env.GEMINI_API_KEY,
     hasLlmMistral: !!env.MISTRAL_API_KEY,
     hasLlmOllama: !!env.OLLAMA_API_HOST,
     hasLlmOpenAI: !!env.OPENAI_API_KEY || !!env.OPENAI_API_HOST,
@@ -42,7 +43,7 @@ export const backendRouter = createTRPCRouter({
   /* Exchange the OpenRouter 'code' (from PKCE) for an OpenRouter API Key */
   exchangeOpenRouterKey: publicProcedure
     .input(z.object({ code: z.string() }))
-    .query(async ({ ctx, input }) => {
+    .query(async ({ input }) => {
       // Documented here: https://openrouter.ai/docs#oauth
       return await fetchJsonOrTRPCError<{ key: string }, { code: string }>('https://openrouter.ai/api/v1/auth/keys', 'POST', {}, {
         code: input.code,
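For clarity, the exchange above is a single unauthenticated POST. A plain-fetch equivalent sketch, outside tRPC; the `code` value is a placeholder obtained from OpenRouter's OAuth PKCE redirect:

```typescript
// Sketch of the same code-for-key exchange against the documented endpoint.
async function exchangeOpenRouterCode(code: string): Promise<string> {
  const res = await fetch('https://openrouter.ai/api/v1/auth/keys', {
    method: 'POST',
    headers: { 'Content-Type': 'application/json' },
    body: JSON.stringify({ code }), // the PKCE code from the OAuth redirect
  });
  const { key } = await res.json() as { key: string };
  return key; // the OpenRouter API key to hand back to the client
}
```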
@@ -9,6 +9,7 @@ export interface BackendCapabilities {
   hasImagingProdia: boolean;
   hasLlmAnthropic: boolean;
   hasLlmAzureOpenAI: boolean;
+  hasLlmGemini: boolean;
   hasLlmMistral: boolean;
   hasLlmOllama: boolean;
   hasLlmOpenAI: boolean;
@@ -31,6 +32,7 @@ const useBackendStore = create<BackendStore>()(
     hasImagingProdia: false,
     hasLlmAnthropic: false,
     hasLlmAzureOpenAI: false,
+    hasLlmGemini: false,
     hasLlmMistral: false,
     hasLlmOllama: false,
     hasLlmOpenAI: false,
@@ -1,4 +1,4 @@
-import create from 'zustand';
+import { create } from 'zustand';
 import { persist } from 'zustand/middleware';

 import { CapabilityBrowsing } from '~/common/components/useCapabilities';
@@ -0,0 +1,74 @@
import type { DLLMId } from './store-llms';
import type { OpenAIWire } from './server/openai/openai.wiretypes';
import { findVendorForLlmOrThrow } from './vendors/vendors.registry';


// LLM Client Types
// NOTE: Model List types in '../server/llm.server.types';

export interface VChatMessageIn {
  role: 'assistant' | 'system' | 'user'; // | 'function';
  content: string;
  //name?: string; // when role: 'function'
}

export type VChatFunctionIn = OpenAIWire.ChatCompletion.RequestFunctionDef;

export interface VChatMessageOut {
  role: 'assistant' | 'system' | 'user';
  content: string;
  finish_reason: 'stop' | 'length' | null;
}

export interface VChatMessageOrFunctionCallOut extends VChatMessageOut {
  function_name: string;
  function_arguments: object | null;
}


// LLM Client Functions

export async function llmChatGenerateOrThrow<TSourceSetup = unknown, TAccess = unknown, TLLMOptions = unknown>(
  llmId: DLLMId,
  messages: VChatMessageIn[],
  functions: VChatFunctionIn[] | null, forceFunctionName: string | null,
  maxTokens?: number,
): Promise<VChatMessageOut | VChatMessageOrFunctionCallOut> {

  // id to DLLM and vendor
  const { llm, vendor } = findVendorForLlmOrThrow<TSourceSetup, TAccess, TLLMOptions>(llmId);

  // FIXME: relax the forced cast
  const options = llm.options as TLLMOptions;

  // get the access
  const partialSourceSetup = llm._source.setup;
  const access = vendor.getTransportAccess(partialSourceSetup);

  // execute via the vendor
  return await vendor.rpcChatGenerateOrThrow(access, options, messages, functions, forceFunctionName, maxTokens);
}


export async function llmStreamingChatGenerate<TSourceSetup = unknown, TAccess = unknown, TLLMOptions = unknown>(
  llmId: DLLMId,
  messages: VChatMessageIn[],
  functions: VChatFunctionIn[] | null,
  forceFunctionName: string | null,
  abortSignal: AbortSignal,
  onUpdate: (update: Partial<{ text: string, typing: boolean, originLLM: string }>, done: boolean) => void,
): Promise<void> {

  // id to DLLM and vendor
  const { llm, vendor } = findVendorForLlmOrThrow<TSourceSetup, TAccess, TLLMOptions>(llmId);

  // FIXME: relax the forced cast
  const llmOptions = llm.options as TLLMOptions;

  // get the access
  const partialSourceSetup = llm._source.setup;
  const access = vendor.getTransportAccess(partialSourceSetup); // as ChatStreamInputSchema['access'];

  // execute via the vendor
  return await vendor.streamingChatGenerateOrThrow(access, llmId, llmOptions, messages, functions, forceFunctionName, abortSignal, onUpdate);
}
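The call sites updated throughout this changeset (autoTitle, imagine, summarize, the ReAct agent) all follow the same shape; a condensed sketch of the non-streaming path, with a hypothetical caller providing the `llmId`:

```typescript
import type { DLLMId } from '~/modules/llms/store-llms';
import { llmChatGenerateOrThrow } from '~/modules/llms/llm.client';

// Non-streaming call without function-calling: pass null for both `functions`
// and `forceFunctionName`, with an optional max-tokens cap as the last argument.
async function demo(llmId: DLLMId): Promise<void> {
  const chatResponse = await llmChatGenerateOrThrow(llmId, [
    { role: 'system', content: 'You are a helpful assistant.' },
    { role: 'user', content: 'Say hello in five words.' },
  ], null, null, 128);
  console.log(chatResponse.content);
}
```

With function definitions (and optionally a forced function name) the result may instead be a `VChatMessageOrFunctionCallOut`, hence the `'function_arguments' in chatResponse` checks seen earlier.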
+3 -3
@@ -117,9 +117,9 @@ export function LLMOptionsModal(props: { id: DLLMId }) {
       <FormLabelStart title='Details' sx={{ minWidth: 80 }} onClick={() => setShowDetails(!showDetails)} />
       {showDetails && <Typography level='body-sm' sx={{ display: 'block' }}>
         [{llm.id}]: {llm.options.llmRef && `${llm.options.llmRef} · `}
-        {llm.contextTokens && `context tokens: ${llm.contextTokens.toLocaleString()} · `}
-        {llm.maxOutputTokens && `max output tokens: ${llm.maxOutputTokens.toLocaleString()} · `}
-        {llm.created && `created: ${(new Date(llm.created * 1000)).toLocaleString()} · `}
+        {!!llm.contextTokens && `context tokens: ${llm.contextTokens.toLocaleString()} · `}
+        {!!llm.maxOutputTokens && `max output tokens: ${llm.maxOutputTokens.toLocaleString()} · `}
+        {!!llm.created && `created: ${(new Date(llm.created * 1000)).toLocaleString()} · `}
         description: {llm.description}
         {/*· tags: {llm.tags.join(', ')}*/}
       </Typography>}
@@ -111,7 +111,13 @@ export function ModelsList(props: {
       pl: { xs: 0, md: 1 },
       overflowY: 'auto',
     }}>
-      {items}
+      {items.length > 0 ? items : (
+        <ListItem>
+          <Typography level='body-sm'>
+            Please configure the service and update the list of models.
+          </Typography>
+        </ListItem>
+      )}
     </List>
   );
 }
+1 -1
@@ -65,7 +65,7 @@ export function ModelsModal(props: { suspendAutoModelsSetup?: boolean }) {
       title={<>Configure <b>AI Models</b></>}
       startButton={
         multiSource ? <Checkbox
-          label='all vendors' sx={{ my: 'auto' }}
+          label='All Services' sx={{ my: 'auto' }}
           checked={showAllSources} onChange={() => setShowAllSources(all => !all)}
         /> : undefined
       }
+3 -3
@@ -5,9 +5,9 @@ import { Avatar, Badge, Box, Button, IconButton, ListItemDecorator, MenuItem, Op
 import AddIcon from '@mui/icons-material/Add';
 import DeleteOutlineIcon from '@mui/icons-material/DeleteOutline';

-import { type DModelSourceId, useModelsStore } from '~/modules/llms/store-llms';
-import { type IModelVendor, type ModelVendorId } from '~/modules/llms/vendors/IModelVendor';
-import { createModelSourceForVendor, findAllVendors, findVendorById } from '~/modules/llms/vendors/vendors.registry';
+import type { IModelVendor } from '~/modules/llms/vendors/IModelVendor';
+import { DModelSourceId, useModelsStore } from '~/modules/llms/store-llms';
+import { createModelSourceForVendor, findAllVendors, findVendorById, ModelVendorId } from '~/modules/llms/vendors/vendors.registry';

 import { CloseableMenu } from '~/common/components/CloseableMenu';
 import { ConfirmationModal } from '~/common/components/ConfirmationModal';
+2 -2
@@ -1,6 +1,6 @@
-import type { ModelDescriptionSchema } from '../server.schemas';
+import type { ModelDescriptionSchema } from '../llm.server.types';

-import { LLM_IF_OAI_Chat } from '../../../store-llms';
+import { LLM_IF_OAI_Chat } from '../../store-llms';

 const roundTime = (date: string) => Math.round(new Date(date).getTime() / 1000);
+1 -1
@@ -6,7 +6,7 @@ import { env } from '~/server/env.mjs';
 import { fetchJsonOrTRPCError } from '~/server/api/trpc.serverutils';

 import { fixupHost, openAIChatGenerateOutputSchema, OpenAIHistorySchema, openAIHistorySchema, OpenAIModelSchema, openAIModelSchema } from '../openai/openai.router';
-import { listModelsOutputSchema } from '../server.schemas';
+import { listModelsOutputSchema } from '../llm.server.types';

 import { AnthropicWire } from './anthropic.wiretypes';
 import { hardcodedAnthropicModels } from './anthropic.models';
@@ -0,0 +1,216 @@
import { z } from 'zod';
import { TRPCError } from '@trpc/server';
import { env } from '~/server/env.mjs';

import packageJson from '../../../../../package.json';

import { createTRPCRouter, publicProcedure } from '~/server/api/trpc.server';
import { fetchJsonOrTRPCError } from '~/server/api/trpc.serverutils';

import { LLM_IF_OAI_Chat, LLM_IF_OAI_Vision } from '../../store-llms';
import { listModelsOutputSchema, ModelDescriptionSchema } from '../llm.server.types';

import { fixupHost, openAIChatGenerateOutputSchema, OpenAIHistorySchema, openAIHistorySchema, OpenAIModelSchema, openAIModelSchema } from '../openai/openai.router';

import { GeminiBlockSafetyLevel, geminiBlockSafetyLevelSchema, GeminiContentSchema, GeminiGenerateContentRequest, geminiGeneratedContentResponseSchema, geminiModelsGenerateContentPath, geminiModelsListOutputSchema, geminiModelsListPath } from './gemini.wiretypes';


// Default hosts
const DEFAULT_GEMINI_HOST = 'https://generativelanguage.googleapis.com';


// Mappers

export function geminiAccess(access: GeminiAccessSchema, modelRefId: string | null, apiPath: string): { headers: HeadersInit, url: string } {

  const geminiKey = access.geminiKey || env.GEMINI_API_KEY || '';
  const geminiHost = fixupHost(DEFAULT_GEMINI_HOST, apiPath);

  // update model-dependent paths
  if (apiPath.includes('{model=models/*}')) {
    if (!modelRefId)
      throw new Error(`geminiAccess: modelRefId is required for ${apiPath}`);
    apiPath = apiPath.replace('{model=models/*}', modelRefId);
  }

  return {
    headers: {
      'Content-Type': 'application/json',
      'x-goog-api-client': `big-agi/${packageJson['version'] || '1.0.0'}`,
      'x-goog-api-key': geminiKey,
    },
    url: geminiHost + apiPath,
  };
}

/**
 * We specially encode the history to match the Gemini API requirements.
 * Gemini does not want 2 consecutive messages from the same role, so we alternate.
 * - System messages = [User, Model'Ok']
 * - User and Assistant messages are coalesced into a single message (e.g. [User, User, Assistant, Assistant, User] -> [User[2], Assistant[2], User[1]])
 */
export const geminiGenerateContentTextPayload = (model: OpenAIModelSchema, history: OpenAIHistorySchema, safety: GeminiBlockSafetyLevel, n: number): GeminiGenerateContentRequest => {

  // convert the history to a Gemini format
  const contents: GeminiContentSchema[] = [];
  for (const _historyElement of history) {

    const { role: msgRole, content: msgContent } = _historyElement;

    // System message - we treat it as per the example in https://ai.google.dev/tutorials/ai-studio_quickstart#chat_example
    if (msgRole === 'system') {
      contents.push({ role: 'user', parts: [{ text: msgContent }] });
      contents.push({ role: 'model', parts: [{ text: 'Ok' }] });
      continue;
    }

    // User or Assistant message
    const nextRole: GeminiContentSchema['role'] = msgRole === 'assistant' ? 'model' : 'user';
    if (contents.length && contents[contents.length - 1].role === nextRole) {
      // coalesce with the previous message
      contents[contents.length - 1].parts.push({ text: msgContent });
    } else {
      // create a new message
      contents.push({ role: nextRole, parts: [{ text: msgContent }] });
    }
  }

  return {
    contents,
    generationConfig: {
      ...(n >= 2 && { candidateCount: n }),
      ...(model.maxTokens && { maxOutputTokens: model.maxTokens }),
      temperature: model.temperature,
    },
    safetySettings: safety !== 'HARM_BLOCK_THRESHOLD_UNSPECIFIED' ? [
      { category: 'HARM_CATEGORY_SEXUALLY_EXPLICIT', threshold: safety },
      { category: 'HARM_CATEGORY_HATE_SPEECH', threshold: safety },
      { category: 'HARM_CATEGORY_HARASSMENT', threshold: safety },
      { category: 'HARM_CATEGORY_DANGEROUS_CONTENT', threshold: safety },
    ] : undefined,
  };
};


async function geminiGET<TOut extends object>(access: GeminiAccessSchema, modelRefId: string | null, apiPath: string /*, signal?: AbortSignal*/): Promise<TOut> {
  const { headers, url } = geminiAccess(access, modelRefId, apiPath);
  return await fetchJsonOrTRPCError<TOut>(url, 'GET', headers, undefined, 'Gemini');
}

async function geminiPOST<TOut extends object, TPostBody extends object>(access: GeminiAccessSchema, modelRefId: string | null, body: TPostBody, apiPath: string /*, signal?: AbortSignal*/): Promise<TOut> {
  const { headers, url } = geminiAccess(access, modelRefId, apiPath);
  return await fetchJsonOrTRPCError<TOut, TPostBody>(url, 'POST', headers, body, 'Gemini');
}


// Input/Output Schemas

export const geminiAccessSchema = z.object({
  dialect: z.enum(['gemini']),
  geminiKey: z.string(),
  minSafetyLevel: geminiBlockSafetyLevelSchema,
});
export type GeminiAccessSchema = z.infer<typeof geminiAccessSchema>;


const accessOnlySchema = z.object({
  access: geminiAccessSchema,
});

const chatGenerateInputSchema = z.object({
  access: geminiAccessSchema,
  model: openAIModelSchema, history: openAIHistorySchema,
  // functions: openAIFunctionsSchema.optional(), forceFunctionName: z.string().optional(),
});


/**
 * See https://github.com/google/generative-ai-js/tree/main/packages/main/src for
 * the official Google implementation.
 */
export const llmGeminiRouter = createTRPCRouter({

  /* [Gemini] models.list = /v1beta/models */
  listModels: publicProcedure
    .input(accessOnlySchema)
    .output(listModelsOutputSchema)
    .query(async ({ input }) => {

      // get the models
      const wireModels = await geminiGET(input.access, null, geminiModelsListPath);
      const detailedModels = geminiModelsListOutputSchema.parse(wireModels).models;

      // NOTE: no need to retrieve info for each of the models (e.g. /v1beta/model/gemini-pro),
      // as the List API already returns all the info on all the models

      // map to our output schema
      return {
        models: detailedModels.map((geminiModel) => {
          const { description, displayName, inputTokenLimit, name, outputTokenLimit, supportedGenerationMethods } = geminiModel;

          const contextWindow = inputTokenLimit + outputTokenLimit;
          const hidden = !supportedGenerationMethods.includes('generateContent');

          const { version, topK, topP, temperature } = geminiModel;
          const descriptionLong = description + ` (Version: ${version}, Defaults: temperature=${temperature}, topP=${topP}, topK=${topK}, interfaces=[${supportedGenerationMethods.join(',')}])`;

          // const isGeminiPro = name.includes('gemini-pro');
          const isGeminiProVision = name.includes('gemini-pro-vision');

          const interfaces: ModelDescriptionSchema['interfaces'] = [];
          if (supportedGenerationMethods.includes('generateContent')) {
            interfaces.push(LLM_IF_OAI_Chat);
            if (isGeminiProVision)
              interfaces.push(LLM_IF_OAI_Vision);
          }

          return {
            id: name,
            label: displayName,
            // created: ...
            // updated: ...
            description: descriptionLong,
            contextWindow: contextWindow,
            maxCompletionTokens: outputTokenLimit,
            // pricing: isGeminiPro ? { needs per-character and per-image pricing } : undefined,
            // rateLimits: isGeminiPro ? { reqPerMinute: 60 } : undefined,
            interfaces: supportedGenerationMethods.includes('generateContent') ? [LLM_IF_OAI_Chat] : [],
            hidden,
          } satisfies ModelDescriptionSchema;
        }),
      };
    }),


  /* [Gemini] models.generateContent = /v1/{model=models/*}:generateContent */
  chatGenerate: publicProcedure
    .input(chatGenerateInputSchema)
    .output(openAIChatGenerateOutputSchema)
    .mutation(async ({ input: { access, history, model } }) => {

      // generate the content
      const wireGeneration = await geminiPOST(access, model.id, geminiGenerateContentTextPayload(model, history, access.minSafetyLevel, 1), geminiModelsGenerateContentPath);
      const generation = geminiGeneratedContentResponseSchema.parse(wireGeneration);

      // only use the first result (and there should be only one)
      const singleCandidate = generation.candidates?.[0] ?? null;
      if (!singleCandidate || !singleCandidate.content?.parts.length)
        throw new TRPCError({
          code: 'INTERNAL_SERVER_ERROR',
          message: `Gemini chat-generation API issue: ${JSON.stringify(wireGeneration)}`,
        });

      if (!('text' in singleCandidate.content.parts[0]))
        throw new TRPCError({
          code: 'INTERNAL_SERVER_ERROR',
          message: `Gemini non-text chat-generation API issue: ${JSON.stringify(wireGeneration)}`,
        });

      return {
        role: 'assistant',
        content: singleCandidate.content.parts[0].text || '',
        finish_reason: singleCandidate.finishReason === 'STOP' ? 'stop' : null,
      };
    }),

});
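To make the coalescing rule in `geminiGenerateContentTextPayload` concrete, a worked example with hypothetical messages:

```typescript
// Input (OpenAI-style history):
//   { role: 'system',    content: 'Be terse' }
//   { role: 'user',      content: 'Hi' }
//   { role: 'user',      content: 'Still there?' }
//   { role: 'assistant', content: 'Yes' }
//
// Output `contents` (system expanded to a user/model pair, same-role runs merged):
//   { role: 'user',  parts: [{ text: 'Be terse' }] }
//   { role: 'model', parts: [{ text: 'Ok' }] }
//   { role: 'user',  parts: [{ text: 'Hi' }, { text: 'Still there?' }] }
//   { role: 'model', parts: [{ text: 'Yes' }] }
```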
@@ -0,0 +1,188 @@
import { z } from 'zod';

// PATHS

export const geminiModelsListPath = '/v1beta/models?pageSize=1000';
export const geminiModelsGenerateContentPath = '/v1beta/{model=models/*}:generateContent';
// see alt=sse on https://cloud.google.com/apis/docs/system-parameters#definitions
export const geminiModelsStreamGenerateContentPath = '/v1beta/{model=models/*}:streamGenerateContent?alt=sse';


// models.list = /v1beta/models

export const geminiModelsListOutputSchema = z.object({
  models: z.array(z.object({
    name: z.string(),
    version: z.string(),
    displayName: z.string(),
    description: z.string(),
    inputTokenLimit: z.number().int().min(1),
    outputTokenLimit: z.number().int().min(1),
    supportedGenerationMethods: z.array(z.enum([
      'countMessageTokens',
      'countTextTokens',
      'countTokens',
      'createTunedTextModel',
      'embedContent',
      'embedText',
      'generateAnswer',
      'generateContent',
      'generateMessage',
      'generateText',
    ])),
    temperature: z.number().optional(),
    topP: z.number().optional(),
    topK: z.number().optional(),
  })),
});


// /v1/{model=models/*}:generateContent, /v1beta/{model=models/*}:streamGenerateContent

// Request

const geminiContentPartSchema = z.union([

  // TextPart
  z.object({
    text: z.string().optional(),
  }),

  // InlineDataPart
  z.object({
    inlineData: z.object({
      mimeType: z.string(),
      data: z.string(), // base64-encoded string
    }),
  }),

  // A predicted FunctionCall returned from the model
  z.object({
    functionCall: z.object({
      name: z.string(),
      args: z.record(z.any()), // JSON object format
    }),
  }),

  // The result output of a FunctionCall
  z.object({
    functionResponse: z.object({
      name: z.string(),
      response: z.record(z.any()), // JSON object format
    }),
  }),
]);

const geminiToolSchema = z.object({
  functionDeclarations: z.array(z.object({
    name: z.string(),
    description: z.string(),
    parameters: z.record(z.any()).optional(), // Schema object format
  })).optional(),
});

const geminiHarmCategorySchema = z.enum([
  'HARM_CATEGORY_UNSPECIFIED',
  'HARM_CATEGORY_DEROGATORY',
  'HARM_CATEGORY_TOXICITY',
  'HARM_CATEGORY_VIOLENCE',
  'HARM_CATEGORY_SEXUAL',
  'HARM_CATEGORY_MEDICAL',
  'HARM_CATEGORY_DANGEROUS',
  'HARM_CATEGORY_HARASSMENT',
  'HARM_CATEGORY_HATE_SPEECH',
  'HARM_CATEGORY_SEXUALLY_EXPLICIT',
  'HARM_CATEGORY_DANGEROUS_CONTENT',
]);

export const geminiBlockSafetyLevelSchema = z.enum([
  'HARM_BLOCK_THRESHOLD_UNSPECIFIED',
  'BLOCK_LOW_AND_ABOVE',
  'BLOCK_MEDIUM_AND_ABOVE',
  'BLOCK_ONLY_HIGH',
  'BLOCK_NONE',
]);

export type GeminiBlockSafetyLevel = z.infer<typeof geminiBlockSafetyLevelSchema>;

const geminiSafetySettingSchema = z.object({
  'category': geminiHarmCategorySchema,
  'threshold': geminiBlockSafetyLevelSchema,
});

const geminiGenerationConfigSchema = z.object({
  stopSequences: z.array(z.string()).optional(),
  candidateCount: z.number().int().optional(),
  maxOutputTokens: z.number().int().optional(),
  temperature: z.number().optional(),
  topP: z.number().optional(),
  topK: z.number().int().optional(),
});

const geminiContentSchema = z.object({
  // Must be either 'user' or 'model'. Optional but must be set if there are multiple "Content" objects in the parent array.
  role: z.enum(['user', 'model']).optional(),
  // Ordered Parts that constitute a single message. Parts may have different MIME types.
  parts: z.array(geminiContentPartSchema),
});

export type GeminiContentSchema = z.infer<typeof geminiContentSchema>;

export const geminiGenerateContentRequest = z.object({
  contents: z.array(geminiContentSchema),
  tools: z.array(geminiToolSchema).optional(),
  safetySettings: z.array(geminiSafetySettingSchema).optional(),
  generationConfig: geminiGenerationConfigSchema.optional(),
});

export type GeminiGenerateContentRequest = z.infer<typeof geminiGenerateContentRequest>;


// Response

const geminiHarmProbabilitySchema = z.enum([
  'HARM_PROBABILITY_UNSPECIFIED',
  'NEGLIGIBLE',
  'LOW',
  'MEDIUM',
  'HIGH',
]);

const geminiSafetyRatingSchema = z.object({
  'category': geminiHarmCategorySchema,
  'probability': geminiHarmProbabilitySchema,
  'blocked': z.boolean().optional(),
});

const geminiFinishReasonSchema = z.enum([
  'FINISH_REASON_UNSPECIFIED',
  'STOP',
  'MAX_TOKENS',
  'SAFETY',
  'RECITATION',
  'OTHER',
]);

export const geminiGeneratedContentResponseSchema = z.object({
  // either all requested candidates are returned or no candidates at all
  // no candidates are returned only if there was something wrong with the prompt (see promptFeedback)
  candidates: z.array(z.object({
    index: z.number(),
    content: geminiContentSchema,
    finishReason: geminiFinishReasonSchema.optional(),
    safetyRatings: z.array(geminiSafetyRatingSchema),
    citationMetadata: z.object({
      startIndex: z.number().optional(),
      endIndex: z.number().optional(),
      uri: z.string().optional(),
      license: z.string().optional(),
    }).optional(),
    tokenCount: z.number().optional(),
    // groundingAttributions: z.array(GroundingAttribution).optional(), // This field is populated for GenerateAnswer calls.
  })).optional(),
  // NOTE: promptFeedback is only sent in the first chunk in a streaming response
  promptFeedback: z.object({
    blockReason: z.enum(['BLOCK_REASON_UNSPECIFIED', 'SAFETY', 'OTHER']).optional(),
    safetyRatings: z.array(geminiSafetyRatingSchema).optional(),
  }).optional(),
});
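For reference, a request object that satisfies the schemas above; all values are illustrative only, and the import path is the one this file declares:

```typescript
import type { GeminiGenerateContentRequest } from './gemini.wiretypes';

// Minimal well-formed generateContent payload per the zod schemas above.
const exampleRequest: GeminiGenerateContentRequest = {
  contents: [
    { role: 'user', parts: [{ text: 'Explain SSE in one sentence.' }] },
  ],
  generationConfig: { temperature: 0.5, maxOutputTokens: 1024 },
  safetySettings: [
    { category: 'HARM_CATEGORY_HARASSMENT', threshold: 'BLOCK_ONLY_HIGH' },
  ],
};
```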
+235
-165
@@ -4,12 +4,30 @@ import { createParser as createEventsourceParser, EventSourceParseCallback, Even
|
||||
|
||||
import { createEmptyReadableStream, debugGenerateCurlCommand, safeErrorString, SERVER_DEBUG_WIRE, serverFetchOrThrow } from '~/server/wire';
|
||||
|
||||
import type { AnthropicWire } from '../anthropic/anthropic.wiretypes';
|
||||
import type { OpenAIWire } from './openai.wiretypes';
|
||||
import { OLLAMA_PATH_CHAT, ollamaAccess, ollamaAccessSchema, ollamaChatCompletionPayload } from '../ollama/ollama.router';
|
||||
import { anthropicAccess, anthropicAccessSchema, anthropicChatCompletionPayload } from '../anthropic/anthropic.router';
|
||||
import { openAIAccess, openAIAccessSchema, openAIChatCompletionPayload, openAIHistorySchema, openAIModelSchema } from './openai.router';
|
||||
import { wireOllamaChunkedOutputSchema } from '../ollama/ollama.wiretypes';
|
||||
|
||||
// Anthropic server imports
|
||||
import type { AnthropicWire } from './anthropic/anthropic.wiretypes';
|
||||
import { anthropicAccess, anthropicAccessSchema, anthropicChatCompletionPayload } from './anthropic/anthropic.router';
|
||||
|
||||
// Gemini server imports
|
||||
import { geminiAccess, geminiAccessSchema, geminiGenerateContentTextPayload } from './gemini/gemini.router';
|
||||
import { geminiGeneratedContentResponseSchema, geminiModelsStreamGenerateContentPath } from './gemini/gemini.wiretypes';
|
||||
|
||||
// Ollama server imports
|
||||
import { wireOllamaChunkedOutputSchema } from './ollama/ollama.wiretypes';
|
||||
import { OLLAMA_PATH_CHAT, ollamaAccess, ollamaAccessSchema, ollamaChatCompletionPayload } from './ollama/ollama.router';
|
||||
|
||||
// OpenAI server imports
|
||||
import type { OpenAIWire } from './openai/openai.wiretypes';
|
||||
import { openAIAccess, openAIAccessSchema, openAIChatCompletionPayload, openAIHistorySchema, openAIModelSchema } from './openai/openai.router';
|
||||
|
||||
|
||||
/**
|
||||
* Event stream formats
|
||||
* - 'sse' is the default format, and is used by all vendors except Ollama
|
||||
* - 'json-nl' is used by Ollama
|
||||
*/
|
||||
type MuxingFormat = 'sse' | 'json-nl';
|
||||
|
||||
|
||||
/**
|
||||
@@ -20,49 +38,58 @@ import { wireOllamaChunkedOutputSchema } from '../ollama/ollama.wiretypes';
|
||||
* The peculiarity of our parser is the injection of a JSON structure at the beginning of the stream, to
|
||||
* communicate parameters before the text starts flowing to the client.
|
||||
*/
|
||||
export type AIStreamParser = (data: string) => { text: string, close: boolean };
|
||||
|
||||
type EventStreamFormat = 'sse' | 'json-nl';
|
||||
type AIStreamParser = (data: string) => { text: string, close: boolean };
|
||||
|
||||
|
||||
const chatStreamInputSchema = z.object({
|
||||
access: z.union([anthropicAccessSchema, ollamaAccessSchema, openAIAccessSchema]),
|
||||
model: openAIModelSchema, history: openAIHistorySchema,
|
||||
const chatStreamingInputSchema = z.object({
|
||||
access: z.union([anthropicAccessSchema, geminiAccessSchema, ollamaAccessSchema, openAIAccessSchema]),
|
||||
model: openAIModelSchema,
|
||||
history: openAIHistorySchema,
|
||||
});
|
||||
export type ChatStreamInputSchema = z.infer<typeof chatStreamInputSchema>;
|
||||
export type ChatStreamingInputSchema = z.infer<typeof chatStreamingInputSchema>;
|
||||
|
||||
const chatStreamFirstPacketSchema = z.object({
|
||||
const chatStreamingFirstOutputPacketSchema = z.object({
|
||||
model: z.string(),
|
||||
});
|
||||
export type ChatStreamFirstPacketSchema = z.infer<typeof chatStreamFirstPacketSchema>;
|
||||
export type ChatStreamingFirstOutputPacketSchema = z.infer<typeof chatStreamingFirstOutputPacketSchema>;
|
||||
|
||||
|
||||
export async function openaiStreamingRelayHandler(req: NextRequest): Promise<Response> {
|
||||
export async function llmStreamingRelayHandler(req: NextRequest): Promise<Response> {
|
||||
|
||||
// inputs - reuse the tRPC schema
|
||||
const { access, model, history } = chatStreamInputSchema.parse(await req.json());
|
||||
const body = await req.json();
|
||||
const { access, model, history } = chatStreamingInputSchema.parse(body);
|
||||
|
||||
// begin event streaming from the OpenAI API
|
||||
let headersUrl: { headers: HeadersInit, url: string } = { headers: {}, url: '' };
|
||||
// access/dialect dependent setup:
|
||||
// - requestAccess: the headers and URL to use for the upstream API call
|
||||
// - muxingFormat: the format of the event stream (sse or json-nl)
|
||||
// - vendorStreamParser: the parser to use for the event stream
|
||||
let upstreamResponse: Response;
|
||||
let requestAccess: { headers: HeadersInit, url: string } = { headers: {}, url: '' };
|
||||
let muxingFormat: MuxingFormat = 'sse';
|
||||
let vendorStreamParser: AIStreamParser;
|
||||
let eventStreamFormat: EventStreamFormat = 'sse';
|
||||
try {
|
||||
|
||||
// prepare the API request data
|
||||
let body: object;
|
||||
switch (access.dialect) {
|
||||
case 'anthropic':
|
||||
headersUrl = anthropicAccess(access, '/v1/complete');
|
||||
requestAccess = anthropicAccess(access, '/v1/complete');
|
||||
body = anthropicChatCompletionPayload(model, history, true);
|
||||
vendorStreamParser = createAnthropicStreamParser();
|
||||
vendorStreamParser = createStreamParserAnthropic();
|
||||
break;
|
||||
|
||||
case 'gemini':
|
||||
requestAccess = geminiAccess(access, model.id, geminiModelsStreamGenerateContentPath);
|
||||
body = geminiGenerateContentTextPayload(model, history, access.minSafetyLevel, 1);
|
||||
vendorStreamParser = createStreamParserGemini(model.id.replace('models/', ''));
|
||||
break;
|
||||
|
||||
case 'ollama':
|
||||
headersUrl = ollamaAccess(access, OLLAMA_PATH_CHAT);
|
||||
requestAccess = ollamaAccess(access, OLLAMA_PATH_CHAT);
|
||||
body = ollamaChatCompletionPayload(model, history, true);
|
||||
eventStreamFormat = 'json-nl';
|
||||
vendorStreamParser = createOllamaChatCompletionStreamParser();
|
||||
muxingFormat = 'json-nl';
|
||||
vendorStreamParser = createStreamParserOllama();
|
||||
break;
|
||||
|
||||
case 'azure':
|
||||
@@ -71,27 +98,27 @@ export async function openaiStreamingRelayHandler(req: NextRequest): Promise<Res
|
||||
case 'oobabooga':
|
||||
case 'openai':
|
||||
case 'openrouter':
|
||||
headersUrl = openAIAccess(access, model.id, '/v1/chat/completions');
|
||||
requestAccess = openAIAccess(access, model.id, '/v1/chat/completions');
|
||||
body = openAIChatCompletionPayload(model, history, null, null, 1, true);
|
||||
vendorStreamParser = createOpenAIStreamParser();
|
||||
vendorStreamParser = createStreamParserOpenAI();
|
||||
break;
|
||||
}
|
||||
|
||||
if (SERVER_DEBUG_WIRE)
|
||||
console.log('-> streaming:', debugGenerateCurlCommand('POST', headersUrl.url, headersUrl.headers, body));
|
||||
console.log('-> streaming:', debugGenerateCurlCommand('POST', requestAccess.url, requestAccess.headers, body));
|
||||
|
||||
// POST to our API route
|
||||
upstreamResponse = await serverFetchOrThrow(headersUrl.url, 'POST', headersUrl.headers, body);
|
||||
upstreamResponse = await serverFetchOrThrow(requestAccess.url, 'POST', requestAccess.headers, body);
|
||||
|
||||
} catch (error: any) {
|
||||
const fetchOrVendorError = safeErrorString(error) + (error?.cause ? ' · ' + error.cause : '');
|
||||
|
||||
// server-side admins message
|
||||
console.error(`/api/llms/stream: fetch issue:`, access.dialect, fetchOrVendorError, headersUrl?.url);
|
||||
console.error(`/api/llms/stream: fetch issue:`, access.dialect, fetchOrVendorError, requestAccess?.url);
|
||||
|
||||
// client-side users visible message
|
||||
return new NextResponse(`[Issue] ${access.dialect}: ${fetchOrVendorError}`
|
||||
+ (process.env.NODE_ENV === 'development' ? ` · [URL: ${headersUrl?.url}]` : ''), { status: 500 });
|
||||
+ (process.env.NODE_ENV === 'development' ? ` · [URL: ${requestAccess?.url}]` : ''), { status: 500 });
|
||||
}
|
||||
|
||||
/* The following code is heavily inspired by the Vercel AI SDK, but simplified to our needs and in full control.
|
||||
@@ -103,8 +130,12 @@ export async function openaiStreamingRelayHandler(req: NextRequest): Promise<Res
|
||||
* NOTE: we have not benchmarked to see if there is performance impact by using this approach - we do want to have
|
||||
* a 'healthy' level of inventory (i.e., pre-buffering) on the pipe to the client.
|
||||
*/
|
||||
const chatResponseStream = (upstreamResponse.body || createEmptyReadableStream())
|
||||
.pipeThrough(createEventStreamTransformer(vendorStreamParser, eventStreamFormat, access.dialect));
|
||||
const transformUpstreamToBigAgiClient = createEventStreamTransformer(
|
||||
muxingFormat, vendorStreamParser, access.dialect,
|
||||
);
|
||||
const chatResponseStream =
|
||||
(upstreamResponse.body || createEmptyReadableStream())
|
||||
.pipeThrough(transformUpstreamToBigAgiClient);
|
||||
|
||||
return new NextResponse(chatResponseStream, {
|
||||
status: 200,
|
||||
@@ -115,114 +146,44 @@ export async function openaiStreamingRelayHandler(req: NextRequest): Promise<Res
|
||||
}
|
||||
|
||||
|
||||
/// Event Parsers
|
||||
|
||||
function createAnthropicStreamParser(): AIStreamParser {
|
||||
let hasBegun = false;
|
||||
|
||||
return (data: string) => {
|
||||
|
||||
const json: AnthropicWire.Complete.Response = JSON.parse(data);
|
||||
let text = json.completion;
|
||||
|
||||
// hack: prepend the model name to the first packet
|
||||
if (!hasBegun) {
|
||||
hasBegun = true;
|
||||
const firstPacket: ChatStreamFirstPacketSchema = { model: json.model };
|
||||
text = JSON.stringify(firstPacket) + text;
|
||||
}
|
||||
|
||||
return { text, close: false };
|
||||
};
|
||||
}
|
||||
|
||||
function createOllamaChatCompletionStreamParser(): AIStreamParser {
|
||||
let hasBegun = false;
|
||||
|
||||
return (data: string) => {
|
||||
|
||||
// parse the JSON chunk
|
||||
let wireJsonChunk: any;
|
||||
try {
|
||||
wireJsonChunk = JSON.parse(data);
|
||||
} catch (error: any) {
|
||||
// log the malformed data to the console, and rethrow to transmit as 'error'
|
||||
console.log(`/api/llms/stream: Ollama parsing issue: ${error?.message || error}`, data);
|
||||
throw error;
|
||||
}
|
||||
|
||||
// validate chunk
|
||||
const chunk = wireOllamaChunkedOutputSchema.parse(wireJsonChunk);
|
||||
|
||||
// pass through errors from Ollama
|
||||
if ('error' in chunk)
|
||||
throw new Error(chunk.error);
|
||||
|
||||
// process output
|
||||
let text = chunk.message?.content || /*chunk.response ||*/ '';
|
||||
|
||||
// hack: prepend the model name to the first packet
|
||||
if (!hasBegun && chunk.model) {
|
||||
hasBegun = true;
|
||||
const firstPacket: ChatStreamFirstPacketSchema = { model: chunk.model };
|
||||
text = JSON.stringify(firstPacket) + text;
|
||||
}
|
||||
|
||||
return { text, close: chunk.done };
|
||||
};
|
||||
}
|
||||
|
||||
function createOpenAIStreamParser(): AIStreamParser {
|
||||
let hasBegun = false;
|
||||
let hasWarned = false;
|
||||
|
||||
return (data: string) => {
|
||||
|
||||
const json: OpenAIWire.ChatCompletion.ResponseStreamingChunk = JSON.parse(data);
|
||||
|
||||
// [OpenAI] an upstream error will be handled gracefully and transmitted as text (throw to transmit as 'error')
|
||||
if (json.error)
|
||||
return { text: `[OpenAI Issue] ${safeErrorString(json.error)}`, close: true };
|
||||
|
||||
// [OpenAI] if there's a warning, log it once
|
||||
if (json.warning && !hasWarned) {
|
||||
hasWarned = true;
|
||||
console.log('/api/llms/stream: OpenAI upstream warning:', json.warning);
|
||||
}
|
||||
|
||||
if (json.choices.length !== 1) {
|
||||
// [Azure] we seem to 'prompt_annotations' or 'prompt_filter_results' objects - which we will ignore to suppress the error
|
||||
if (json.id === '' && json.object === '' && json.model === '')
|
||||
return { text: '', close: false };
|
||||
throw new Error(`Expected 1 completion, got ${json.choices.length}`);
|
||||
}
|
||||
|
||||
const index = json.choices[0].index;
|
||||
if (index !== 0 && index !== undefined /* LocalAI hack/workaround until https://github.com/go-skynet/LocalAI/issues/788 */)
|
||||
throw new Error(`Expected completion index 0, got ${index}`);
|
||||
let text = json.choices[0].delta?.content /*|| json.choices[0]?.text*/ || '';
|
||||
|
||||
// hack: prepend the model name to the first packet
|
||||
if (!hasBegun) {
|
||||
hasBegun = true;
|
||||
const firstPacket: ChatStreamFirstPacketSchema = { model: json.model };
|
||||
text = JSON.stringify(firstPacket) + text;
|
||||
}
|
||||
|
||||
// [LocalAI] workaround: LocalAI doesn't send the [DONE] event, but similarly to OpenAI, it sends a "finish_reason" delta update
|
||||
const close = !!json.choices[0].finish_reason;
|
||||
return { text, close };
|
||||
};
|
||||
}
|
||||
|
||||
|
||||
// Event Stream Transformers
|
||||
|
||||
/**
|
||||
* Creates a parser for a 'JSON\n' non-event stream, to be swapped with an EventSource parser.
|
||||
* Ollama is the only vendor that uses this format.
|
||||
*/
|
||||
function createDemuxerJsonNewline(onParse: EventSourceParseCallback): EventSourceParser {
|
||||
let accumulator: string = '';
|
||||
return {
|
||||
// feeds a new chunk to the parser - we accumulate in case of partial data, and only execute on full lines
|
||||
feed: (chunk: string): void => {
|
||||
accumulator += chunk;
|
||||
if (accumulator.endsWith('\n')) {
|
||||
for (const jsonString of accumulator.split('\n').filter(line => !!line)) {
|
||||
const mimicEvent: ParsedEvent = {
|
||||
type: 'event',
|
||||
id: undefined,
|
||||
event: undefined,
|
||||
data: jsonString,
|
||||
};
|
||||
onParse(mimicEvent);
|
||||
}
|
||||
accumulator = '';
|
||||
}
|
||||
},
|
||||
|
||||
// resets the parser state - not useful with our driving of the parser
|
||||
reset: (): void => {
|
||||
console.error('createDemuxerJsonNewline.reset() not implemented');
|
||||
},
|
||||
};
|
||||
}
|
||||
|
||||
/**
|
||||
* Creates a TransformStream that parses events from an EventSource stream using a custom parser.
|
||||
* @returns {TransformStream<Uint8Array, string>} TransformStream parsing events.
|
||||
*/
|
||||
function createEventStreamTransformer(vendorTextParser: AIStreamParser, inputFormat: EventStreamFormat, dialectLabel: string): TransformStream<Uint8Array, Uint8Array> {
|
||||
function createEventStreamTransformer(muxingFormat: MuxingFormat, vendorTextParser: AIStreamParser, dialectLabel: string): TransformStream<Uint8Array, Uint8Array> {
|
||||
const textDecoder = new TextDecoder();
|
||||
const textEncoder = new TextEncoder();
|
||||
let eventSourceParser: EventSourceParser;
|
||||
@@ -265,10 +226,10 @@ function createEventStreamTransformer(vendorTextParser: AIStreamParser, inputFor
|
||||
}
|
||||
};
|
||||
|
||||
if (inputFormat === 'sse')
|
||||
if (muxingFormat === 'sse')
|
||||
eventSourceParser = createEventsourceParser(onNewEvent);
|
||||
else if (inputFormat === 'json-nl')
|
||||
eventSourceParser = createJsonNewlineParser(onNewEvent);
|
||||
else if (muxingFormat === 'json-nl')
|
||||
eventSourceParser = createDemuxerJsonNewline(onNewEvent);
|
||||
},
|
||||
|
||||
// stream=true is set because the data is not guaranteed to be final and un-chunked
|
||||
@@ -278,33 +239,142 @@ function createEventStreamTransformer(vendorTextParser: AIStreamParser, inputFor
|
||||
});
|
||||
}
|
||||
|
||||
/**
|
||||
* Creates a parser for a 'JSON\n' non-event stream, to be swapped with an EventSource parser.
|
||||
* Ollama is the only vendor that uses this format.
|
||||
*/
|
||||
function createJsonNewlineParser(onParse: EventSourceParseCallback): EventSourceParser {
|
||||
let accumulator: string = '';
|
||||
return {
|
||||
// feeds a new chunk to the parser - we accumulate in case of partial data, and only execute on full lines
|
||||
feed: (chunk: string): void => {
|
||||
accumulator += chunk;
|
||||
if (accumulator.endsWith('\n')) {
|
||||
for (const jsonString of accumulator.split('\n').filter(line => !!line)) {
|
||||
const mimicEvent: ParsedEvent = {
|
||||
type: 'event',
|
||||
id: undefined,
|
||||
event: undefined,
|
||||
data: jsonString,
|
||||
};
|
||||
onParse(mimicEvent);
|
||||
}
|
||||
accumulator = '';
|
||||
}
|
||||
},
|
||||
|
||||
// resets the parser state - not useful with our driving of the parser
|
||||
reset: (): void => {
|
||||
console.error('createJsonNewlineParser.reset() not implemented');
|
||||
},
|
||||
/// Stream Parsers
|
||||
|
||||
function createStreamParserAnthropic(): AIStreamParser {
|
||||
let hasBegun = false;
|
||||
|
||||
return (data: string) => {
|
||||
|
||||
const json: AnthropicWire.Complete.Response = JSON.parse(data);
|
||||
let text = json.completion;
|
||||
|
||||
// hack: prepend the model name to the first packet
|
||||
if (!hasBegun) {
|
||||
hasBegun = true;
|
||||
const firstPacket: ChatStreamingFirstOutputPacketSchema = { model: json.model };
|
||||
text = JSON.stringify(firstPacket) + text;
|
||||
}
|
||||
|
||||
return { text, close: false };
|
||||
};
|
||||
}
|
||||
|
||||
function createStreamParserGemini(modelName: string): AIStreamParser {
|
||||
let hasBegun = false;
|
||||
|
||||
// this can throw, it's catched upstream
|
||||
return (data: string) => {
|
||||
|
||||
// parse the JSON chunk
|
||||
const wireGenerationChunk = JSON.parse(data);
|
||||
const generationChunk = geminiGeneratedContentResponseSchema.parse(wireGenerationChunk);
|
||||
|
||||
// Prompt Safety Errors: pass through errors from Gemini
|
||||
if (generationChunk.promptFeedback?.blockReason) {
|
||||
const { blockReason, safetyRatings } = generationChunk.promptFeedback;
|
||||
return { text: `[Gemini Prompt Blocked] ${blockReason}: ${JSON.stringify(safetyRatings || 'Unknown Safety Ratings', null, 2)}`, close: true };
|
||||
}
|
||||
|
||||
// expect a single completion
|
||||
const singleCandidate = generationChunk.candidates?.[0] ?? null;
|
||||
if (!singleCandidate || !singleCandidate.content?.parts.length)
|
||||
throw new Error(`Gemini: expected 1 completion, got ${generationChunk.candidates?.length}`);
|
||||
|
||||
// expect a single part
|
||||
if (singleCandidate.content.parts.length !== 1 || !('text' in singleCandidate.content.parts[0]))
|
||||
throw new Error(`Gemini: expected 1 text part, got ${singleCandidate.content.parts.length}`);
|
||||
|
||||
// expect a single text in the part
|
||||
let text = singleCandidate.content.parts[0].text || '';
|
||||
|
||||
// hack: prepend the model name to the first packet
|
||||
if (!hasBegun) {
|
||||
hasBegun = true;
|
||||
const firstPacket: ChatStreamingFirstOutputPacketSchema = { model: modelName };
|
||||
text = JSON.stringify(firstPacket) + text;
|
||||
}
|
||||
|
||||
return { text, close: false };
|
||||
};
|
||||
}
|
||||
|
||||
function createStreamParserOllama(): AIStreamParser {
|
||||
let hasBegun = false;
|
||||
|
||||
return (data: string) => {
|
||||
|
||||
// parse the JSON chunk
|
||||
let wireJsonChunk: any;
|
||||
try {
|
||||
wireJsonChunk = JSON.parse(data);
|
||||
} catch (error: any) {
|
||||
// log the malformed data to the console, and rethrow to transmit as 'error'
|
||||
console.log(`/api/llms/stream: Ollama parsing issue: ${error?.message || error}`, data);
|
||||
throw error;
|
||||
}
|
||||
|
||||
// validate chunk
|
||||
const chunk = wireOllamaChunkedOutputSchema.parse(wireJsonChunk);
|
||||
|
||||
// pass through errors from Ollama
|
||||
if ('error' in chunk)
|
||||
throw new Error(chunk.error);
|
||||
|
||||
// process output
|
||||
let text = chunk.message?.content || /*chunk.response ||*/ '';
|
||||
|
||||
// hack: prepend the model name to the first packet
|
||||
if (!hasBegun && chunk.model) {
|
||||
hasBegun = true;
|
||||
const firstPacket: ChatStreamingFirstOutputPacketSchema = { model: chunk.model };
|
||||
text = JSON.stringify(firstPacket) + text;
|
||||
}
|
||||
|
||||
return { text, close: chunk.done };
|
||||
};
|
||||
}
function createStreamParserOpenAI(): AIStreamParser {
  let hasBegun = false;
  let hasWarned = false;

  return (data: string) => {

    const json: OpenAIWire.ChatCompletion.ResponseStreamingChunk = JSON.parse(data);

    // [OpenAI] an upstream error will be handled gracefully and transmitted as text (throw to transmit as 'error')
    if (json.error)
      return { text: `[OpenAI Issue] ${safeErrorString(json.error)}`, close: true };

    // [OpenAI] if there's a warning, log it once
    if (json.warning && !hasWarned) {
      hasWarned = true;
      console.log('/api/llms/stream: OpenAI upstream warning:', json.warning);
    }

    if (json.choices.length !== 1) {
      // [Azure] we seem to receive 'prompt_annotations' or 'prompt_filter_results' objects, which we ignore to suppress the error
      if (json.id === '' && json.object === '' && json.model === '')
        return { text: '', close: false };
      throw new Error(`Expected 1 completion, got ${json.choices.length}`);
    }

    const index = json.choices[0].index;
    if (index !== 0 && index !== undefined /* LocalAI hack/workaround until https://github.com/go-skynet/LocalAI/issues/788 */)
      throw new Error(`Expected completion index 0, got ${index}`);
    let text = json.choices[0].delta?.content /*|| json.choices[0]?.text*/ || '';

    // hack: prepend the model name to the first packet
    if (!hasBegun) {
      hasBegun = true;
      const firstPacket: ChatStreamingFirstOutputPacketSchema = { model: json.model };
      text = JSON.stringify(firstPacket) + text;
    }

    // [LocalAI] workaround: LocalAI doesn't send the [DONE] event, but similarly to OpenAI, it sends a "finish_reason" delta update
    const close = !!json.choices[0].finish_reason;
    return { text, close };
  };
}
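OpenAI-compatible endpoints frame their chunks as server-sent events, so a driver has to strip the `data: ` prefix and swallow the `[DONE]` sentinel before the JSON payload reaches this parser. A simplified sketch of that pre-processing step (real SSE also allows multi-line data fields, which this ignores):

```ts
// Sketch: unwrap SSE events into the JSON payloads createStreamParserOpenAI() expects.
// Note: LocalAI never sends the '[DONE]' sentinel, which is why the parser above
// also closes on a non-empty finish_reason.
function* sseEventsToJsonPayloads(sseText: string): Generator<string> {
  for (const rawLine of sseText.split('\n')) {
    const line = rawLine.trim();
    if (!line.startsWith('data:')) continue; // ignore comments, event:, id:, retry:
    const payload = line.slice('data:'.length).trim();
    if (payload === '[DONE]') return; // OpenAI end-of-stream sentinel
    yield payload;
  }
}
```
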
+11
-1
@@ -1,11 +1,18 @@
import { z } from 'zod';
import { LLM_IF_OAI_Chat, LLM_IF_OAI_Complete, LLM_IF_OAI_Fn, LLM_IF_OAI_Vision } from '../../store-llms';
import { LLM_IF_OAI_Chat, LLM_IF_OAI_Complete, LLM_IF_OAI_Fn, LLM_IF_OAI_Vision } from '../store-llms';


// Model Description: a superset of LLM model descriptors

const pricingSchema = z.object({
  cpmPrompt: z.number().optional(), // Cost per thousand prompt tokens
  cpmCompletion: z.number().optional(), // Cost per thousand completion tokens
});

// const rateLimitsSchema = z.object({
//   reqPerMinute: z.number().optional(),
// });

const modelDescriptionSchema = z.object({
  id: z.string(),
  label: z.string(),
@@ -15,9 +22,12 @@ const modelDescriptionSchema = z.object({
  contextWindow: z.number(),
  maxCompletionTokens: z.number().optional(),
  pricing: pricingSchema.optional(),
  // rateLimits: rateLimitsSchema.optional(),
  interfaces: z.array(z.enum([LLM_IF_OAI_Chat, LLM_IF_OAI_Fn, LLM_IF_OAI_Complete, LLM_IF_OAI_Vision])),
  hidden: z.boolean().optional(),
});

// this is also used by the Client
export type ModelDescriptionSchema = z.infer<typeof modelDescriptionSchema>;
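A quick sketch of how the `cpm*` pricing fields translate into a dollar figure: "cpm" is cost per mille (thousand) tokens, so token counts are divided by 1,000. The numbers below are illustrative, not real price data, and the fields elided by the diff hunk (between `label` and `contextWindow`) are not shown:

```ts
// Illustrative only: pricing in $ per 1,000 tokens, per the cpm* fields above.
const examplePricing = { cpmPrompt: 0.001, cpmCompletion: 0.002 };

const promptTokens = 1200, completionTokens = 300;
const costUsd =
  (examplePricing.cpmPrompt ?? 0) * promptTokens / 1000 +
  (examplePricing.cpmCompletion ?? 0) * completionTokens / 1000;
console.log(costUsd.toFixed(4)); // "0.0018"
```
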
export const listModelsOutputSchema = z.object({
+55
-50
@@ -6,54 +6,59 @@
 * from: https://ollama.ai/library?sort=featured
 */
export const OLLAMA_BASE_MODELS: { [key: string]: { description: string, pulls: number, added?: string } } = {
  'starling-lm': { description: 'Starling is a large language model trained by reinforcement learning from AI feedback focused on improving chatbot helpfulness.', pulls: 2353, added: '20231129' },
  'neural-chat': { description: 'A fine-tuned model based on Mistral with good coverage of domain and language.', pulls: 3089, added: '20231129' },
  'mistral': { description: 'The Mistral 7B model released by Mistral AI', pulls: 70300 },
  'yi': { description: 'A high-performing, bilingual base model.', pulls: 2673 },
  'llama2': { description: 'The most popular model for general use.', pulls: 141000 },
  'codellama': { description: 'A large language model that can use text prompts to generate and discuss code.', pulls: 71400 },
  'llama2-uncensored': { description: 'Uncensored Llama 2 model by George Sung and Jarrad Hope.', pulls: 30900 },
  'orca-mini': { description: 'A general-purpose model ranging from 3 billion parameters to 70 billion, suitable for entry-level hardware.', pulls: 26000 },
  'vicuna': { description: 'General use chat model based on Llama and Llama 2 with 2K to 16K context sizes.', pulls: 21800 },
  'wizard-vicuna-uncensored': { description: 'Wizard Vicuna Uncensored is a 7B, 13B, and 30B parameter model based on Llama 2 uncensored by Eric Hartford.', pulls: 13700 },
  'phind-codellama': { description: 'Code generation model based on CodeLlama.', pulls: 10600 },
  'zephyr': { description: 'Zephyr beta is a fine-tuned 7B version of mistral that was trained on on a mix of publicly available, synthetic datasets.', pulls: 10200 },
  'wizardcoder': { description: 'Llama based code generation model focused on Python.', pulls: 9895 },
  'mistral-openorca': { description: 'Mistral OpenOrca is a 7 billion parameter model, fine-tuned on top of the Mistral 7B model using the OpenOrca dataset.', pulls: 9256 },
  'nous-hermes': { description: 'General use models based on Llama and Llama 2 from Nous Research.', pulls: 8827 },
  'wizard-math': { description: 'Model focused on math and logic problems', pulls: 7849 },
  'llama2-chinese': { description: 'Llama 2 based model fine tuned to improve Chinese dialogue ability.', pulls: 7375 },
  'deepseek-coder': { description: 'DeepSeek Coder is trained from scratch on both 87% code and 13% natural language in English and Chinese. Each of the models are pre-trained on 2 trillion tokens.', pulls: 7335, added: '20231129' },
  'falcon': { description: 'A large language model built by the Technology Innovation Institute (TII) for use in summarization, text generation, and chat bots.', pulls: 6726 },
  'stable-beluga': { description: 'Llama 2 based model fine tuned on an Orca-style dataset. Originally called Free Willy.', pulls: 6272 },
  'codeup': { description: 'Great code generation model based on Llama2.', pulls: 5978 },
  'orca2': { description: 'Orca 2 is built by Microsoft research, and are a fine-tuned version of Meta\'s Llama 2 models. The model is designed to excel particularly in reasoning.', pulls: 5854, added: '20231129' },
  'everythinglm': { description: 'Uncensored Llama2 based model with 16k context size.', pulls: 5040 },
  'medllama2': { description: 'Fine-tuned Llama 2 model to answer medical questions based on an open source medical dataset.', pulls: 4648 },
  'wizardlm-uncensored': { description: 'Uncensored version of Wizard LM model.', pulls: 4536 },
  'dolphin2.2-mistral': { description: 'An instruct-tuned model based on Mistral. Version 2.2 is fine-tuned for improved conversation and empathy.', pulls: 3638 },
  'starcoder': { description: 'StarCoder is a code generation model trained on 80+ programming languages.', pulls: 3638 },
  'wizard-vicuna': { description: 'Wizard Vicuna is a 13B parameter model based on Llama 2 trained by MelodysDreamj.', pulls: 3485 },
  'openchat': { description: 'A family of open-source models trained on a wide variety of data, surpassing ChatGPT on various benchmarks.', pulls: 3438, added: '20231129' },
  'open-orca-platypus2': { description: 'Merge of the Open Orca OpenChat model and the Garage-bAInd Platypus 2 model. Designed for chat and code generation.', pulls: 3145 },
  'openhermes2.5-mistral': { description: 'OpenHermes 2.5 Mistral 7B is a Mistral 7B fine-tune, a continuation of OpenHermes 2 model, which trained on additional code datasets.', pulls: 3023 },
  'yarn-mistral': { description: 'An extension of Mistral to support a context of up to 128k tokens.', pulls: 2775 },
  'samantha-mistral': { description: 'A companion assistant trained in philosophy, psychology, and personal relationships. Based on Mistral.', pulls: 2192 },
  'sqlcoder': { description: 'SQLCoder is a code completion model fined-tuned on StarCoder for SQL generation tasks', pulls: 1973 },
  'yarn-llama2': { description: 'An extension of Llama 2 that supports a context of up to 128k tokens.', pulls: 1915 },
  'openhermes2-mistral': { description: 'OpenHermes 2 Mistral is a 7B model fine-tuned on Mistral with 900,000 entries of primarily GPT-4 generated data from open datasets.', pulls: 1690 },
  'meditron': { description: 'Open-source medical large language model adapted from Llama 2 to the medical domain.', pulls: 1667, added: '20231129' },
  'wizardlm': { description: 'General use 70 billion parameter model based on Llama 2.', pulls: 1379 },
  'mistrallite': { description: 'MistralLite is a fine-tuned model based on Mistral with enhanced capabilities of processing long contexts.', pulls: 1345 },
  'deepseek-llm': { description: 'An advanced language model crafted with 2 trillion bilingual tokens.', pulls: 1318, added: '20231129' },
  'dolphin2.1-mistral': { description: 'An instruct-tuned model based on Mistral and trained on a dataset filtered to remove alignment and bias.', pulls: 1302 },
  'codebooga': { description: 'A high-performing code instruct model created by merging two existing code models.', pulls: 1254 },
  'goliath': { description: 'A language model created by combining two fine-tuned Llama 2 70B models into one.', pulls: 946, added: '20231129' },
  'stablelm-zephyr': { description: 'A lightweight chat model allowing accurate, and responsive output without requiring high-end hardware.', pulls: 945, added: '20231210' },
  'nexusraven': { description: 'Nexus Raven is a 13B instruction tuned model for function calling tasks.', pulls: 860 },
  'magicoder': { description: '🎩 Magicoder is a family of 7B parameter models trained on 75K synthetic instruction data using OSS-Instruct, a novel approach to enlightening LLMs with open-source code snippets.', pulls: 816, added: '20231210' },
  'alfred': { description: 'A robust conversational model designed to be used for both chat and instruct use cases.', pulls: 804, added: '20231129' },
  'xwinlm': { description: 'Conversational model based on Llama 2 that performs competitively on various benchmarks.', pulls: 706 },
  'llama2': { description: 'The most popular model for general use.', pulls: 165600 },
  'mistral': { description: 'The 7B model released by Mistral AI, updated to version 0.2', pulls: 92200 },
  'llava': { description: '🌋 A novel end-to-end trained large multimodal model that combines a vision encoder and Vicuna for general-purpose visual and language understanding.', pulls: 3563, added: '20231215' },
  'mixtral': { description: 'A high-quality Mixture of Experts (MoE) model with open weights by Mistral AI.', pulls: 8277, added: '20231215' },
  'starling-lm': { description: 'Starling is a large language model trained by reinforcement learning from AI feedback focused on improving chatbot helpfulness.', pulls: 3657, added: '20231129' },
  'neural-chat': { description: 'A fine-tuned model based on Mistral with good coverage of domain and language.', pulls: 4647, added: '20231129' },
  'codellama': { description: 'A large language model that can use text prompts to generate and discuss code.', pulls: 79800 },
  'dolphin-mixtral': { description: 'An uncensored, fine-tuned model based on the Mixtral mixture of experts model that excels at coding tasks. Created by Eric Hartford.', pulls: 48400, added: '20231215' },
  'llama2-uncensored': { description: 'Uncensored Llama 2 model by George Sung and Jarrad Hope.', pulls: 36600 },
  'orca-mini': { description: 'A general-purpose model ranging from 3 billion parameters to 70 billion, suitable for entry-level hardware.', pulls: 30000 },
  'vicuna': { description: 'General use chat model based on Llama and Llama 2 with 2K to 16K context sizes.', pulls: 22700 },
  'wizard-vicuna-uncensored': { description: 'Wizard Vicuna Uncensored is a 7B, 13B, and 30B parameter model based on Llama 2 uncensored by Eric Hartford.', pulls: 15300 },
  'zephyr': { description: 'Zephyr beta is a fine-tuned 7B version of mistral that was trained on on a mix of publicly available, synthetic datasets.', pulls: 11500 },
  'phind-codellama': { description: 'Code generation model based on CodeLlama.', pulls: 11200 },
  'wizardcoder': { description: 'Llama based code generation model focused on Python.', pulls: 10700 },
  'deepseek-coder': { description: 'DeepSeek Coder is trained from scratch on both 87% code and 13% natural language in English and Chinese. Each of the models are pre-trained on 2 trillion tokens.', pulls: 10200 },
  'mistral-openorca': { description: 'Mistral OpenOrca is a 7 billion parameter model, fine-tuned on top of the Mistral 7B model using the OpenOrca dataset.', pulls: 9842 },
  'nous-hermes': { description: 'General use models based on Llama and Llama 2 from Nous Research.', pulls: 9071 },
  'wizard-math': { description: 'Model focused on math and logic problems', pulls: 8328 },
  'llama2-chinese': { description: 'Llama 2 based model fine tuned to improve Chinese dialogue ability.', pulls: 8111 },
  'orca2': { description: 'Orca 2 is built by Microsoft research, and are a fine-tuned version of Meta\'s Llama 2 models. The model is designed to excel particularly in reasoning.', pulls: 7492, added: '20231129' },
  'falcon': { description: 'A large language model built by the Technology Innovation Institute (TII) for use in summarization, text generation, and chat bots.', pulls: 7468 },
  'stable-beluga': { description: 'Llama 2 based model fine tuned on an Orca-style dataset. Originally called Free Willy.', pulls: 6468 },
  'codeup': { description: 'Great code generation model based on Llama2.', pulls: 6397 },
  'everythinglm': { description: 'Uncensored Llama2 based model with 16k context size.', pulls: 5347 },
  'medllama2': { description: 'Fine-tuned Llama 2 model to answer medical questions based on an open source medical dataset.', pulls: 5034 },
  'wizardlm-uncensored': { description: 'Uncensored version of Wizard LM model.', pulls: 4874 },
  'dolphin2.2-mistral': { description: 'An instruct-tuned model based on Mistral. Version 2.2 is fine-tuned for improved conversation and empathy.', pulls: 4686 },
  'openchat': { description: 'A family of open-source models trained on a wide variety of data, surpassing ChatGPT on various benchmarks. Updated to version 3.5-1210.', pulls: 4496, added: '20231129' },
  'starcoder': { description: 'StarCoder is a code generation model trained on 80+ programming languages.', pulls: 4331 },
  'openhermes2.5-mistral': { description: 'OpenHermes 2.5 Mistral 7B is a Mistral 7B fine-tune, a continuation of OpenHermes 2 model, which trained on additional code datasets.', pulls: 3722 },
  'wizard-vicuna': { description: 'Wizard Vicuna is a 13B parameter model based on Llama 2 trained by MelodysDreamj.', pulls: 3668 },
  'yi': { description: 'A high-performing, bilingual base model.', pulls: 3335 },
  'open-orca-platypus2': { description: 'Merge of the Open Orca OpenChat model and the Garage-bAInd Platypus 2 model. Designed for chat and code generation.', pulls: 3219 },
  'yarn-mistral': { description: 'An extension of Mistral to support a context of up to 128k tokens.', pulls: 3087 },
  'samantha-mistral': { description: 'A companion assistant trained in philosophy, psychology, and personal relationships. Based on Mistral.', pulls: 2518 },
  'sqlcoder': { description: 'SQLCoder is a code completion model fined-tuned on StarCoder for SQL generation tasks', pulls: 2338 },
  'meditron': { description: 'Open-source medical large language model adapted from Llama 2 to the medical domain.', pulls: 2216, added: '20231129' },
  'yarn-llama2': { description: 'An extension of Llama 2 that supports a context of up to 128k tokens.', pulls: 2201 },
  'stablelm-zephyr': { description: 'A lightweight chat model allowing accurate, and responsive output without requiring high-end hardware.', pulls: 1983, added: '20231210' },
  'openhermes2-mistral': { description: 'OpenHermes 2 Mistral is a 7B model fine-tuned on Mistral with 900,000 entries of primarily GPT-4 generated data from open datasets.', pulls: 1790 },
  'deepseek-llm': { description: 'An advanced language model crafted with 2 trillion bilingual tokens.', pulls: 1732, added: '20231129' },
  'dolphin2.1-mistral': { description: 'An instruct-tuned model based on Mistral and trained on a dataset filtered to remove alignment and bias.', pulls: 1598 },
  'mistrallite': { description: 'MistralLite is a fine-tuned model based on Mistral with enhanced capabilities of processing long contexts.', pulls: 1534 },
  'wizardlm': { description: 'General use 70 billion parameter model based on Llama 2.', pulls: 1454 },
  'codebooga': { description: 'A high-performing code instruct model created by merging two existing code models.', pulls: 1418 },
  'phi': { description: 'Phi-2: a 2.7B language model by Microsoft Research that demonstrates outstanding reasoning and language understanding capabilities.', pulls: 1304, added: '20231220' },
  'bakllava': { description: 'BakLLaVA is a multimodal model consisting of the Mistral 7B base model augmented with the LLaVA architecture.', pulls: 1189, added: '20231215' },
  'goliath': { description: 'A language model created by combining two fine-tuned Llama 2 70B models into one.', pulls: 1140, added: '20231129' },
  'nexusraven': { description: 'Nexus Raven is a 13B instruction tuned model for function calling tasks.', pulls: 1060 },
  'solar': { description: 'A compact, yet powerful 10.7B large language model designed for single-turn conversation.', pulls: 934 },
  'alfred': { description: 'A robust conversational model designed to be used for both chat and instruct use cases.', pulls: 902, added: '20231129' },
  'xwinlm': { description: 'Conversational model based on Llama 2 that performs competitively on various benchmarks.', pulls: 868 },
};
// export const OLLAMA_LAST_UPDATE: string = '20231210';
export const OLLAMA_PREV_UPDATE: string = '20231129';
// export const OLLAMA_LAST_UPDATE: string = '20231220';
export const OLLAMA_PREV_UPDATE: string = '20231210';
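The `added` fields and `OLLAMA_PREV_UPDATE` are `YYYYMMDD` strings, so plain lexicographic comparison is enough to decide which models deserve the NEW badge in the admin UI. A small sketch — the helper name is made up for illustration:

```ts
// Hypothetical helper: a model is 'new' if it was added after the previous refresh date.
// YYYYMMDD strings compare correctly with ordinary string comparison.
function isNewModel(modelId: string): boolean {
  const added = OLLAMA_BASE_MODELS[modelId]?.added;
  return !!added && added > OLLAMA_PREV_UPDATE;
}

console.log(isNewModel('phi'));    // true  ('20231220' > '20231210')
console.log(isNewModel('orca2'));  // false ('20231129' is not after '20231210')
console.log(isNewModel('llama2')); // false (no 'added' date)
```
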
+2
-2
@@ -5,12 +5,12 @@ import { createTRPCRouter, publicProcedure } from '~/server/api/trpc.server';
import { env } from '~/server/env.mjs';
import { fetchJsonOrTRPCError, fetchTextOrTRPCError } from '~/server/api/trpc.serverutils';

import { LLM_IF_OAI_Chat } from '../../../store-llms';
import { LLM_IF_OAI_Chat } from '../../store-llms';

import { capitalizeFirstLetter } from '~/common/util/textUtils';

import { fixupHost, openAIChatGenerateOutputSchema, OpenAIHistorySchema, openAIHistorySchema, OpenAIModelSchema, openAIModelSchema } from '../openai/openai.router';
import { listModelsOutputSchema, ModelDescriptionSchema } from '../server.schemas';
import { listModelsOutputSchema, ModelDescriptionSchema } from '../llm.server.types';

import { OLLAMA_BASE_MODELS, OLLAMA_PREV_UPDATE } from './ollama.models';
import { WireOllamaChatCompletionInput, wireOllamaChunkedOutputSchema } from './ollama.wiretypes';
+19
-17
@@ -1,8 +1,8 @@
import { SERVER_DEBUG_WIRE } from '~/server/wire';

import { LLM_IF_OAI_Chat, LLM_IF_OAI_Complete, LLM_IF_OAI_Fn, LLM_IF_OAI_Vision } from '../../../store-llms';
import { LLM_IF_OAI_Chat, LLM_IF_OAI_Complete, LLM_IF_OAI_Fn, LLM_IF_OAI_Vision } from '../../store-llms';

import type { ModelDescriptionSchema } from '../server.schemas';
import type { ModelDescriptionSchema } from '../llm.server.types';
import { wireMistralModelsListOutputSchema } from './mistral.wiretypes';


@@ -313,16 +313,16 @@ const orModelMap: { [id: string]: { name: string; cw: number; cp?: number; cc?:
  'huggingfaceh4/zephyr-7b-beta': { name: 'Hugging Face: Zephyr 7B', cw: 4096, cp: 0, cc: 0, unfilt: true },
  'openchat/openchat-7b': { name: 'OpenChat 3.5', cw: 8192, cp: 0, cc: 0, unfilt: true },
  'gryphe/mythomist-7b': { name: 'MythoMist 7B', cw: 32768, cp: 0, cc: 0, unfilt: true },
  'openrouter/cinematika-7b': { name: 'Cinematika 7B (alpha)', cw: 32000, cp: 0, cc: 0, unfilt: true },
  'mistralai/mixtral-8x7b-instruct': { name: 'Mistral: Mixtral 8x7B Instruct (beta)', cw: 32000, cp: 0, cc: 0, unfilt: true },
  'openrouter/cinematika-7b': { name: 'Cinematika 7B (alpha)', cw: 32768, cp: 0, cc: 0, unfilt: true },
  'rwkv/rwkv-5-world-3b': { name: 'RWKV v5 World 3B (beta)', cw: 10000, cp: 0, cc: 0, unfilt: true },
  'recursal/rwkv-5-3b-ai-town': { name: 'RWKV v5 3B AI Town (beta)', cw: 10000, cp: 0, cc: 0, unfilt: true },
  'jebcarter/psyfighter-13b': { name: 'Psyfighter 13B', cw: 4096, cp: 0.001, cc: 0.001, unfilt: true },
  'koboldai/psyfighter-13b-2': { name: 'Psyfighter v2 13B', cw: 4096, cp: 0.001, cc: 0.001, unfilt: true },
  'jebcarter/psyfighter-13b': { name: 'Psyfighter 13B', cw: 4096, cp: 0.0001, cc: 0.0001, unfilt: true },
  'koboldai/psyfighter-13b-2': { name: 'Psyfighter v2 13B', cw: 4096, cp: 0.0001, cc: 0.0001, unfilt: true },
  'nousresearch/nous-hermes-llama2-13b': { name: 'Nous: Hermes 13B', cw: 4096, cp: 0.000075, cc: 0.000075, unfilt: true },
  'meta-llama/codellama-34b-instruct': { name: 'Meta: CodeLlama 34B Instruct', cw: 8192, cp: 0.0002, cc: 0.0002, unfilt: true },
  'phind/phind-codellama-34b': { name: 'Phind: CodeLlama 34B v2', cw: 4096, cp: 0.0002, cc: 0.0002, unfilt: true },
  'intel/neural-chat-7b': { name: 'Neural Chat 7B v3.1', cw: 4096, cp: 0.0025, cc: 0.0025, unfilt: true },
  'mistralai/mixtral-8x7b-instruct': { name: 'Mistral: Mixtral 8x7B Instruct (beta)', cw: 32768, cp: 0.0003, cc: 0.0003, unfilt: true },
  'haotian-liu/llava-13b': { name: 'Llava 13B', cw: 2048, cp: 0.0025, cc: 0.0025, unfilt: true },
  'nousresearch/nous-hermes-2-vision-7b': { name: 'Nous: Hermes 2 Vision 7B (alpha)', cw: 4096, cp: 0.0025, cc: 0.0025, unfilt: true },
  'meta-llama/llama-2-13b-chat': { name: 'Meta: Llama v2 13B Chat', cw: 4096, cp: 0.000156755, cc: 0.000156755, unfilt: true },
@@ -334,10 +334,12 @@ const orModelMap: { [id: string]: { name: string; cw: number; cp?: number; cc?:
  'openai/gpt-4-32k': { name: 'OpenAI: GPT-4 32k', cw: 32767, cp: 0.06, cc: 0.12, unfilt: false },
  'openai/gpt-4-vision-preview': { name: 'OpenAI: GPT-4 Vision (preview)', cw: 128000, cp: 0.01, cc: 0.03, unfilt: false },
  'openai/gpt-3.5-turbo-instruct': { name: 'OpenAI: GPT-3.5 Turbo Instruct', cw: 4095, cp: 0.0015, cc: 0.002, unfilt: false },
  'google/palm-2-chat-bison': { name: 'Google: PaLM 2 Chat', cw: 9216, cp: 0.0005, cc: 0.0005, unfilt: true },
  'google/palm-2-codechat-bison': { name: 'Google: PaLM 2 Code Chat', cw: 7168, cp: 0.0005, cc: 0.0005, unfilt: true },
  'google/palm-2-chat-bison-32k': { name: 'Google: PaLM 2 Chat 32k', cw: 32000, cp: 0.0005, cc: 0.0005, unfilt: true },
  'google/palm-2-codechat-bison-32k': { name: 'Google: PaLM 2 Code Chat 32k', cw: 32000, cp: 0.0005, cc: 0.0005, unfilt: true },
  'google/palm-2-chat-bison': { name: 'Google: PaLM 2 Chat', cw: 36864, cp: 0.00025, cc: 0.0005, unfilt: true },
  'google/palm-2-codechat-bison': { name: 'Google: PaLM 2 Code Chat', cw: 28672, cp: 0.00025, cc: 0.0005, unfilt: true },
  'google/palm-2-chat-bison-32k': { name: 'Google: PaLM 2 Chat 32k', cw: 131072, cp: 0.00025, cc: 0.0005, unfilt: true },
  'google/palm-2-codechat-bison-32k': { name: 'Google: PaLM 2 Code Chat 32k', cw: 131072, cp: 0.00025, cc: 0.0005, unfilt: true },
  'google/gemini-pro': { name: 'Google: Gemini Pro (preview)', cw: 131040, cp: 0.00025, cc: 0.0005, unfilt: true },
  'google/gemini-pro-vision': { name: 'Google: Gemini Pro Vision (preview)', cw: 65536, cp: 0.00025, cc: 0.0005, unfilt: true },
  'perplexity/pplx-70b-online': { name: 'Perplexity: PPLX 70B Online', cw: 4096, cp: 0, cc: 0.0028, unfilt: true },
  'perplexity/pplx-7b-online': { name: 'Perplexity: PPLX 7B Online', cw: 4096, cp: 0, cc: 0.00028, unfilt: true },
  'perplexity/pplx-7b-chat': { name: 'Perplexity: PPLX 7B Chat', cw: 8192, cp: 0.00007, cc: 0.00028, unfilt: true },
@@ -347,7 +349,7 @@ const orModelMap: { [id: string]: { name: string; cw: number; cp?: number; cc?:
  'nousresearch/nous-capybara-34b': { name: 'Nous: Capybara 34B', cw: 32000, cp: 0.0007, cc: 0.0028, unfilt: true },
  'jondurbin/airoboros-l2-70b': { name: 'Airoboros 70B', cw: 4096, cp: 0.0007, cc: 0.00095, unfilt: true },
  'migtissera/synthia-70b': { name: 'Synthia 70B', cw: 8192, cp: 0.00375, cc: 0.00375, unfilt: true },
  'open-orca/mistral-7b-openorca': { name: 'Mistral OpenOrca 7B', cw: 8192, cp: 0.0002, cc: 0.0002, unfilt: true },
  'open-orca/mistral-7b-openorca': { name: 'Mistral OpenOrca 7B', cw: 8192, cp: 0.0001425006, cc: 0.0001425006, unfilt: true },
  'teknium/openhermes-2-mistral-7b': { name: 'OpenHermes 2 Mistral 7B', cw: 4096, cp: 0.0002, cc: 0.0002, unfilt: true },
  'teknium/openhermes-2.5-mistral-7b': { name: 'OpenHermes 2.5 Mistral 7B', cw: 4096, cp: 0.0002, cc: 0.0002, unfilt: true },
  'pygmalionai/mythalion-13b': { name: 'Pygmalion: Mythalion 13B', cw: 8192, cp: 0.001125, cc: 0.001125, unfilt: true },
@@ -361,9 +363,9 @@ const orModelMap: { [id: string]: { name: string; cw: number; cp?: number; cc?:
  '01-ai/yi-34b-chat': { name: 'Yi 34B Chat', cw: 4096, cp: 0.0008, cc: 0.0008, unfilt: true },
  '01-ai/yi-34b': { name: 'Yi 34B (base)', cw: 4096, cp: 0.0008, cc: 0.0008, unfilt: true },
  '01-ai/yi-6b': { name: 'Yi 6B (base)', cw: 4096, cp: 0.00014, cc: 0.00014, unfilt: true },
  'togethercomputer/stripedhyena-nous-7b': { name: 'StripedHyena Nous 7B', cw: 32000, cp: 0.0002, cc: 0.0002, unfilt: true },
  'togethercomputer/stripedhyena-hessian-7b': { name: 'StripedHyena Hessian 7B (base)', cw: 32000, cp: 0.0002, cc: 0.0002, unfilt: true },
  'mistralai/mixtral-8x7b': { name: 'Mistral: Mixtral 8x7B (base) (beta)', cw: 32000, cp: 0.0006, cc: 0.0006, unfilt: true },
  'togethercomputer/stripedhyena-nous-7b': { name: 'StripedHyena Nous 7B', cw: 32768, cp: 0.0002, cc: 0.0002, unfilt: true },
  'togethercomputer/stripedhyena-hessian-7b': { name: 'StripedHyena Hessian 7B (base)', cw: 32768, cp: 0.0002, cc: 0.0002, unfilt: true },
  'mistralai/mixtral-8x7b': { name: 'Mistral: Mixtral 8x7B (base) (beta)', cw: 32768, cp: 0.0006, cc: 0.0006, unfilt: true },
  'anthropic/claude-2': { name: 'Anthropic: Claude v2.1', cw: 200000, cp: 0.008, cc: 0.024, unfilt: false },
  'anthropic/claude-2.0': { name: 'Anthropic: Claude v2.0', cw: 100000, cp: 0.008, cc: 0.024, unfilt: false },
  'anthropic/claude-instant-v1': { name: 'Anthropic: Claude Instant v1', cw: 100000, cp: 0.00163, cc: 0.00551, unfilt: false },
@@ -382,10 +384,10 @@ const orModelMap: { [id: string]: { name: string; cw: number; cp?: number; cc?:
};

const orModelFamilyOrder = [
  // great models
  'mistralai/mixtral-8x7b-instruct', 'mistralai/mistral-7b-instruct', 'nousresearch/nous-capybara-7b',
  // great models (picked by hand, they're free)
  'mistralai/mistral-7b-instruct', 'nousresearch/nous-capybara-7b',
  // great orgs
  'huggingfaceh4/', 'openchat/', 'anthropic/', 'google/', 'openai/', 'meta-llama/', 'phind/',
  'huggingfaceh4/', 'openchat/', 'anthropic/', 'google/', 'mistralai/', 'openai/', 'meta-llama/', 'phind/',
];

export function openRouterModelFamilySortFn(a: { id: string }, b: { id: string }): number {
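The hunk cuts off at the signature. A plausible sketch of how a family-order comparator like this one could work, based purely on the `orModelFamilyOrder` prefix list above — an assumption, not the project's actual function body:

```ts
// Sketch: rank ids by the first matching prefix in orModelFamilyOrder;
// unlisted families sort after all listed ones, ties broken alphabetically.
function familyRank(id: string): number {
  const idx = orModelFamilyOrder.findIndex(prefix => id.startsWith(prefix));
  return idx === -1 ? orModelFamilyOrder.length : idx;
}

function sortFnSketch(a: { id: string }, b: { id: string }): number {
  return familyRank(a.id) - familyRank(b.id) || a.id.localeCompare(b.id);
}
```
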
+1
-1
@@ -8,7 +8,7 @@ import { fetchJsonOrTRPCError } from '~/server/api/trpc.serverutils';
import { Brand } from '~/common/app.config';

import type { OpenAIWire } from './openai.wiretypes';
import { listModelsOutputSchema, ModelDescriptionSchema } from '../server.schemas';
import { listModelsOutputSchema, ModelDescriptionSchema } from '../llm.server.types';
import { localAIModelToModelDescription, mistralModelsSort, mistralModelToModelDescription, oobaboogaModelToModelDescription, openAIModelToModelDescription, openRouterModelFamilySortFn, openRouterModelToModelDescription } from './models.data';


@@ -2,7 +2,7 @@ import { create } from 'zustand';
import { shallow } from 'zustand/shallow';
import { persist } from 'zustand/middleware';

import type { IModelVendor, ModelVendorId } from './vendors/IModelVendor';
import type { ModelVendorId } from './vendors/vendors.registry';
import type { SourceSetupOpenRouter } from './vendors/openrouter/openrouter.vendor';


@@ -16,6 +16,7 @@ export interface DLLM<TSourceSetup = unknown, TLLMOptions = unknown> {
  updated?: number | 0;
  description: string;
  tags: string[]; // UNUSED for now
  // modelcaps: DModelCapability[];
  contextTokens: number;
  maxOutputTokens: number;
  hidden: boolean;
@@ -30,6 +31,17 @@ export interface DLLM<TSourceSetup = unknown, TLLMOptions = unknown> {

export type DLLMId = string;

// export type DModelCapability =
//   | 'input-text'
//   | 'input-image-data'
//   | 'input-multipart'
//   | 'output-text'
//   | 'output-function'
//   | 'output-image-data'
//   | 'if-chat'
//   | 'if-fast-chat'
//   ;

// Model interfaces (chat, and function calls) - here as a preview, will be used more broadly in the future
export const LLM_IF_OAI_Chat = 'oai-chat';
export const LLM_IF_OAI_Vision = 'oai-vision';
@@ -269,32 +281,3 @@ export function useChatLLM() {
  }, shallow);
}

/**
 * Source-specific read/write - great time saver
 */
export function useSourceSetup<TSourceSetup, TAccess>(sourceId: DModelSourceId, vendor: IModelVendor<TSourceSetup, TAccess>) {

  // invalidates only when the setup changes
  const { updateSourceSetup, ...rest } = useModelsStore(state => {

    // find the source (or null)
    const source: DModelSource<TSourceSetup> | null = state.sources.find(source => source.id === sourceId) as DModelSource<TSourceSetup> ?? null;

    // (safe) source-derived properties
    const sourceSetupValid = (source?.setup && vendor?.validateSetup) ? vendor.validateSetup(source.setup as TSourceSetup) : false;
    const sourceLLMs = source ? state.llms.filter(llm => llm._source === source) : [];
    const access = vendor.getTransportAccess(source?.setup);

    return {
      source,
      access,
      sourceHasLLMs: !!sourceLLMs.length,
      sourceSetupValid,
      updateSourceSetup: state.updateSourceSetup,
    };
  }, shallow);

  // convenience function for this source
  const updateSetup = (partialSetup: Partial<TSourceSetup>) => updateSourceSetup<TSourceSetup>(sourceId, partialSetup);
  return { ...rest, updateSetup };
}
@@ -1,34 +0,0 @@
import type { DLLMId } from '../store-llms';
import type { OpenAIWire } from './server/openai/openai.wiretypes';
import { findVendorForLlmOrThrow } from '../vendors/vendors.registry';


export interface VChatMessageIn {
  role: 'assistant' | 'system' | 'user'; // | 'function';
  content: string;
  //name?: string; // when role: 'function'
}

export type VChatFunctionIn = OpenAIWire.ChatCompletion.RequestFunctionDef;

export interface VChatMessageOut {
  role: 'assistant' | 'system' | 'user';
  content: string;
  finish_reason: 'stop' | 'length' | null;
}

export interface VChatMessageOrFunctionCallOut extends VChatMessageOut {
  function_name: string;
  function_arguments: object | null;
}


export async function callChatGenerate(llmId: DLLMId, messages: VChatMessageIn[], maxTokens?: number): Promise<VChatMessageOut> {
  const { llm, vendor } = findVendorForLlmOrThrow(llmId);
  return await vendor.callChatGenerate(llm, messages, maxTokens);
}

export async function callChatGenerateWithFunctions(llmId: DLLMId, messages: VChatMessageIn[], functions: VChatFunctionIn[], forceFunctionName: string | null, maxTokens?: number): Promise<VChatMessageOrFunctionCallOut> {
  const { llm, vendor } = findVendorForLlmOrThrow(llmId);
  return await vendor.callChatGenerateWF(llm, messages, functions, forceFunctionName, maxTokens);
}
+29
-9
@@ -1,13 +1,12 @@
import type React from 'react';
import type { TRPCClientErrorBase } from '@trpc/client';

import type { DLLM, DModelSourceId } from '../store-llms';
import type { VChatFunctionIn, VChatMessageIn, VChatMessageOrFunctionCallOut, VChatMessageOut } from '../transports/chatGenerate';
import type { DLLM, DLLMId, DModelSourceId } from '../store-llms';
import type { ModelDescriptionSchema } from '../server/llm.server.types';
import type { ModelVendorId } from './vendors.registry';
import type { VChatFunctionIn, VChatMessageIn, VChatMessageOrFunctionCallOut, VChatMessageOut } from '~/modules/llms/llm.client';


export type ModelVendorId = 'anthropic' | 'azure' | 'localai' | 'mistral' | 'ollama' | 'oobabooga' | 'openai' | 'openrouter';

export type ModelVendorRegistryType = Record<ModelVendorId, IModelVendor>;

export interface IModelVendor<TSourceSetup = unknown, TAccess = unknown, TLLMOptions = unknown, TDLLM = DLLM<TSourceSetup, TLLMOptions>> {
  readonly id: ModelVendorId;
  readonly name: string;
@@ -30,7 +29,28 @@ export interface IModelVendor<TSourceSetup = unknown, TAccess = unknown, TLLMOpt

  getTransportAccess(setup?: Partial<TSourceSetup>): TAccess;

  callChatGenerate(llm: TDLLM, messages: VChatMessageIn[], maxTokens?: number): Promise<VChatMessageOut>;
  rpcUpdateModelsQuery: (
    access: TAccess,
    enabled: boolean,
    onSuccess: (data: { models: ModelDescriptionSchema[] }) => void,
  ) => { isFetching: boolean, refetch: () => void, isError: boolean, error: TRPCClientErrorBase<any> | null };

  callChatGenerateWF(llm: TDLLM, messages: VChatMessageIn[], functions: null | VChatFunctionIn[], forceFunctionName: null | string, maxTokens?: number): Promise<VChatMessageOrFunctionCallOut>;
}
  rpcChatGenerateOrThrow: (
    access: TAccess,
    llmOptions: TLLMOptions,
    messages: VChatMessageIn[],
    functions: VChatFunctionIn[] | null, forceFunctionName: string | null,
    maxTokens?: number,
  ) => Promise<VChatMessageOut | VChatMessageOrFunctionCallOut>;

  streamingChatGenerateOrThrow: (
    access: TAccess,
    llmId: DLLMId,
    llmOptions: TLLMOptions,
    messages: VChatMessageIn[],
    functions: VChatFunctionIn[] | null, forceFunctionName: string | null,
    abortSignal: AbortSignal,
    onUpdate: (update: Partial<{ text: string, typing: boolean, originLLM: string }>, done: boolean) => void,
  ) => Promise<void>;

}
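
A sketch of how a caller could drive the new streaming entry point, accumulating text through `onUpdate` and cancelling via the abort signal. The `vendor`, `access`, and `llmOptions` values are placeholders, and whether `update.text` is cumulative or incremental is not visible in this diff — the sketch assumes cumulative:

```ts
// Sketch: consume streamingChatGenerateOrThrow with an accumulator and cancellation.
// Assumption: update.text carries the full text-so-far; vendor/access/llmOptions
// stand in for real vendor-specific values.
async function streamOnce(vendor: IModelVendor, access: unknown, llmId: DLLMId, llmOptions: unknown): Promise<string> {
  const controller = new AbortController();
  let lastText = '';
  await vendor.streamingChatGenerateOrThrow(
    access,
    llmId,
    llmOptions,
    [{ role: 'user', content: 'Hello!' }],
    null, null,        // no functions, no forced function name
    controller.signal, // call controller.abort() to cancel mid-stream
    (update, done) => {
      if (update.text !== undefined) lastText = update.text;
      if (done) console.log('stream finished, model:', update.originLLM);
    },
  );
  return lastText;
}
```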

+5
-11
@@ -7,11 +7,11 @@ import { FormTextField } from '~/common/components/forms/FormTextField';
import { InlineError } from '~/common/components/InlineError';
import { Link } from '~/common/components/Link';
import { SetupFormRefetchButton } from '~/common/components/forms/SetupFormRefetchButton';
import { apiQuery } from '~/common/util/trpc.client';
import { useToggleableBoolean } from '~/common/util/useToggleableBoolean';

import { DModelSourceId, useModelsStore, useSourceSetup } from '../../store-llms';
import { modelDescriptionToDLLM } from '../openai/OpenAISourceSetup';
import { DModelSourceId } from '../../store-llms';
import { useLlmUpdateModels } from '../useLlmUpdateModels';
import { useSourceSetup } from '../useSourceSetup';

import { isValidAnthropicApiKey, ModelVendorAnthropic } from './anthropic.vendor';

@@ -34,14 +34,8 @@ export function AnthropicSourceSetup(props: { sourceId: DModelSourceId }) {
  const shallFetchSucceed = anthropicKey ? keyValid : (!needsUserKey || !!anthropicHost);

  // fetch models
  const { isFetching, refetch, isError, error } = apiQuery.llmAnthropic.listModels.useQuery({ access }, {
    enabled: !sourceHasLLMs && shallFetchSucceed,
    onSuccess: models => source && useModelsStore.getState().setLLMs(
      models.models.map(model => modelDescriptionToDLLM(model, source)),
      props.sourceId,
    ),
    staleTime: Infinity,
  });
  const { isFetching, refetch, isError, error } =
    useLlmUpdateModels(ModelVendorAnthropic, access, !sourceHasLLMs && shallFetchSucceed, source);

  return <>

+41
-35
@@ -1,11 +1,12 @@
import { backendCaps } from '~/modules/backend/state-backend';

import { AnthropicIcon } from '~/common/components/icons/AnthropicIcon';
import { apiAsync } from '~/common/util/trpc.client';
import { apiAsync, apiQuery } from '~/common/util/trpc.client';

import type { AnthropicAccessSchema } from '../../server/anthropic/anthropic.router';
import type { IModelVendor } from '../IModelVendor';
import type { AnthropicAccessSchema } from '../../transports/server/anthropic/anthropic.router';
import type { VChatMessageIn, VChatMessageOrFunctionCallOut, VChatMessageOut } from '../../transports/chatGenerate';
import type { VChatMessageOut } from '../../llm.client';
import { unifiedStreamingClient } from '../unifiedStreamingClient';

import { LLMOptionsOpenAI } from '../openai/openai.vendor';
import { OpenAILLMOptions } from '../openai/OpenAILLMOptions';
@@ -14,7 +15,7 @@ import { AnthropicSourceSetup } from './AnthropicSourceSetup';


// special symbols
export const isValidAnthropicApiKey = (apiKey?: string) => !!apiKey && (apiKey.startsWith('sk-') ? apiKey.length > 40 : apiKey.length >= 40);
export const isValidAnthropicApiKey = (apiKey?: string) => !!apiKey && (apiKey.startsWith('sk-') ? apiKey.length >= 39 : apiKey.length >= 40);

export interface SourceSetupAnthropic {
  anthropicKey: string;
@@ -42,37 +43,42 @@ export const ModelVendorAnthropic: IModelVendor<SourceSetupAnthropic, AnthropicA
    anthropicHost: partialSetup?.anthropicHost || null,
    heliconeKey: partialSetup?.heliconeKey || null,
  }),
  callChatGenerate(llm, messages: VChatMessageIn[], maxTokens?: number): Promise<VChatMessageOut> {
    return anthropicCallChatGenerate(this.getTransportAccess(llm._source.setup), llm.options, messages, /*null, null,*/ maxTokens);

  // List Models
  rpcUpdateModelsQuery: (access, enabled, onSuccess) => {
    return apiQuery.llmAnthropic.listModels.useQuery({ access }, {
      enabled: enabled,
      onSuccess: onSuccess,
      refetchOnWindowFocus: false,
      staleTime: Infinity,
    });
  },
  callChatGenerateWF(): Promise<VChatMessageOrFunctionCallOut> {
    throw new Error('Anthropic does not support "Functions" yet');

  // Chat Generate (non-streaming) with Functions
  rpcChatGenerateOrThrow: async (access, llmOptions, messages, functions, forceFunctionName, maxTokens) => {
    if (functions?.length || forceFunctionName)
      throw new Error('Anthropic does not support functions');

    const { llmRef, llmTemperature = 0.5, llmResponseTokens } = llmOptions;
    try {
      return await apiAsync.llmAnthropic.chatGenerate.mutate({
        access,
        model: {
          id: llmRef!,
          temperature: llmTemperature,
          maxTokens: maxTokens || llmResponseTokens || 1024,
        },
        history: messages,
      }) as VChatMessageOut;
    } catch (error: any) {
      const errorMessage = error?.message || error?.toString() || 'Anthropic Chat Generate Error';
      console.error(`anthropic.rpcChatGenerateOrThrow: ${errorMessage}`);
      throw new Error(errorMessage);
    }
  },

  // Chat Generate (streaming) with Functions
  streamingChatGenerateOrThrow: unifiedStreamingClient,

};
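
For the non-streaming path, the same vendor object can be called directly. A short sketch — the access fields and `llmRef` are placeholders (the full `AnthropicAccessSchema` shape is only partially visible in this diff), not working credentials:

```ts
// Sketch: one-shot (non-streaming) generation through the vendor object above.
// Every literal here is a placeholder, including the 'dialect' field, which is
// an assumption about the access schema.
async function askClaudeOnce(): Promise<string> {
  const out = await ModelVendorAnthropic.rpcChatGenerateOrThrow(
    { dialect: 'anthropic', anthropicKey: 'sk-ant-...', anthropicHost: null, heliconeKey: null },
    { llmRef: 'claude-2.1', llmTemperature: 0.5, llmResponseTokens: 1024 },
    [{ role: 'user', content: 'Summarize this changeset in one line.' }],
    null, null, // functions must stay null: the vendor throws otherwise
  );
  return out.content;
}
```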

/**
 * This function either returns the LLM message, or function calls, or throws a descriptive error string
 */
async function anthropicCallChatGenerate<TOut = VChatMessageOut>(
  access: AnthropicAccessSchema, llmOptions: Partial<LLMOptionsOpenAI>, messages: VChatMessageIn[],
  // functions: VChatFunctionIn[] | null, forceFunctionName: string | null,
  maxTokens?: number,
): Promise<TOut> {
  const { llmRef, llmTemperature = 0.5, llmResponseTokens } = llmOptions;
  try {
    return await apiAsync.llmAnthropic.chatGenerate.mutate({
      access,
      model: {
        id: llmRef!,
        temperature: llmTemperature,
        maxTokens: maxTokens || llmResponseTokens || 1024,
      },
      history: messages,
    }) as TOut;
  } catch (error: any) {
    const errorMessage = error?.message || error?.toString() || 'Anthropic Chat Generate Error';
    console.error(`anthropicCallChatGenerate: ${errorMessage}`);
    throw new Error(errorMessage);
  }
}
+5
-11
@@ -5,11 +5,11 @@ import { FormTextField } from '~/common/components/forms/FormTextField';
import { InlineError } from '~/common/components/InlineError';
import { Link } from '~/common/components/Link';
import { SetupFormRefetchButton } from '~/common/components/forms/SetupFormRefetchButton';
import { apiQuery } from '~/common/util/trpc.client';
import { asValidURL } from '~/common/util/urlUtils';

import { DModelSourceId, useModelsStore, useSourceSetup } from '../../store-llms';
import { modelDescriptionToDLLM } from '../openai/OpenAISourceSetup';
import { DModelSourceId } from '../../store-llms';
import { useLlmUpdateModels } from '../useLlmUpdateModels';
import { useSourceSetup } from '../useSourceSetup';

import { isValidAzureApiKey, ModelVendorAzure } from './azure.vendor';

@@ -31,14 +31,8 @@ export function AzureSourceSetup(props: { sourceId: DModelSourceId }) {
  const shallFetchSucceed = azureKey ? keyValid : !needsUserKey;

  // fetch models
  const { isFetching, refetch, isError, error } = apiQuery.llmOpenAI.listModels.useQuery({ access }, {
    enabled: !sourceHasLLMs && shallFetchSucceed,
    onSuccess: models => source && useModelsStore.getState().setLLMs(
      models.models.map(model => modelDescriptionToDLLM(model, source)),
      props.sourceId,
    ),
    staleTime: Infinity,
  });
  const { isFetching, refetch, isError, error } =
    useLlmUpdateModels(ModelVendorAzure, access, !sourceHasLLMs && shallFetchSucceed, source);

  return <>

+7
-9
@@ -3,10 +3,9 @@ import { backendCaps } from '~/modules/backend/state-backend';
import { AzureIcon } from '~/common/components/icons/AzureIcon';

import type { IModelVendor } from '../IModelVendor';
import type { OpenAIAccessSchema } from '../../transports/server/openai/openai.router';
import type { VChatFunctionIn, VChatMessageIn, VChatMessageOrFunctionCallOut, VChatMessageOut } from '../../transports/chatGenerate';
import type { OpenAIAccessSchema } from '../../server/openai/openai.router';

import { LLMOptionsOpenAI, openAICallChatGenerate } from '../openai/openai.vendor';
import { LLMOptionsOpenAI, ModelVendorOpenAI } from '../openai/openai.vendor';
import { OpenAILLMOptions } from '../openai/OpenAILLMOptions';

import { AzureSourceSetup } from './AzureSourceSetup';
@@ -58,10 +57,9 @@ export const ModelVendorAzure: IModelVendor<SourceSetupAzure, OpenAIAccessSchema
    heliKey: '',
    moderationCheck: false,
  }),
  callChatGenerate(llm, messages: VChatMessageIn[], maxTokens?: number): Promise<VChatMessageOut> {
    return openAICallChatGenerate(this.getTransportAccess(llm._source.setup), llm.options, messages, null, null, maxTokens);
  },
  callChatGenerateWF(llm, messages: VChatMessageIn[], functions: VChatFunctionIn[] | null, forceFunctionName: string | null, maxTokens?: number): Promise<VChatMessageOrFunctionCallOut> {
    return openAICallChatGenerate(this.getTransportAccess(llm._source.setup), llm.options, messages, functions, forceFunctionName, maxTokens);
  },

  // OpenAI transport ('azure' dialect in 'access')
  rpcUpdateModelsQuery: ModelVendorOpenAI.rpcUpdateModelsQuery,
  rpcChatGenerateOrThrow: ModelVendorOpenAI.rpcChatGenerateOrThrow,
  streamingChatGenerateOrThrow: ModelVendorOpenAI.streamingChatGenerateOrThrow,
};
@@ -0,0 +1,96 @@
import * as React from 'react';

import { FormControl, FormHelperText, Option, Select } from '@mui/joy';
import HealthAndSafetyIcon from '@mui/icons-material/HealthAndSafety';

import { FormInputKey } from '~/common/components/forms/FormInputKey';
import { FormLabelStart } from '~/common/components/forms/FormLabelStart';
import { InlineError } from '~/common/components/InlineError';
import { Link } from '~/common/components/Link';
import { SetupFormRefetchButton } from '~/common/components/forms/SetupFormRefetchButton';

import type { DModelSourceId } from '../../store-llms';
import type { GeminiBlockSafetyLevel } from '../../server/gemini/gemini.wiretypes';
import { useLlmUpdateModels } from '../useLlmUpdateModels';
import { useSourceSetup } from '../useSourceSetup';

import { ModelVendorGemini } from './gemini.vendor';


const GEMINI_API_KEY_LINK = 'https://makersuite.google.com/app/apikey';

const SAFETY_OPTIONS: { value: GeminiBlockSafetyLevel, label: string }[] = [
  { value: 'HARM_BLOCK_THRESHOLD_UNSPECIFIED', label: 'Default' },
  { value: 'BLOCK_LOW_AND_ABOVE', label: 'Low and above' },
  { value: 'BLOCK_MEDIUM_AND_ABOVE', label: 'Medium and above' },
  { value: 'BLOCK_ONLY_HIGH', label: 'Only high' },
  { value: 'BLOCK_NONE', label: 'None' },
];


export function GeminiSourceSetup(props: { sourceId: DModelSourceId }) {

  // external state
  const { source, sourceSetupValid, access, updateSetup } =
    useSourceSetup(props.sourceId, ModelVendorGemini);

  // derived state
  const { geminiKey, minSafetyLevel } = access;

  const needsUserKey = !ModelVendorGemini.hasBackendCap?.();
  const shallFetchSucceed = !needsUserKey || (!!geminiKey && sourceSetupValid);
  const showKeyError = !!geminiKey && !sourceSetupValid;

  // fetch models
  const { isFetching, refetch, isError, error } =
    useLlmUpdateModels(ModelVendorGemini, access, shallFetchSucceed, source);

  return <>

    <FormInputKey
      id='gemini-key' label='Gemini API Key'
      rightLabel={<>{needsUserKey
        ? !geminiKey && <Link level='body-sm' href={GEMINI_API_KEY_LINK} target='_blank'>request Key</Link>
        : '✔️ already set in server'}
      </>}
      value={geminiKey} onChange={value => updateSetup({ geminiKey: value.trim() })}
      required={needsUserKey} isError={showKeyError}
      placeholder='...'
    />

    <FormControl orientation='horizontal' sx={{ justifyContent: 'space-between', alignItems: 'center' }}>
      <FormLabelStart title='Safety Settings'
                      description='Threshold' />
      <Select
        variant='outlined'
        value={minSafetyLevel} onChange={(_event, value) => value && updateSetup({ minSafetyLevel: value })}
        startDecorator={<HealthAndSafetyIcon sx={{ display: { xs: 'none', sm: 'inherit' } }} />}
        // indicator={<KeyboardArrowDownIcon />}
        slotProps={{
          root: { sx: { width: '100%' } },
          indicator: { sx: { opacity: 0.5 } },
          button: { sx: { whiteSpace: 'inherit' } },
        }}
      >
        {SAFETY_OPTIONS.map(option => (
          <Option key={'gemini-safety-' + option.value} value={option.value}>{option.label}</Option>
        ))}
      </Select>
    </FormControl>

    <FormHelperText sx={{ display: 'block' }}>
      Gemini has <Link href='https://ai.google.dev/docs/safety_setting_gemini' target='_blank' noLinkStyle>
      adjustable safety settings</Link> on four categories: Harassment, Hate speech,
      Sexually explicit, and Dangerous content, in addition to non-adjustable built-in filters.
      By default, the model will block content with <em>medium and above</em> probability
      of being unsafe.
    </FormHelperText>

    <SetupFormRefetchButton
      refetch={refetch} disabled={!shallFetchSucceed || isFetching} error={isError}
    />

    {isError && <InlineError error={error} />}

  </>;
}
@@ -0,0 +1,97 @@
import GoogleIcon from '@mui/icons-material/Google';

import { backendCaps } from '~/modules/backend/state-backend';

import { apiAsync, apiQuery } from '~/common/util/trpc.client';

import type { GeminiAccessSchema } from '../../server/gemini/gemini.router';
import type { GeminiBlockSafetyLevel } from '../../server/gemini/gemini.wiretypes';
import type { IModelVendor } from '../IModelVendor';
import type { VChatMessageOut } from '../../llm.client';
import { unifiedStreamingClient } from '../unifiedStreamingClient';

import { OpenAILLMOptions } from '../openai/OpenAILLMOptions';

import { GeminiSourceSetup } from './GeminiSourceSetup';


export interface SourceSetupGemini {
  geminiKey: string;
  minSafetyLevel: GeminiBlockSafetyLevel;
}

export interface LLMOptionsGemini {
  llmRef: string;
  stopSequences: string[];  // up to 5 sequences that will stop generation (optional)
  candidateCount: number;   // 1...8 number of generated responses to return (optional)
  maxOutputTokens: number;  // if unset, this will default to outputTokenLimit (optional)
  temperature: number;      // 0...1 Controls the randomness of the output. (optional)
  topP: number;             // 0...1 The maximum cumulative probability of tokens to consider when sampling (optional)
  topK: number;             // 1...100 The maximum number of tokens to consider when sampling (optional)
}
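
A sketch of a fully-populated options object for this interface, with each value inside the ranges documented in the comments above; the numbers and the `llmRef` value are illustrative, not recommendations:

```ts
// Illustrative values only - each field stays within the documented ranges.
const geminiOptionsExample: LLMOptionsGemini = {
  llmRef: 'models/gemini-pro', // assumed id format, for illustration
  stopSequences: ['\n\nHuman:'], // up to 5
  candidateCount: 1,             // 1...8
  maxOutputTokens: 1024,
  temperature: 0.5,              // 0...1
  topP: 0.95,                    // 0...1
  topK: 40,                      // 1...100
};
```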
|
||||
|
||||
|
||||
export const ModelVendorGemini: IModelVendor<SourceSetupGemini, GeminiAccessSchema, LLMOptionsGemini> = {
|
||||
id: 'googleai',
|
||||
name: 'Gemini',
|
||||
rank: 11,
|
||||
location: 'cloud',
|
||||
instanceLimit: 1,
|
||||
hasBackendCap: () => backendCaps().hasLlmGemini,
|
||||
|
||||
// components
|
||||
Icon: GoogleIcon,
|
||||
SourceSetupComponent: GeminiSourceSetup,
|
||||
LLMOptionsComponent: OpenAILLMOptions,
|
||||
|
||||
// functions
|
||||
initializeSetup: () => ({
|
||||
geminiKey: '',
|
||||
minSafetyLevel: 'HARM_BLOCK_THRESHOLD_UNSPECIFIED',
|
||||
}),
|
||||
validateSetup: (setup) => {
|
||||
return setup.geminiKey?.length > 0;
|
||||
},
|
||||
getTransportAccess: (partialSetup): GeminiAccessSchema => ({
|
||||
dialect: 'gemini',
|
||||
geminiKey: partialSetup?.geminiKey || '',
|
||||
minSafetyLevel: partialSetup?.minSafetyLevel || 'HARM_BLOCK_THRESHOLD_UNSPECIFIED',
|
||||
}),
|
||||
|
||||
// List Models
|
||||
rpcUpdateModelsQuery: (access, enabled, onSuccess) => {
|
||||
return apiQuery.llmGemini.listModels.useQuery({ access }, {
|
||||
enabled: enabled,
|
||||
onSuccess: onSuccess,
|
||||
refetchOnWindowFocus: false,
|
||||
staleTime: Infinity,
|
||||
});
|
||||
},
|
||||
|
||||
// Chat Generate (non-streaming) with Functions
|
||||
rpcChatGenerateOrThrow: async (access, llmOptions, messages, functions, forceFunctionName, maxTokens) => {
|
||||
if (functions?.length || forceFunctionName)
|
||||
throw new Error('Gemini does not support functions');
|
||||
|
||||
const { llmRef, temperature = 0.5, maxOutputTokens } = llmOptions;
|
||||
try {
|
||||
return await apiAsync.llmGemini.chatGenerate.mutate({
|
||||
access,
|
||||
model: {
|
||||
id: llmRef!,
|
||||
temperature: temperature,
|
||||
maxTokens: maxTokens || maxOutputTokens || 1024,
|
||||
},
|
||||
history: messages,
|
||||
}) as VChatMessageOut;
|
||||
} catch (error: any) {
|
||||
const errorMessage = error?.message || error?.toString() || 'Gemini Chat Generate Error';
|
||||
console.error(`gemini.rpcChatGenerateOrThrow: ${errorMessage}`);
|
||||
throw new Error(errorMessage);
|
||||
}
|
||||
},
|
||||
|
||||
// Chat Generate (streaming) with Functions
|
||||
streamingChatGenerateOrThrow: unifiedStreamingClient,
|
||||
|
||||
};
|
||||
+5
-11
@@ -7,10 +7,10 @@ import { FormInputKey } from '~/common/components/forms/FormInputKey';
|
||||
import { InlineError } from '~/common/components/InlineError';
|
||||
import { Link } from '~/common/components/Link';
|
||||
import { SetupFormRefetchButton } from '~/common/components/forms/SetupFormRefetchButton';
|
||||
import { apiQuery } from '~/common/util/trpc.client';
|
||||
|
||||
import { DModelSourceId, useModelsStore, useSourceSetup } from '../../store-llms';
|
||||
import { modelDescriptionToDLLM } from '../openai/OpenAISourceSetup';
|
||||
import { DModelSourceId } from '../../store-llms';
|
||||
import { useLlmUpdateModels } from '../useLlmUpdateModels';
|
||||
import { useSourceSetup } from '../useSourceSetup';
|
||||
|
||||
import { ModelVendorLocalAI } from './localai.vendor';
|
||||
|
||||
@@ -30,14 +30,8 @@ export function LocalAISourceSetup(props: { sourceId: DModelSourceId }) {
|
||||
const shallFetchSucceed = isValidHost;
|
||||
|
||||
// fetch models - the OpenAI way
|
||||
const { isFetching, refetch, isError, error } = apiQuery.llmOpenAI.listModels.useQuery({ access }, {
|
||||
enabled: false, // !sourceHasLLMs && shallFetchSucceed,
|
||||
onSuccess: models => source && useModelsStore.getState().setLLMs(
|
||||
models.models.map(model => modelDescriptionToDLLM(model, source)),
|
||||
props.sourceId,
|
||||
),
|
||||
staleTime: Infinity,
|
||||
});
|
||||
const { isFetching, refetch, isError, error } =
|
||||
useLlmUpdateModels(ModelVendorLocalAI, access, false /* !sourceHasLLMs && shallFetchSucceed */, source);
|
||||
|
||||
return <>
|
||||
|
||||
|
||||
+8
-10
@@ -1,10 +1,9 @@
|
||||
import DevicesIcon from '@mui/icons-material/Devices';
|
||||
|
||||
import type { IModelVendor } from '../IModelVendor';
|
||||
import type { OpenAIAccessSchema } from '../../transports/server/openai/openai.router';
|
||||
import type { VChatFunctionIn, VChatMessageIn, VChatMessageOrFunctionCallOut, VChatMessageOut } from '../../transports/chatGenerate';
|
||||
import type { OpenAIAccessSchema } from '../../server/openai/openai.router';
|
||||
|
||||
import { LLMOptionsOpenAI, openAICallChatGenerate } from '../openai/openai.vendor';
|
||||
import { LLMOptionsOpenAI, ModelVendorOpenAI } from '../openai/openai.vendor';
|
||||
import { OpenAILLMOptions } from '../openai/OpenAILLMOptions';
|
||||
|
||||
import { LocalAISourceSetup } from './LocalAISourceSetup';
|
||||
@@ -38,10 +37,9 @@ export const ModelVendorLocalAI: IModelVendor<SourceSetupLocalAI, OpenAIAccessSc
|
||||
heliKey: '',
|
||||
moderationCheck: false,
|
||||
}),
|
||||
callChatGenerate(llm, messages: VChatMessageIn[], maxTokens?: number): Promise<VChatMessageOut> {
|
||||
return openAICallChatGenerate(this.getTransportAccess(llm._source.setup), llm.options, messages, null, null, maxTokens);
|
||||
},
|
||||
callChatGenerateWF(llm, messages: VChatMessageIn[], functions: VChatFunctionIn[] | null, forceFunctionName: string | null, maxTokens?: number): Promise<VChatMessageOrFunctionCallOut> {
|
||||
return openAICallChatGenerate(this.getTransportAccess(llm._source.setup), llm.options, messages, functions, forceFunctionName, maxTokens);
|
||||
},
|
||||
};
|
||||
|
||||
// OpenAI transport ('localai' dialect in 'access')
|
||||
rpcUpdateModelsQuery: ModelVendorOpenAI.rpcUpdateModelsQuery,
|
||||
rpcChatGenerateOrThrow: ModelVendorOpenAI.rpcChatGenerateOrThrow,
|
||||
streamingChatGenerateOrThrow: ModelVendorOpenAI.streamingChatGenerateOrThrow,
|
||||
};
|
||||
|
||||
MistralSourceSetup.tsx (+6 -12)
@@ -4,10 +4,10 @@ import { FormInputKey } from '~/common/components/forms/FormInputKey';
 import { InlineError } from '~/common/components/InlineError';
 import { Link } from '~/common/components/Link';
 import { SetupFormRefetchButton } from '~/common/components/forms/SetupFormRefetchButton';
-import { apiQuery } from '~/common/util/trpc.client';

-import { DModelSourceId, useModelsStore, useSourceSetup } from '../../store-llms';
-import { modelDescriptionToDLLM } from '../openai/OpenAISourceSetup';
+import { DModelSourceId } from '../../store-llms';
+import { useLlmUpdateModels } from '../useLlmUpdateModels';
+import { useSourceSetup } from '../useSourceSetup';

 import { ModelVendorMistral } from './mistral.vendor';

@@ -18,7 +18,7 @@ const MISTRAL_REG_LINK = 'https://console.mistral.ai/';
 export function MistralSourceSetup(props: { sourceId: DModelSourceId }) {

   // external state
-  const { source, sourceSetupValid, sourceHasLLMs, access, updateSetup } =
+  const { source, sourceSetupValid, access, updateSetup } =
     useSourceSetup(props.sourceId, ModelVendorMistral);

   // derived state
@@ -29,14 +29,8 @@ export function MistralSourceSetup(props: { sourceId: DModelSourceId }) {
   const showKeyError = !!mistralKey && !sourceSetupValid;

   // fetch models
-  const { isFetching, refetch, isError, error } = apiQuery.llmOpenAI.listModels.useQuery({ access }, {
-    enabled: false,
-    onSuccess: models => source && useModelsStore.getState().setLLMs(
-      models.models.map(model => modelDescriptionToDLLM(model, source)),
-      props.sourceId,
-    ),
-    staleTime: Infinity,
-  });
+  const { isFetching, refetch, isError, error } =
+    useLlmUpdateModels(ModelVendorMistral, access, shallFetchSucceed, source);

   return <>
mistral.vendor.ts (+7 -9)
@@ -3,10 +3,9 @@ import { backendCaps } from '~/modules/backend/state-backend';
 import { MistralIcon } from '~/common/components/icons/MistralIcon';

 import type { IModelVendor } from '../IModelVendor';
-import type { OpenAIAccessSchema } from '../../transports/server/openai/openai.router';
-import type { VChatMessageIn, VChatMessageOut } from '../../transports/chatGenerate';
+import type { OpenAIAccessSchema } from '../../server/openai/openai.router';

-import { LLMOptionsOpenAI, openAICallChatGenerate, SourceSetupOpenAI } from '../openai/openai.vendor';
+import { LLMOptionsOpenAI, ModelVendorOpenAI, SourceSetupOpenAI } from '../openai/openai.vendor';
 import { OpenAILLMOptions } from '../openai/OpenAILLMOptions';

 import { MistralSourceSetup } from './MistralSourceSetup';
@@ -48,10 +47,9 @@ export const ModelVendorMistral: IModelVendor<SourceSetupMistral, OpenAIAccessSc
     heliKey: '',
     moderationCheck: false,
   }),
-  callChatGenerate(llm, messages: VChatMessageIn[], maxTokens?: number): Promise<VChatMessageOut> {
-    return openAICallChatGenerate(this.getTransportAccess(llm._source.setup), llm.options, messages, null, null, maxTokens);
-  },
-  callChatGenerateWF() {
-    throw new Error('Mistral does not support "Functions" yet');
-  },
+
+  // OpenAI transport ('mistral' dialect in 'access')
+  rpcUpdateModelsQuery: ModelVendorOpenAI.rpcUpdateModelsQuery,
+  rpcChatGenerateOrThrow: ModelVendorOpenAI.rpcChatGenerateOrThrow,
+  streamingChatGenerateOrThrow: ModelVendorOpenAI.streamingChatGenerateOrThrow,
 };
OllamaAdministration.tsx
@@ -12,7 +12,7 @@ import { Link } from '~/common/components/Link';
 import { apiQuery } from '~/common/util/trpc.client';
 import { settingsGap } from '~/common/app.theme';

-import type { OllamaAccessSchema } from '../../transports/server/ollama/ollama.router';
+import type { OllamaAccessSchema } from '../../server/ollama/ollama.router';


 export function OllamaAdministration(props: { access: OllamaAccessSchema, onClose: () => void }) {
@@ -68,7 +68,7 @@ export function OllamaAdministration(props: { access: OllamaAccessSchema, onClos
         >
           {pullable.map(p =>
             <Option key={p.id} value={p.id}>
-              {p.isNew === true && <Chip size='sm' variant='outlined'>NEW</Chip>} {p.label}{sortByPulls && ` (${p.pulls.toLocaleString()})`}
+              {p.isNew === true && <Chip size='sm' variant='solid'>NEW</Chip>} {p.label}{sortByPulls && ` (${p.pulls.toLocaleString()})`}
             </Option>,
           )}
         </Select>
@@ -118,7 +118,7 @@ export function OllamaAdministration(props: { access: OllamaAccessSchema, onClos
           {pullModelDescription}
         </Typography>

-        <Box sx={{ display: 'flex', flexWrap: 1, gap: 1 }}>
+        <Box sx={{ display: 'flex', flexWrap: 1, gap: 1, alignItems: 'start' }}>
           <Button
             variant='outlined'
             color={deleteStatus === 'error' ? 'danger' : deleteStatus === 'success' ? 'success' : 'primary'}
OllamaSourceSetup.tsx (+6 -11)
@@ -6,13 +6,14 @@ import { FormTextField } from '~/common/components/forms/FormTextField';
 import { InlineError } from '~/common/components/InlineError';
 import { Link } from '~/common/components/Link';
 import { SetupFormRefetchButton } from '~/common/components/forms/SetupFormRefetchButton';
-import { apiQuery } from '~/common/util/trpc.client';
 import { asValidURL } from '~/common/util/urlUtils';

-import { DModelSourceId, useModelsStore, useSourceSetup } from '../../store-llms';
+import { DModelSourceId } from '../../store-llms';
+import { useLlmUpdateModels } from '../useLlmUpdateModels';
+import { useSourceSetup } from '../useSourceSetup';

 import { ModelVendorOllama } from './ollama.vendor';
 import { OllamaAdministration } from './OllamaAdministration';
-import { modelDescriptionToDLLM } from '../openai/OpenAISourceSetup';


 export function OllamaSourceSetup(props: { sourceId: DModelSourceId }) {
@@ -32,14 +33,8 @@ export function OllamaSourceSetup(props: { sourceId: DModelSourceId }) {
   const shallFetchSucceed = !hostError;

   // fetch models
-  const { isFetching, refetch, isError, error } = apiQuery.llmOllama.listModels.useQuery({ access }, {
-    enabled: false, // !sourceHasLLMs && shallFetchSucceed,
-    onSuccess: models => source && useModelsStore.getState().setLLMs(
-      models.models.map(model => modelDescriptionToDLLM(model, source)),
-      props.sourceId,
-    ),
-    staleTime: Infinity,
-  });
+  const { isFetching, refetch, isError, error } =
+    useLlmUpdateModels(ModelVendorOllama, access, false /* !sourceHasLLMs && shallFetchSucceed */, source);

   return <>
ollama.vendor.ts (+40 -34)
@@ -1,13 +1,14 @@
 import { backendCaps } from '~/modules/backend/state-backend';

 import { OllamaIcon } from '~/common/components/icons/OllamaIcon';
-import { apiAsync } from '~/common/util/trpc.client';
+import { apiAsync, apiQuery } from '~/common/util/trpc.client';

 import type { IModelVendor } from '../IModelVendor';
-import type { VChatMessageIn, VChatMessageOrFunctionCallOut, VChatMessageOut } from '../../transports/chatGenerate';
-import type { OllamaAccessSchema } from '../../transports/server/ollama/ollama.router';
+import type { OllamaAccessSchema } from '../../server/ollama/ollama.router';
+import type { VChatMessageOut } from '../../llm.client';
+import { unifiedStreamingClient } from '../unifiedStreamingClient';

-import { LLMOptionsOpenAI } from '../openai/openai.vendor';
+import type { LLMOptionsOpenAI } from '../openai/openai.vendor';
 import { OpenAILLMOptions } from '../openai/OpenAILLMOptions';

 import { OllamaSourceSetup } from './OllamaSourceSetup';
@@ -36,36 +37,41 @@ export const ModelVendorOllama: IModelVendor<SourceSetupOllama, OllamaAccessSche
     dialect: 'ollama',
     ollamaHost: partialSetup?.ollamaHost || '',
   }),
-  callChatGenerate(llm, messages: VChatMessageIn[], maxTokens?: number): Promise<VChatMessageOut> {
-    return ollamaCallChatGenerate(this.getTransportAccess(llm._source.setup), llm.options, messages, maxTokens);
-  },
-  callChatGenerateWF(): Promise<VChatMessageOrFunctionCallOut> {
-    throw new Error('Ollama does not support "Functions" yet');
-  },
-};
+
+  // List Models
+  rpcUpdateModelsQuery: (access, enabled, onSuccess) => {
+    return apiQuery.llmOllama.listModels.useQuery({ access }, {
+      enabled: enabled,
+      onSuccess: onSuccess,
+      refetchOnWindowFocus: false,
+      staleTime: Infinity,
+    });
+  },
+
+  // Chat Generate (non-streaming) with Functions
+  rpcChatGenerateOrThrow: async (access, llmOptions, messages, functions, forceFunctionName, maxTokens) => {
+    if (functions?.length || forceFunctionName)
+      throw new Error('Ollama does not support functions');
+
+    const { llmRef, llmTemperature = 0.5, llmResponseTokens } = llmOptions;
+    try {
+      return await apiAsync.llmOllama.chatGenerate.mutate({
+        access,
+        model: {
+          id: llmRef!,
+          temperature: llmTemperature,
+          maxTokens: maxTokens || llmResponseTokens || 1024,
+        },
+        history: messages,
+      }) as VChatMessageOut;
+    } catch (error: any) {
+      const errorMessage = error?.message || error?.toString() || 'Ollama Chat Generate Error';
+      console.error(`ollama.rpcChatGenerateOrThrow: ${errorMessage}`);
+      throw new Error(errorMessage);
+    }
+  },
+
+  // Chat Generate (streaming) with Functions
+  streamingChatGenerateOrThrow: unifiedStreamingClient,
+
+};
-
-
-/**
- * This function either returns the LLM message, or throws a descriptive error string
- */
-async function ollamaCallChatGenerate<TOut = VChatMessageOut>(
-  access: OllamaAccessSchema, llmOptions: Partial<LLMOptionsOpenAI>, messages: VChatMessageIn[],
-  maxTokens?: number,
-): Promise<TOut> {
-  const { llmRef, llmTemperature = 0.5, llmResponseTokens } = llmOptions;
-  try {
-    return await apiAsync.llmOllama.chatGenerate.mutate({
-      access,
-      model: {
-        id: llmRef!,
-        temperature: llmTemperature,
-        maxTokens: maxTokens || llmResponseTokens || 1024,
-      },
-      history: messages,
-    }) as TOut;
-  } catch (error: any) {
-    const errorMessage = error?.message || error?.toString() || 'Ollama Chat Generate Error';
-    console.error(`ollamaCallChatGenerate: ${errorMessage}`);
-    throw new Error(errorMessage);
-  }
-}
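
Taken together, the vendor refactor converges on one transport surface per vendor. A sketch of that surface as it can be inferred from these diffs (parameter types are simplified; the authoritative IModelVendor declaration lives in IModelVendor.ts, which this changeset does not show):

```typescript
// Inferred transport surface (a sketch, not the real declaration).
interface ModelVendorTransportSketch<TAccess, TLLMOptions> {
  // builds the access object (dialect, host, keys) from the stored setup
  getTransportAccess: (partialSetup?: unknown) => TAccess;

  // react-query hook that lists models; callers read { isFetching, refetch, isError, error }
  rpcUpdateModelsQuery: (
    access: TAccess,
    enabled: boolean,
    onSuccess: (data: { models: unknown[] }) => void,
  ) => unknown;

  // one-shot chat turn; vendors without function-calling throw (see Ollama above)
  rpcChatGenerateOrThrow: (
    access: TAccess,
    llmOptions: TLLMOptions,
    messages: { role: string, content: string }[],
    functions: unknown[] | null,
    forceFunctionName: string | null,
    maxTokens?: number,
  ) => Promise<unknown>;

  // incremental streaming; every vendor here points this at unifiedStreamingClient
  streamingChatGenerateOrThrow: (...args: unknown[]) => Promise<void>;
}
```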
OobaboogaSourceSetup.tsx (+5 -11)
@@ -6,10 +6,10 @@ import { FormTextField } from '~/common/components/forms/FormTextField';
 import { InlineError } from '~/common/components/InlineError';
 import { Link } from '~/common/components/Link';
 import { SetupFormRefetchButton } from '~/common/components/forms/SetupFormRefetchButton';
-import { apiQuery } from '~/common/util/trpc.client';

-import { DModelSourceId, useModelsStore, useSourceSetup } from '../../store-llms';
-import { modelDescriptionToDLLM } from '../openai/OpenAISourceSetup';
+import { DModelSourceId } from '../../store-llms';
+import { useLlmUpdateModels } from '../useLlmUpdateModels';
+import { useSourceSetup } from '../useSourceSetup';

 import { ModelVendorOoobabooga } from './oobabooga.vendor';

@@ -24,14 +24,8 @@ export function OobaboogaSourceSetup(props: { sourceId: DModelSourceId }) {
   const { oaiHost } = access;

   // fetch models
-  const { isFetching, refetch, isError, error } = apiQuery.llmOpenAI.listModels.useQuery({ access }, {
-    enabled: false, // !hasModels && !!asValidURL(normSetup.oaiHost),
-    onSuccess: models => source && useModelsStore.getState().setLLMs(
-      models.models.map(model => modelDescriptionToDLLM(model, source)),
-      props.sourceId,
-    ),
-    staleTime: Infinity,
-  });
+  const { isFetching, refetch, isError, error } =
+    useLlmUpdateModels(ModelVendorOoobabooga, access, false /* !hasModels && !!asValidURL(normSetup.oaiHost) */, source);

   return <>
oobabooga.vendor.ts (+7 -9)
@@ -1,10 +1,9 @@
 import { OobaboogaIcon } from '~/common/components/icons/OobaboogaIcon';

 import type { IModelVendor } from '../IModelVendor';
-import type { OpenAIAccessSchema } from '../../transports/server/openai/openai.router';
-import type { VChatFunctionIn, VChatMessageIn, VChatMessageOrFunctionCallOut, VChatMessageOut } from '../../transports/chatGenerate';
+import type { OpenAIAccessSchema } from '../../server/openai/openai.router';

-import { LLMOptionsOpenAI, openAICallChatGenerate } from '../openai/openai.vendor';
+import { LLMOptionsOpenAI, ModelVendorOpenAI } from '../openai/openai.vendor';
 import { OpenAILLMOptions } from '../openai/OpenAILLMOptions';

 import { OobaboogaSourceSetup } from './OobaboogaSourceSetup';
@@ -38,10 +37,9 @@ export const ModelVendorOoobabooga: IModelVendor<SourceSetupOobabooga, OpenAIAcc
     heliKey: '',
     moderationCheck: false,
   }),
-  callChatGenerate(llm, messages: VChatMessageIn[], maxTokens?: number): Promise<VChatMessageOut> {
-    return openAICallChatGenerate(this.getTransportAccess(llm._source.setup), llm.options, messages, null, null, maxTokens);
-  },
-  callChatGenerateWF(llm, messages: VChatMessageIn[], functions: VChatFunctionIn[] | null, forceFunctionName: string | null, maxTokens?: number): Promise<VChatMessageOrFunctionCallOut> {
-    return openAICallChatGenerate(this.getTransportAccess(llm._source.setup), llm.options, messages, functions, forceFunctionName, maxTokens);
-  },
+
+  // OpenAI transport (oobabooga dialect in 'access')
+  rpcUpdateModelsQuery: ModelVendorOpenAI.rpcUpdateModelsQuery,
+  rpcChatGenerateOrThrow: ModelVendorOpenAI.rpcChatGenerateOrThrow,
+  streamingChatGenerateOrThrow: ModelVendorOpenAI.streamingChatGenerateOrThrow,
 };
OpenAISourceSetup.tsx (+6 -40)
@@ -9,13 +9,13 @@ import { FormTextField } from '~/common/components/forms/FormTextField';
 import { InlineError } from '~/common/components/InlineError';
 import { Link } from '~/common/components/Link';
 import { SetupFormRefetchButton } from '~/common/components/forms/SetupFormRefetchButton';
-import { apiQuery } from '~/common/util/trpc.client';
 import { useToggleableBoolean } from '~/common/util/useToggleableBoolean';

-import type { ModelDescriptionSchema } from '../../transports/server/server.schemas';
-import { DLLM, DModelSource, DModelSourceId, useModelsStore, useSourceSetup } from '../../store-llms';
+import { DModelSourceId } from '../../store-llms';
+import { useLlmUpdateModels } from '../useLlmUpdateModels';
+import { useSourceSetup } from '../useSourceSetup';

-import { isValidOpenAIApiKey, LLMOptionsOpenAI, ModelVendorOpenAI } from './openai.vendor';
+import { isValidOpenAIApiKey, ModelVendorOpenAI } from './openai.vendor';


 // avoid repeating it all over
@@ -40,15 +40,8 @@ export function OpenAISourceSetup(props: { sourceId: DModelSourceId }) {
   const shallFetchSucceed = oaiKey ? keyValid : !needsUserKey;

   // fetch models
-  const { isFetching, refetch, isError, error } = apiQuery.llmOpenAI.listModels.useQuery({ access }, {
-    enabled: !sourceHasLLMs && shallFetchSucceed,
-    onSuccess: models => source && useModelsStore.getState().setLLMs(
-      models.models.map(model => modelDescriptionToDLLM(model, source)),
-      props.sourceId,
-    ),
-    staleTime: Infinity,
-  });
+  const { isFetching, refetch, isError, error } =
+    useLlmUpdateModels(ModelVendorOpenAI, access, !sourceHasLLMs && shallFetchSucceed, source);

   return <>
@@ -110,30 +103,3 @@ export function OpenAISourceSetup(props: { sourceId: DModelSourceId }) {

   </>;
 }


-export function modelDescriptionToDLLM<TSourceSetup>(model: ModelDescriptionSchema, source: DModelSource<TSourceSetup>): DLLM<TSourceSetup, LLMOptionsOpenAI> {
-  const maxOutputTokens = model.maxCompletionTokens || Math.round((model.contextWindow || 4096) / 2);
-  const llmResponseTokens = Math.round(maxOutputTokens / (model.maxCompletionTokens ? 2 : 4));
-  return {
-    id: `${source.id}-${model.id}`,
-
-    label: model.label,
-    created: model.created || 0,
-    updated: model.updated || 0,
-    description: model.description,
-    tags: [], // ['stream', 'chat'],
-    contextTokens: model.contextWindow,
-    maxOutputTokens: maxOutputTokens,
-    hidden: !!model.hidden,
-
-    sId: source.id,
-    _source: source,
-
-    options: {
-      llmRef: model.id,
-      llmTemperature: 0.5,
-      llmResponseTokens: llmResponseTokens,
-    },
-  };
-}
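
The removed modelDescriptionToDLLM helper reappears, renamed, in the new useLlmUpdateModels.ts below. Its token defaults are worth spelling out; a worked example, assuming a model that reports a 4096-token context window and no explicit maxCompletionTokens:

```typescript
// Worked example of the token-default heuristic (input values assumed):
const model = { contextWindow: 4096, maxCompletionTokens: undefined };

// no maxCompletionTokens: default the output budget to half the context window
const maxOutputTokens = model.maxCompletionTokens || Math.round((model.contextWindow || 4096) / 2);
// maxOutputTokens === 2048

// default response size: a quarter of that budget when the cap was guessed,
// but half when the model declared maxCompletionTokens explicitly
const llmResponseTokens = Math.round(maxOutputTokens / (model.maxCompletionTokens ? 2 : 4));
// llmResponseTokens === 512
```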
openai.vendor.ts (+38 -38)
@@ -1,11 +1,12 @@
 import { backendCaps } from '~/modules/backend/state-backend';

 import { OpenAIIcon } from '~/common/components/icons/OpenAIIcon';
-import { apiAsync } from '~/common/util/trpc.client';
+import { apiAsync, apiQuery } from '~/common/util/trpc.client';

 import type { IModelVendor } from '../IModelVendor';
-import type { OpenAIAccessSchema } from '../../transports/server/openai/openai.router';
-import type { VChatFunctionIn, VChatMessageIn, VChatMessageOrFunctionCallOut, VChatMessageOut } from '../../transports/chatGenerate';
+import type { OpenAIAccessSchema } from '../../server/openai/openai.router';
+import type { VChatMessageOrFunctionCallOut } from '../../llm.client';
+import { unifiedStreamingClient } from '../unifiedStreamingClient';

 import { OpenAILLMOptions } from './OpenAILLMOptions';
 import { OpenAISourceSetup } from './OpenAISourceSetup';
@@ -51,41 +52,40 @@ export const ModelVendorOpenAI: IModelVendor<SourceSetupOpenAI, OpenAIAccessSche
     moderationCheck: false,
     ...partialSetup,
   }),
-  callChatGenerate(llm, messages: VChatMessageIn[], maxTokens?: number): Promise<VChatMessageOut> {
-    const access = this.getTransportAccess(llm._source.setup);
-    return openAICallChatGenerate(access, llm.options, messages, null, null, maxTokens);
-  },
-  callChatGenerateWF(llm, messages: VChatMessageIn[], functions: VChatFunctionIn[] | null, forceFunctionName: string | null, maxTokens?: number): Promise<VChatMessageOrFunctionCallOut> {
-    const access = this.getTransportAccess(llm._source.setup);
-    return openAICallChatGenerate(access, llm.options, messages, functions, forceFunctionName, maxTokens);
-  },
-};
+
+  // List Models
+  rpcUpdateModelsQuery: (access, enabled, onSuccess) => {
+    return apiQuery.llmOpenAI.listModels.useQuery({ access }, {
+      enabled: enabled,
+      onSuccess: onSuccess,
+      refetchOnWindowFocus: false,
+      staleTime: Infinity,
+    });
+  },
+
+  // Chat Generate (non-streaming) with Functions
+  rpcChatGenerateOrThrow: async (access, llmOptions, messages, functions, forceFunctionName, maxTokens) => {
+    const { llmRef, llmTemperature = 0.5, llmResponseTokens } = llmOptions;
+    try {
+      return await apiAsync.llmOpenAI.chatGenerateWithFunctions.mutate({
+        access,
+        model: {
+          id: llmRef!,
+          temperature: llmTemperature,
+          maxTokens: maxTokens || llmResponseTokens || 1024,
+        },
+        functions: functions ?? undefined,
+        forceFunctionName: forceFunctionName ?? undefined,
+        history: messages,
+      }) as VChatMessageOrFunctionCallOut;
+    } catch (error: any) {
+      const errorMessage = error?.message || error?.toString() || 'OpenAI Chat Generate Error';
+      console.error(`openai.rpcChatGenerateOrThrow: ${errorMessage}`);
+      throw new Error(errorMessage);
+    }
+  },
+
+  // Chat Generate (streaming) with Functions
+  streamingChatGenerateOrThrow: unifiedStreamingClient,
+
+};
-
-
-/**
- * This function either returns the LLM message, or function calls, or throws a descriptive error string
- */
-export async function openAICallChatGenerate<TOut = VChatMessageOut | VChatMessageOrFunctionCallOut>(
-  access: OpenAIAccessSchema, llmOptions: Partial<LLMOptionsOpenAI>, messages: VChatMessageIn[],
-  functions: VChatFunctionIn[] | null, forceFunctionName: string | null,
-  maxTokens?: number,
-): Promise<TOut> {
-  const { llmRef, llmTemperature = 0.5, llmResponseTokens } = llmOptions;
-  try {
-    return await apiAsync.llmOpenAI.chatGenerateWithFunctions.mutate({
-      access,
-      model: {
-        id: llmRef!,
-        temperature: llmTemperature,
-        maxTokens: maxTokens || llmResponseTokens || 1024,
-      },
-      functions: functions ?? undefined,
-      forceFunctionName: forceFunctionName ?? undefined,
-      history: messages,
-    }) as TOut;
-  } catch (error: any) {
-    const errorMessage = error?.message || error?.toString() || 'OpenAI Chat Generate Error';
-    console.error(`openAICallChatGenerate: ${errorMessage}`);
-    throw new Error(errorMessage);
-  }
-}
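
A sketch of exercising the new non-streaming entry point with one offered function. The function schema, model options, and prompt below are illustrative, not taken from the codebase; only the entry-point signature comes from this diff:

```typescript
// Sketch only: one chat turn that offers the model a single callable function.
import { ModelVendorOpenAI } from './openai/openai.vendor';

async function weatherCallSketch() {
  // access would normally come from vendor.getTransportAccess(source.setup)
  const access = ModelVendorOpenAI.getTransportAccess({});

  // 'out' is a VChatMessageOrFunctionCallOut: assistant text or a function_call
  const out = await ModelVendorOpenAI.rpcChatGenerateOrThrow(
    access,
    { llmRef: 'gpt-3.5-turbo', llmTemperature: 0.5, llmResponseTokens: 1024 }, // assumed options
    [{ role: 'user', content: 'Weather in Lisbon?' }],
    [{
      name: 'get_weather',
      description: 'Get the current weather for a city',
      parameters: { type: 'object', properties: { city: { type: 'string' } }, required: ['city'] },
    }],
    null,       // or 'get_weather' to force that function
    undefined,  // fall back to llmResponseTokens
  );
  return out;
}
```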
OpenRouterSourceSetup.tsx
@@ -6,11 +6,11 @@ import { FormInputKey } from '~/common/components/forms/FormInputKey';
 import { InlineError } from '~/common/components/InlineError';
 import { Link } from '~/common/components/Link';
 import { SetupFormRefetchButton } from '~/common/components/forms/SetupFormRefetchButton';
-import { apiQuery } from '~/common/util/trpc.client';
 import { getCallbackUrl } from '~/common/app.routes';

-import { DModelSourceId, useModelsStore, useSourceSetup } from '../../store-llms';
-import { modelDescriptionToDLLM } from '../openai/OpenAISourceSetup';
+import { DModelSourceId } from '../../store-llms';
+import { useLlmUpdateModels } from '../useLlmUpdateModels';
+import { useSourceSetup } from '../useSourceSetup';

 import { isValidOpenRouterKey, ModelVendorOpenRouter } from './openrouter.vendor';

@@ -30,14 +30,8 @@ export function OpenRouterSourceSetup(props: { sourceId: DModelSourceId }) {
   const shallFetchSucceed = oaiKey ? keyValid : !needsUserKey;

   // fetch models
-  const { isFetching, refetch, isError, error } = apiQuery.llmOpenAI.listModels.useQuery({ access }, {
-    enabled: !sourceHasLLMs && shallFetchSucceed,
-    onSuccess: models => source && useModelsStore.getState().setLLMs(
-      models.models.map(model => modelDescriptionToDLLM(model, source)),
-      props.sourceId,
-    ),
-    staleTime: Infinity,
-  });
+  const { isFetching, refetch, isError, error } =
+    useLlmUpdateModels(ModelVendorOpenRouter, access, !sourceHasLLMs && shallFetchSucceed, source);


   const handleOpenRouterLogin = () => {
openrouter.vendor.ts
@@ -3,10 +3,9 @@ import { backendCaps } from '~/modules/backend/state-backend';
 import { OpenRouterIcon } from '~/common/components/icons/OpenRouterIcon';

 import type { IModelVendor } from '../IModelVendor';
-import type { OpenAIAccessSchema } from '../../transports/server/openai/openai.router';
-import type { VChatFunctionIn, VChatMessageIn, VChatMessageOrFunctionCallOut, VChatMessageOut } from '../../transports/chatGenerate';
+import type { OpenAIAccessSchema } from '../../server/openai/openai.router';

-import { LLMOptionsOpenAI, openAICallChatGenerate } from '../openai/openai.vendor';
+import { LLMOptionsOpenAI, ModelVendorOpenAI } from '../openai/openai.vendor';
 import { OpenAILLMOptions } from '../openai/OpenAILLMOptions';

 import { OpenRouterSourceSetup } from './OpenRouterSourceSetup';
@@ -59,10 +58,9 @@ export const ModelVendorOpenRouter: IModelVendor<SourceSetupOpenRouter, OpenAIAc
     heliKey: '',
     moderationCheck: false,
   }),
-  callChatGenerate(llm, messages: VChatMessageIn[], maxTokens?: number): Promise<VChatMessageOut> {
-    return openAICallChatGenerate(this.getTransportAccess(llm._source.setup), llm.options, messages, null, null, maxTokens);
-  },
-  callChatGenerateWF(llm, messages: VChatMessageIn[], functions: VChatFunctionIn[] | null, forceFunctionName: string | null, maxTokens?: number): Promise<VChatMessageOrFunctionCallOut> {
-    return openAICallChatGenerate(this.getTransportAccess(llm._source.setup), llm.options, messages, functions, forceFunctionName, maxTokens);
-  },
+
+  // OpenAI transport ('openrouter' dialect in 'access')
+  rpcUpdateModelsQuery: ModelVendorOpenAI.rpcUpdateModelsQuery,
+  rpcChatGenerateOrThrow: ModelVendorOpenAI.rpcChatGenerateOrThrow,
+  streamingChatGenerateOrThrow: ModelVendorOpenAI.streamingChatGenerateOrThrow,
 };
unifiedStreamingClient.ts (+13 -27)
@@ -1,11 +1,10 @@
 import { apiAsync } from '~/common/util/trpc.client';

-import type { DLLM, DLLMId } from '../store-llms';
-import { findVendorForLlmOrThrow } from '../vendors/vendors.registry';
+import type { ChatStreamingFirstOutputPacketSchema, ChatStreamingInputSchema } from '../server/llm.server.streaming';
+import type { DLLMId } from '../store-llms';
+import type { VChatFunctionIn, VChatMessageIn } from '../llm.client';

-import type { ChatStreamFirstPacketSchema, ChatStreamInputSchema } from './server/openai/openai.streaming';
-import type { OpenAIWire } from './server/openai/openai.wiretypes';
-import type { VChatMessageIn } from './chatGenerate';
+import type { OpenAIWire } from '../server/openai/openai.wiretypes';


 /**
@@ -15,27 +14,14 @@ import type { VChatMessageIn } from './chatGenerate';
  * Vendor-specific implementation is on our server backend (API) code. This function tries to be
  * as generic as possible.
  *
- * @param llmId LLM to use
- * @param messages the history of messages to send to the API endpoint
- * @param abortSignal used to initiate a client-side abort of the fetch request to the API endpoint
- * @param onUpdate callback when a piece of a message (text, model name, typing..) is received
+ * NOTE: onUpdate is callback when a piece of a message (text, model name, typing..) is received
  */
-export async function streamChat(
-  llmId: DLLMId,
-  messages: VChatMessageIn[],
-  abortSignal: AbortSignal,
-  onUpdate: (update: Partial<{ text: string, typing: boolean, originLLM: string }>, done: boolean) => void,
-): Promise<void> {
-  const { llm, vendor } = findVendorForLlmOrThrow(llmId);
-  const access = vendor.getTransportAccess(llm._source.setup) as ChatStreamInputSchema['access'];
-  return await vendorStreamChat(access, llm, messages, abortSignal, onUpdate);
-}
-
-
-async function vendorStreamChat<TSourceSetup = unknown, TLLMOptions = unknown>(
-  access: ChatStreamInputSchema['access'],
-  llm: DLLM<TSourceSetup, TLLMOptions>,
-  messages: VChatMessageIn[],
+export async function unifiedStreamingClient<TSourceSetup = unknown, TLLMOptions = unknown>(
+  access: ChatStreamingInputSchema['access'],
+  llmId: DLLMId,
+  llmOptions: TLLMOptions,
+  messages: VChatMessageIn[],
+  functions: VChatFunctionIn[] | null, forceFunctionName: string | null,
   abortSignal: AbortSignal,
   onUpdate: (update: Partial<{ text: string, typing: boolean, originLLM: string }>, done: boolean) => void,
 ) {
@@ -79,12 +65,12 @@ async function vendorStreamChat<TSourceSetup = unknown, TLLMOptions = unknown>(
   }

   // model params (llm)
-  const { llmRef, llmTemperature, llmResponseTokens } = (llm.options as any) || {};
+  const { llmRef, llmTemperature, llmResponseTokens } = (llmOptions as any) || {};
   if (!llmRef || llmTemperature === undefined || llmResponseTokens === undefined)
-    throw new Error(`Error in configuration for model ${llm.id}: ${JSON.stringify(llm.options)}`);
+    throw new Error(`Error in configuration for model ${llmId}: ${JSON.stringify(llmOptions)}`);

   // prepare the input, similarly to the tRPC openAI.chatGenerate
-  const input: ChatStreamInputSchema = {
+  const input: ChatStreamingInputSchema = {
     access,
     model: {
       id: llmRef,
@@ -131,7 +117,7 @@ async function vendorStreamChat<TSourceSetup = unknown, TLLMOptions = unknown>(
       incrementalText = incrementalText.substring(endOfJson + 1);
       parsedFirstPacket = true;
       try {
-        const parsed: ChatStreamFirstPacketSchema = JSON.parse(json);
+        const parsed: ChatStreamingFirstOutputPacketSchema = JSON.parse(json);
         onUpdate({ originLLM: parsed.model }, false);
       } catch (e) {
         // error parsing JSON, ignore
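
A sketch of how a caller might drive the unified streaming client; everything except the signature (which comes from the diff above) is illustrative:

```typescript
// Hypothetical driver: accumulate streamed text and stop on demand.
const controller = new AbortController();
let text = '';

await unifiedStreamingClient(
  access,            // from vendor.getTransportAccess(source.setup)
  llmId,             // e.g. a DLLMId of the form `${sourceId}-${modelId}`
  llmOptions,        // must carry llmRef, llmTemperature, llmResponseTokens
  [{ role: 'user', content: 'Hello!' }],
  null, null,        // no function calling on this turn
  controller.signal, // controller.abort() cancels the fetch mid-stream
  (update, done) => {
    if (update.originLLM) console.log('model:', update.originLLM);
    if (update.text) text += update.text;
    if (done) console.log('final:', text);
  },
);
```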
useLlmUpdateModels.ts (+47, new file)
@@ -0,0 +1,47 @@
+import type { IModelVendor } from './IModelVendor';
+import type { ModelDescriptionSchema } from '../server/llm.server.types';
+import { DLLM, DModelSource, useModelsStore } from '../store-llms';
+
+
+/**
+ * Hook that fetches the list of models from the vendor and updates the store,
+ * while returning the fetch state.
+ */
+export function useLlmUpdateModels<TSourceSetup, TAccess, TLLMOptions>(vendor: IModelVendor<TSourceSetup, TAccess, TLLMOptions>, access: TAccess, enabled: boolean, source: DModelSource<TSourceSetup>) {
+  return vendor.rpcUpdateModelsQuery(access, enabled, data => source && updateModelsFn(data, source));
+}
+
+
+function updateModelsFn<TSourceSetup>(data: { models: ModelDescriptionSchema[] }, source: DModelSource<TSourceSetup>) {
+  useModelsStore.getState().setLLMs(
+    data.models.map(model => modelDescriptionToDLLMOpenAIOptions(model, source)),
+    source.id,
+  );
+}
+
+function modelDescriptionToDLLMOpenAIOptions<TSourceSetup, TLLMOptions>(model: ModelDescriptionSchema, source: DModelSource<TSourceSetup>): DLLM<TSourceSetup, TLLMOptions> {
+  const maxOutputTokens = model.maxCompletionTokens || Math.round((model.contextWindow || 4096) / 2);
+  const llmResponseTokens = Math.round(maxOutputTokens / (model.maxCompletionTokens ? 2 : 4));
+  return {
+    id: `${source.id}-${model.id}`,
+
+    label: model.label,
+    created: model.created || 0,
+    updated: model.updated || 0,
+    description: model.description,
+    tags: [], // ['stream', 'chat'],
+    contextTokens: model.contextWindow,
+    maxOutputTokens: maxOutputTokens,
+    hidden: !!model.hidden,
+
+    sId: source.id,
+    _source: source,
+
+    options: {
+      llmRef: model.id,
+      // @ts-ignore FIXME: large assumption that this is LLMOptionsOpenAI object
+      llmTemperature: 0.5,
+      llmResponseTokens: llmResponseTokens,
+    },
+  };
+}
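
Usage, reconstructed from the SourceSetup components above (the component name is illustrative; the wiring matches the OpenAI setup diff):

```typescript
// How a source-setup component consumes the hook (sketch):
function ExampleSourceSetup(props: { sourceId: DModelSourceId }) {
  // read source + derived state via the companion hook (next file)
  const { source, sourceHasLLMs, access } = useSourceSetup(props.sourceId, ModelVendorOpenAI);

  // auto-fetch models when none are loaded yet; results land in useModelsStore
  const { isFetching, refetch, isError, error } =
    useLlmUpdateModels(ModelVendorOpenAI, access, !sourceHasLLMs, source);

  // ...render, typically with a SetupFormRefetchButton bound to refetch/isFetching/isError...
  return null; // placeholder
}
```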
useSourceSetup.ts (+35, new file)
@@ -0,0 +1,35 @@
+import { shallow } from 'zustand/shallow';
+
+import type { IModelVendor } from './IModelVendor';
+import { DModelSource, DModelSourceId, useModelsStore } from '../store-llms';
+
+
+/**
+ * Source-specific read/write - great time saver
+ */
+export function useSourceSetup<TSourceSetup, TAccess, TLLMOptions>(sourceId: DModelSourceId, vendor: IModelVendor<TSourceSetup, TAccess, TLLMOptions>) {
+
+  // invalidates only when the setup changes
+  const { updateSourceSetup, ...rest } = useModelsStore(state => {
+
+    // find the source (or null)
+    const source: DModelSource<TSourceSetup> | null = state.sources.find(source => source.id === sourceId) as DModelSource<TSourceSetup> ?? null;
+
+    // (safe) source-derived properties
+    const sourceSetupValid = (source?.setup && vendor?.validateSetup) ? vendor.validateSetup(source.setup as TSourceSetup) : false;
+    const sourceLLMs = source ? state.llms.filter(llm => llm._source === source) : [];
+    const access = vendor.getTransportAccess(source?.setup);
+
+    return {
+      source,
+      access,
+      sourceHasLLMs: !!sourceLLMs.length,
+      sourceSetupValid,
+      updateSourceSetup: state.updateSourceSetup,
+    };
+  }, shallow);
+
+  // convenience function for this source
+  const updateSetup = (partialSetup: Partial<TSourceSetup>) => updateSourceSetup<TSourceSetup>(sourceId, partialSetup);
+  return { ...rest, updateSetup };
+}
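
One design note: the selector above builds a fresh object on every store update, so the hook passes zustand's shallow equality to suppress redundant re-renders. The same pattern in isolation (the store and field names are generic placeholders, not from the codebase):

```typescript
import { create } from 'zustand';
import { shallow } from 'zustand/shallow';

const useCounterStore = create(() => ({ a: 1, b: 2, c: 3 }));

function useAB() {
  // the selector returns a new object identity on each run; without `shallow`,
  // every store write (even one that only touches c) would re-render the consumer
  return useCounterStore(state => ({ a: state.a, b: state.b }), shallow);
}
```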
vendors.registry.ts (+23 -8)
@@ -1,5 +1,6 @@
 import { ModelVendorAnthropic } from './anthropic/anthropic.vendor';
 import { ModelVendorAzure } from './azure/azure.vendor';
+import { ModelVendorGemini } from './gemini/gemini.vendor';
 import { ModelVendorLocalAI } from './localai/localai.vendor';
 import { ModelVendorMistral } from './mistral/mistral.vendor';
 import { ModelVendorOllama } from './ollama/ollama.vendor';
@@ -7,20 +8,32 @@ import { ModelVendorOoobabooga } from './oobabooga/oobabooga.vendor';
 import { ModelVendorOpenAI } from './openai/openai.vendor';
 import { ModelVendorOpenRouter } from './openrouter/openrouter.vendor';

-import type { IModelVendor, ModelVendorId, ModelVendorRegistryType } from './IModelVendor';
+import type { IModelVendor } from './IModelVendor';
 import { DLLMId, DModelSource, DModelSourceId, findLLMOrThrow } from '../store-llms';

+export type ModelVendorId =
+  | 'anthropic'
+  | 'azure'
+  | 'googleai'
+  | 'localai'
+  | 'mistral'
+  | 'ollama'
+  | 'oobabooga'
+  | 'openai'
+  | 'openrouter';
+
 /** Global: Vendor Instances Registry **/
-const MODEL_VENDOR_REGISTRY: ModelVendorRegistryType = {
+const MODEL_VENDOR_REGISTRY: Record<ModelVendorId, IModelVendor> = {
   anthropic: ModelVendorAnthropic,
   azure: ModelVendorAzure,
+  googleai: ModelVendorGemini,
   localai: ModelVendorLocalAI,
   mistral: ModelVendorMistral,
   ollama: ModelVendorOllama,
   oobabooga: ModelVendorOoobabooga,
   openai: ModelVendorOpenAI,
   openrouter: ModelVendorOpenRouter,
-};
+} as Record<string, IModelVendor>;

 const MODEL_VENDOR_DEFAULT: ModelVendorId = 'openai';

@@ -31,13 +44,15 @@ export function findAllVendors(): IModelVendor[] {
   return modelVendors;
 }

-export function findVendorById(vendorId?: ModelVendorId): IModelVendor | null {
-  return vendorId ? (MODEL_VENDOR_REGISTRY[vendorId] ?? null) : null;
+export function findVendorById<TSourceSetup = unknown, TAccess = unknown, TLLMOptions = unknown>(
+  vendorId?: ModelVendorId,
+): IModelVendor<TSourceSetup, TAccess, TLLMOptions> | null {
+  return vendorId ? (MODEL_VENDOR_REGISTRY[vendorId] as IModelVendor<TSourceSetup, TAccess, TLLMOptions>) ?? null : null;
 }

-export function findVendorForLlmOrThrow(llmId: DLLMId) {
-  const llm = findLLMOrThrow(llmId);
-  const vendor = findVendorById(llm?._source.vId);
+export function findVendorForLlmOrThrow<TSourceSetup = unknown, TAccess = unknown, TLLMOptions = unknown>(llmId: DLLMId) {
+  const llm = findLLMOrThrow<TSourceSetup, TLLMOptions>(llmId);
+  const vendor = findVendorById<TSourceSetup, TAccess, TLLMOptions>(llm?._source.vId);
   if (!vendor) throw new Error(`callChat: Vendor not found for LLM ${llmId}`);
   return { llm, vendor };
 }
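
With the typed lookups in place, resolving an LLM to its vendor becomes a one-liner. A sketch (the id value is illustrative; the id format follows the `${source.id}-${model.id}` convention above):

```typescript
// Sketch: from an LLM id to its vendor's transport entry points.
const { llm, vendor } = findVendorForLlmOrThrow('openai-gpt-4');

const access = vendor.getTransportAccess(llm._source.setup);
// vendor.rpcUpdateModelsQuery(access, enabled, onSuccess)   -> model listing (react-query)
// vendor.rpcChatGenerateOrThrow(access, llm.options, msgs, fns, forceFn, maxTokens)
// vendor.streamingChatGenerateOrThrow(...)                  -> incremental updates
```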
@@ -3,9 +3,10 @@ import { createTRPCRouter } from './trpc.server';
 import { backendRouter } from '~/modules/backend/backend.router';
 import { elevenlabsRouter } from '~/modules/elevenlabs/elevenlabs.router';
 import { googleSearchRouter } from '~/modules/google/search.router';
-import { llmAnthropicRouter } from '~/modules/llms/transports/server/anthropic/anthropic.router';
-import { llmOllamaRouter } from '~/modules/llms/transports/server/ollama/ollama.router';
-import { llmOpenAIRouter } from '~/modules/llms/transports/server/openai/openai.router';
+import { llmAnthropicRouter } from '~/modules/llms/server/anthropic/anthropic.router';
+import { llmGeminiRouter } from '~/modules/llms/server/gemini/gemini.router';
+import { llmOllamaRouter } from '~/modules/llms/server/ollama/ollama.router';
+import { llmOpenAIRouter } from '~/modules/llms/server/openai/openai.router';
 import { prodiaRouter } from '~/modules/prodia/prodia.router';
 import { ytPersonaRouter } from '../../apps/personas/ytpersona.router';

@@ -17,6 +18,7 @@ export const appRouterEdge = createTRPCRouter({
   elevenlabs: elevenlabsRouter,
   googleSearch: googleSearchRouter,
   llmAnthropic: llmAnthropicRouter,
+  llmGemini: llmGeminiRouter,
   llmOllama: llmOllamaRouter,
   llmOpenAI: llmOpenAIRouter,
   prodia: prodiaRouter,
+8 -2
@@ -5,8 +5,8 @@ export const env = createEnv({
   server: {

     // Backend Postgres, for optional storage via Prisma
-    POSTGRES_PRISMA_URL: z.string().url().optional(),
-    POSTGRES_URL_NON_POOLING: z.string().url().optional(),
+    POSTGRES_PRISMA_URL: z.string().optional(),
+    POSTGRES_URL_NON_POOLING: z.string().optional(),

     // LLM: OpenAI
     OPENAI_API_KEY: z.string().optional(),
@@ -21,6 +21,9 @@ export const env = createEnv({
     ANTHROPIC_API_KEY: z.string().optional(),
     ANTHROPIC_API_HOST: z.string().url().optional(),

+    // LLM: Google AI's Gemini
+    GEMINI_API_KEY: z.string().optional(),
+
     // LLM: Mistral
     MISTRAL_API_KEY: z.string().optional(),

@@ -62,6 +65,9 @@ export const env = createEnv({
     throw new Error('Invalid environment variable');
   },

+  // matches user expectations - see https://github.com/enricoros/big-AGI/issues/279
+  emptyStringAsUndefined: true,
+
   // with Next.JS >= 13.4.4 we'd only need to destructure client variables
   experimental__runtimeEnv: {},
 });
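
Two deployment-facing effects follow from this file: a new GEMINI_API_KEY server variable, and (via emptyStringAsUndefined) blank variables now behaving as unset instead of failing validation. A sample fragment, with placeholder values:

```bash
# .env.local (placeholders; set only what you use)
OPENAI_API_KEY=sk-your-key-here
GEMINI_API_KEY=your-google-ai-key

# a blank assignment now reads as "unset" rather than as an invalid URL:
ANTHROPIC_API_HOST=
```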