Compare commits


80 Commits

Author SHA1 Message Date
Enrico Ros 0fc83cf6f5 Merge branch 'release-1.8.0' 2023-12-20 02:38:51 -08:00
Enrico Ros 2949feccd5 Maintainers Release 2023-12-20 02:32:47 -08:00
Enrico Ros d6f1c2da81 1.8.0: Readme and Changelog 2023-12-20 02:11:13 -08:00
Enrico Ros fabb433fde 1.8.0: news.data.tsx 2023-12-20 01:54:23 -08:00
Enrico Ros b57445eb14 1.8.0: Version 2023-12-20 01:11:08 -08:00
Enrico Ros 5f8f4aba78 Ollama: update models 2023-12-20 00:59:14 -08:00
Enrico Ros d693cdaeba Ollama: update admin panel 2023-12-20 00:59:03 -08:00
Enrico Ros 39fbcfd97b OpenRouter: update models 2023-12-20 00:55:27 -08:00
Enrico Ros 7694bc3d52 OpenRouter: update models 2023-12-20 00:53:16 -08:00
Enrico Ros 7f21b2ac3d Merge branch 'feature-gemini' (Fixes #275) 2023-12-20 00:16:44 -08:00
Enrico Ros fdb66da1a7 Gemini: choose a content filtering threshold 2023-12-20 00:14:53 -08:00
Enrico Ros 6b62a6733b Gemini: show block reason 2023-12-20 00:14:53 -08:00
Enrico Ros 5d62056807 Streaming: muxing format 2023-12-20 00:14:53 -08:00
Enrico Ros efff7126af Gemini: final touches 2023-12-20 00:14:53 -08:00
Enrico Ros 45046c70ed Gemini: stream on 2023-12-20 00:14:53 -08:00
Enrico Ros 7b5b852793 Gemini: trim key 2023-12-20 00:14:53 -08:00
Enrico Ros 9952b757b8 Gemini: client version 2023-12-20 00:14:53 -08:00
Enrico Ros b08ecc9012 Models Modal: improve caps 2023-12-20 00:14:53 -08:00
Enrico Ros bc5a38fa89 Models List: show a helpful message 2023-12-20 00:14:53 -08:00
Enrico Ros bee49a4b1c Llms: streaming as a vendor function (then all directed to the unified) 2023-12-20 00:14:53 -08:00
Enrico Ros 0ece1ce58c Llms: vendor-specific RPC to ChatGenerate 2023-12-20 00:14:53 -08:00
Enrico Ros fd897b55b2 Llms: improve list generics 2023-12-20 00:14:53 -08:00
Enrico Ros dd41a402d0 Llms: move models modal 2023-12-20 00:14:53 -08:00
Enrico Ros 3f9defd18c Llms: restructure 2023-12-20 00:14:53 -08:00
Enrico Ros 49c77f5a10 Llms: cleanup model lists (bits) 2023-12-20 00:14:52 -08:00
Enrico Ros 6b2bfa6060 Llms: cleanup model lists 2023-12-20 00:14:52 -08:00
Enrico Ros 8e3f247bfb Gemini: cleaner 2023-12-20 00:14:52 -08:00
Enrico Ros 201e3a7252 Streaming: cleanup 2023-12-20 00:14:52 -08:00
Enrico Ros 044ed4df79 Bits for the future 2023-12-20 00:14:52 -08:00
Enrico Ros 0df7297cca Gemini: configuration, list models, and immediate generation 2023-12-20 00:14:52 -08:00
Enrico Ros 453a3e5751 LLM Vendors: auto IDs 2023-12-20 00:14:52 -08:00
Enrico Ros 34c1c425b9 Gemini: backend env var 2023-12-20 00:14:52 -08:00
Enrico Ros e0a010189f LLMOptions Modal: fix display 2023-12-20 00:14:52 -08:00
Enrico Ros 7a07f10ed1 Move ModelVendor enum 2023-12-20 00:14:52 -08:00
Enrico Ros 33cb2b84b2 Anthropic: allow for 39 chars sks 2023-12-20 00:13:58 -08:00
Enrico Ros 3adec85e1f Fix shortcuts on Mac. 2023-12-18 19:59:03 -08:00
Enrico Ros 18cfe5e296 DB: drop URL validation for POSTGRES_PRISMA_URL. #277 2023-12-18 15:16:02 -08:00
Enrico Ros 566ba366b4 Merge pull request #280 ([Visualize] Add custom instruction #218) 2023-12-18 12:19:03 -08:00
Enrico Ros 7ed653b315 Fix. 2023-12-18 04:54:04 -08:00
Enrico Ros cb333c33d7 Better 1-click deployment, fixes #279 2023-12-18 03:22:18 -08:00
Joris Kalz 22ba37074b [Visualize] Add custom instruction #218 2023-12-16 23:22:47 +01:00
Enrico Ros 84d7b7644a Ollama: update models 2023-12-15 15:48:41 -08:00
Enrico Ros 71445dafc8 Ollama: improved diagram 2023-12-15 15:29:56 -08:00
Enrico Ros 66a5ad7f00 Ollama: update md 2023-12-15 15:27:11 -08:00
Enrico Ros 09f80adfaa Ollama: update md 2023-12-15 15:26:38 -08:00
Enrico Ros 9febd97065 Ollama: update md 2023-12-15 15:24:48 -08:00
Enrico Ros 5219f9928d Ollama: update md 2023-12-15 15:24:13 -08:00
Enrico Ros aec9f4665f Update config-ollama.md 2023-12-15 15:23:48 -08:00
Enrico Ros db48465204 Ollama: document network issue resolution. #276 2023-12-15 15:20:33 -08:00
Enrico Ros c2c858730a Bite the bullet with Zustand 2023-12-13 14:57:06 -08:00
Enrico Ros 402bde9a81 Newpad 2023-12-13 02:06:19 -08:00
Enrico Ros ba1c0ba0d9 Enforce a Single instance (Tab) of the app. Closes #268 2023-12-13 00:09:56 -08:00
Enrico Ros 084d77cd78 Linting 2023-12-12 18:24:59 -08:00
Enrico Ros 30c17a9b73 Roll Joy 2023-12-12 18:10:46 -08:00
Enrico Ros 2442463da3 deploy-docker.md: update Official guide 2023-12-12 17:52:28 -08:00
Enrico Ros 84a3e8cfdb Fix docker-compose to point to the 'latest' (stable) version, instead of the no more existing 'main' 2023-12-12 17:17:30 -08:00
Enrico Ros 6ae440d252 1.7.3: Patch release for Mistral support 2023-12-12 17:01:40 -08:00
Enrico Ros c0c724afc1 Mistral Platform: full support (Closes #273) 2023-12-12 16:39:06 -08:00
Enrico Ros a265112ce1 Mistral Platform: backend-configurable support (#273) 2023-12-12 16:39:06 -08:00
Enrico Ros 75605ed408 Dropdown: support model vendor icons 2023-12-12 16:39:06 -08:00
Enrico Ros ad38ff4157 LLMs: safer and smarter access 2023-12-12 16:39:06 -08:00
Enrico Ros 08c60e53b1 LLMs: reorder template params 2023-12-12 16:39:06 -08:00
Enrico Ros d0dcb2ac02 LLMs: getTransportAccess 2023-12-12 16:39:06 -08:00
Enrico Ros fbeb604b26 Update README.md 2023-12-12 03:42:05 -08:00
Enrico Ros c4f3b1df77 Update README.md 2023-12-12 03:40:44 -08:00
Enrico Ros 5a1f9caaac Roll rest 2023-12-12 03:16:35 -08:00
Enrico Ros 2fc70d5e95 Roll other dev deps 2023-12-12 03:12:43 -08:00
Enrico Ros 43adadef78 Roll Material/Joy/Next 2023-12-12 03:11:14 -08:00
Enrico Ros 96f6e7628b Roll Prisma 2023-12-12 03:08:10 -08:00
Enrico Ros 32ad82bcee Drag/Drop: do not remove the text from the source 2023-12-12 03:07:31 -08:00
Enrico Ros 3d72aec369 Roll pdfjs-dist 2023-12-12 02:58:06 -08:00
Enrico Ros d244ee2cca Update Docker image workflow. Assume vX.Y.Z is the latest (and gets the 'latest' tag); the 'stable' tag is removed, as 'latest' is better. The 'main' branch keeps the 'development' tag. 2023-12-12 01:38:57 -08:00
Enrico Ros cc8a235ae3 Bits 2023-12-12 01:21:43 -08:00
Enrico Ros ae348812de OpenRouter: improve showing of discounted models 2023-12-12 01:14:33 -08:00
Enrico Ros 6053636f66 OpenRouter: OAuth login support 2023-12-11 22:35:40 -08:00
Enrico Ros f2e2aee672 1.7.2: Stable Patch Version 2023-12-11 21:22:31 -08:00
Enrico Ros 11cbb2bbf0 OpenRouter: update models 2023-12-11 21:21:22 -08:00
Enrico Ros 30bd19d6ce HTML Table to Markdown Table: improve reliability and ignore hidden data 2023-12-11 20:46:34 -08:00
Enrico Ros d0b5c02062 Improve how Stream errors are shown 2023-12-11 18:22:15 -08:00
Enrico Ros 771192e406 Ollama: support ollama errors via API 2023-12-11 18:19:38 -08:00
97 changed files with 2904 additions and 1129 deletions
@@ -65,7 +65,11 @@ I need the following from you:
### GitHub release
Now paste the former release (or 1.5.0 which was accurate and great), including the new contributors and
```markdown
Please create the 1.2.3 Release Notes for GitHub. The following were the Release Notes for 1.1.0. Use a truthful and honest tone, understanding that people's time and attention spans are short. Today is 2023-12-20.
```
Now paste-attachment the former release notes (or 1.5.0 which was accurate and great), including the new contributors and
some stats (# of commits, etc.), and roll it for the new release.
### Discord announcement
+1 -1
@@ -13,7 +13,7 @@ on:
push:
branches:
- main
- main-stable # Trigger on pushes to the main-stable branch
#- main-stable # Disabled as the v* tag is used for stable releases
tags:
- 'v*' # Trigger on version tags (e.g., v1.7.0)
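With the `main-stable` branch trigger disabled, stable Docker images are now built from version tags alone. A minimal sketch of cutting a release under that convention (the tag name is illustrative):

```bash
# Pushing a v* tag (e.g. v1.8.0) triggers the stable image build
git tag v1.8.0
git push origin v1.8.0
```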
+18 -7
@@ -1,8 +1,8 @@
# BIG-AGI 🧠✨
Welcome to big-AGI 👋, the GPT application for professionals that need form, function,
simplicity, and speed. Powered by the latest models from 7 vendors, including
open-source, `big-AGI` offers best-in-class Voice and Chat with AI Personas,
Welcome to big-AGI 👋, the GPT application for professionals that need function, form,
simplicity, and speed. Powered by the latest models from 8 vendors and
open-source model servers, `big-AGI` offers best-in-class Voice and Chat with AI Personas,
visualizations, coding, drawing, calling, and quite more -- all in a polished UX.
Pros use big-AGI. 🚀 Developers love big-AGI. 🤖
@@ -11,7 +11,7 @@ Pros use big-AGI. 🚀 Developers love big-AGI. 🤖
Or fork & run on Vercel
[![Deploy with Vercel](https://vercel.com/button)](https://vercel.com/new/clone?repository-url=https%3A%2F%2Fgithub.com%2Fenricoros%2Fbig-agi&env=OPENAI_API_KEY,OPENAI_API_HOST&envDescription=OpenAI%20KEY%20for%20your%20deployment.%20Set%20HOST%20only%20if%20non-default.)
[![Deploy with Vercel](https://vercel.com/button)](https://vercel.com/new/clone?repository-url=https%3A%2F%2Fgithub.com%2Fenricoros%2Fbig-agi&env=OPENAI_API_KEY&envDescription=Backend%20API%20keys%2C%20optional%20and%20may%20be%20overridden%20by%20the%20UI.&envLink=https%3A%2F%2Fgithub.com%2Fenricoros%2Fbig-AGI%2Fblob%2Fmain%2Fdocs%2Fenvironment-variables.md&project-name=big-agi)
## 👉 [roadmap](https://github.com/users/enricoros/projects/4/views/2)
@@ -21,7 +21,19 @@ shows the current developments and future ideas.
- Got a suggestion? [_Add your roadmap ideas_](https://github.com/enricoros/big-agi/issues/new?&template=roadmap-request.md)
- Want to contribute? [_Pick up a task!_](https://github.com/users/enricoros/projects/4/views/4) - _easy_ to _pro_
### What's New in 1.7.1 · Dec 11, 2023 · Attachment Theory 🌟
### What's New in 1.8.0 · Dec 20, 2023 · To The Moon And Back · 🚀🌕🔙
- **Google Gemini Support**: Use the newest Google models. [#275](https://github.com/enricoros/big-agi/issues/275)
- **Mistral Platform**: Mixtral and future models support. [#273](https://github.com/enricoros/big-agi/issues/273)
- **Diagram Instructions**. Thanks to @joriskalz! [#280](https://github.com/enricoros/big-agi/pull/280)
- Ollama Chats: Enhanced chatting experience. [#270](https://github.com/enricoros/big-agi/issues/270)
- Mac Shortcuts Fix: Improved UX on Mac
- **Single-Tab Mode**: Data integrity with single window. [#268](https://github.com/enricoros/big-agi/issues/268)
- **Updated Models**: Latest Ollama (v0.1.17) and OpenRouter models
- Official Downloads: Easy access to the latest big-AGI on [big-AGI.com](https://big-agi.com)
- For developers: [troubleshot networking](https://github.com/enricoros/big-AGI/issues/276#issuecomment-1858591483), fixed Vercel deployment, cleaned up the LLMs/Streaming framework
### What's New in 1.7.0 · Dec 11, 2023
- **Attachments System Overhaul**: Drag, paste, link, snap, text, images, PDFs and more. [#251](https://github.com/enricoros/big-agi/issues/251)
- **Desktop Webcam Capture**: Image capture now available as Labs feature. [#253](https://github.com/enricoros/big-agi/issues/253)
@@ -31,7 +43,6 @@ shows the current developments and future ideas.
- Optimized Voice Input and Performance
- Latest Ollama and Oobabooga models
- For developers: **Password Protection**: HTTP Basic Auth. [Learn How](https://github.com/enricoros/big-agi/blob/main/docs/deploy-authentication.md)
- [1.7.1]: Improved Ollama chats. [#270](https://github.com/enricoros/big-agi/issues/270)
### What's New in 1.6.0 - Nov 28, 2023
@@ -146,7 +157,7 @@ Please refer to the [Cloudflare deployment documentation](docs/deploy-cloudflare
Create your GitHub fork, create a Vercel project over that fork, and deploy it. Or press the button below for convenience.
[![Deploy with Vercel](https://vercel.com/button)](https://vercel.com/new/clone?repository-url=https%3A%2F%2Fgithub.com%2Fenricoros%2Fbig-agi&env=OPENAI_API_KEY,OPENAI_API_HOST&envDescription=OpenAI%20KEY%20for%20your%20deployment.%20Set%20HOST%20only%20if%20non-default.)
[![Deploy with Vercel](https://vercel.com/button)](https://vercel.com/new/clone?repository-url=https%3A%2F%2Fgithub.com%2Fenricoros%2Fbig-agi&env=OPENAI_API_KEY&envDescription=Backend%20API%20keys%2C%20optional%20and%20may%20be%20overridden%20by%20the%20UI.&envLink=https%3A%2F%2Fgithub.com%2Fenricoros%2Fbig-AGI%2Fblob%2Fmain%2Fdocs%2Fenvironment-variables.md&project-name=big-agi)
## Integrations:
+1 -1
@@ -1,2 +1,2 @@
export const runtime = 'edge';
export { openaiStreamingRelayHandler as POST } from '~/modules/llms/transports/server/openai/openai.streaming';
export { llmStreamingRelayHandler as POST } from '~/modules/llms/server/llm.server.streaming';
+1 -1
@@ -6,7 +6,7 @@ version: '3.9'
services:
big-agi:
image: ghcr.io/enricoros/big-agi:main
image: ghcr.io/enricoros/big-agi:latest
ports:
- "3000:3000"
env_file:
+15 -4
@@ -5,12 +5,24 @@ by release.
- For the live roadmap, please see [the GitHub project](https://github.com/users/enricoros/projects/4/views/2)
### 1.8.0 - Dec 2023
### 1.9.0 - Dec 2023
- work in progress: [big-AGI open roadmap](https://github.com/users/enricoros/projects/4/views/2), [help here](https://github.com/users/enricoros/projects/4/views/4)
- milestone: [1.8.0](https://github.com/enricoros/big-agi/milestone/8)
- milestone: [1.9.0](https://github.com/enricoros/big-agi/milestone/9)
### What's New in 1.7.1 · Dec 11, 2023 · Attachment Theory 🌟
### What's New in 1.8.0 · Dec 20, 2023 · To The Moon And Back · 🚀🌕🔙
- **Google Gemini Support**: Use the newest Google models. [#275](https://github.com/enricoros/big-agi/issues/275)
- **Mistral Platform**: Mixtral and future models support. [#273](https://github.com/enricoros/big-agi/issues/273)
- **Diagram Instructions**. Thanks to @joriskalz! [#280](https://github.com/enricoros/big-agi/pull/280)
- Ollama Chats: Enhanced chatting experience. [#270](https://github.com/enricoros/big-agi/issues/270)
- Mac Shortcuts Fix: Improved UX on Mac
- **Single-Tab Mode**: Data integrity with single window. [#268](https://github.com/enricoros/big-agi/issues/268)
- **Updated Models**: Latest Ollama (v0.1.17) and OpenRouter models
- Official Downloads: Easy access to the latest big-AGI on [big-AGI.com](https://big-agi.com)
- For developers: [troubleshot networking](https://github.com/enricoros/big-AGI/issues/276#issuecomment-1858591483), fixed Vercel deployment, cleaned up the LLMs/Streaming framework
### What's New in 1.7.0 · Dec 11, 2023 · Attachment Theory
- **Attachments System Overhaul**: Drag, paste, link, snap, text, images, PDFs and more. [#251](https://github.com/enricoros/big-agi/issues/251)
- **Desktop Webcam Capture**: Image capture now available as Labs feature. [#253](https://github.com/enricoros/big-agi/issues/253)
@@ -20,7 +32,6 @@ by release.
- Optimized Voice Input and Performance
- Latest Ollama and Oobabooga models
- For developers: **Password Protection**: HTTP Basic Auth. [Learn How](https://github.com/enricoros/big-agi/blob/main/docs/deploy-authentication.md)
- [1.7.1]: Improved Ollama chats. [#270](https://github.com/enricoros/big-agi/issues/270)
### What's New in 1.6.0 - Nov 28, 2023 · Surf's Up
+1 -1
@@ -30,5 +30,5 @@ For instance with [Use luna-ai-llama2 with docker compose](https://localai.io/ba
> NOTE: LocalAI does not list details about the models. Every model is assumed to be
> capable of chatting, and with a context window of 4096 tokens.
> Please update the [src/modules/llms/transports/server/openai/models.data.ts](../src/modules/llms/transports/server/openai/models.data.ts)
> Please update the [src/modules/llms/transports/server/openai/models.data.ts](../src/modules/llms/server/openai/models.data.ts)
> file with the mapping information between LocalAI model IDs and names/descriptions/tokens, etc.
+24 -12
@@ -5,13 +5,15 @@ This guide helps you connect [Ollama](https://ollama.ai) [models](https://ollama
experience. The integration brings the popular big-AGI features to Ollama, including: voice chats,
editing tools, models switching, personas, and more.
_Last updated Dec 11, 2023_
_Last updated Dec 16, 2023_
![config-local-ollama-0-example.png](pixels/config-ollama-0-example.png)
## Quick Integration Guide
1. **Ensure Ollama API Server is Running**: Follow the official instructions to get Ollama up and running on your machine
- For detailed instructions on setting up the Ollama API server, please refer to the
[Ollama download page](https://ollama.ai/download) and [instructions for linux](https://github.com/jmorganca/ollama/blob/main/docs/linux.md).
2. **Add Ollama as a Model Source**: In `big-AGI`, navigate to the **Models** section, select **Add a model source**, and choose **Ollama**
3. **Enter Ollama Host URL**: Provide the Ollama Host URL where the API server is accessible (e.g., `http://localhost:11434`)
4. **Refresh Model List**: Once connected, refresh the list of available models to include the Ollama models
@@ -20,21 +22,29 @@ _Last updated Dec 11, 2023_
you'll have to press the 'Pull' button again, until a green message appears.
5. **Chat with Ollama models**: select an Ollama model and begin chatting with AI personas
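The model pull in step 4 can also be done from the Ollama CLI before connecting big-AGI; a quick sketch, assuming Ollama is installed on the same machine (the "yi" model name matches the admin-panel example below):

```bash
# Equivalent to the admin panel's 'Pull' button
ollama pull yi
# Optional: verify the API answers on the default host URL from step 3
curl http://localhost:11434/api/tags
```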
### Ollama: installation and Setup
**Visual Configuration Guide**:
For detailed instructions on setting up the Ollama API server, please refer to the
[Ollama download page](https://ollama.ai/download) and [instructions for linux](https://github.com/jmorganca/ollama/blob/main/docs/linux.md).
* After adding the `Ollama` model vendor, entering the IP address of an Ollama server, and refreshing models:<br/>
<img src="pixels/config-ollama-1-models.png" alt="config-local-ollama-1-models.png" width="320">
### Visual Guide
* The `Ollama` admin panel, with the `Pull` button highlighted, after pulling the "Yi" model:<br/>
<img src="pixels/config-ollama-2-admin-pull.png" alt="config-local-ollama-2-admin-pull.png" width="320">
* After adding the `Ollama` model vendor, entering the IP address of an Ollama server, and refreshing models:
<img src="pixels/config-ollama-1-models.png" alt="config-local-ollama-1-models.png" style="max-width: 320px;">
* You can now switch model/persona dynamically and text/voice chat with the models:<br/>
<img src="pixels/config-ollama-3-chat.png" alt="config-local-ollama-3-chat.png" width="320">
* The `Ollama` admin panel, with the `Pull` button highlighted, after pulling the "Yi" model:
<img src="pixels/config-ollama-2-admin-pull.png" alt="config-local-ollama-2-admin-pull.png" style="max-width: 320px;">
<br/>
* You can now switch model/persona dynamically and text/voice chat with the models:
<img src="pixels/config-ollama-3-chat.png" alt="config-local-ollama-3-chat.png" style="max-width: 320px;">
### ⚠️ Network Troubleshooting
If you get errors about the server having trouble connecting with Ollama, please see
[this message](https://github.com/enricoros/big-AGI/issues/276#issuecomment-1858591483) on Issue #276.
And in brief, make sure the Ollama endpoint is accessible from the servers where you run big-AGI (which could
be localhost or cloud servers).
![Ollama Networking Chart](pixels/config-ollama-network.png)
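The reachability check can be scripted; a sketch run from the machine hosting big-AGI, where `YOUR_OLLAMA_HOST` is a placeholder for the address configured in the Models panel:

```bash
# The endpoint must answer from wherever the big-AGI server runs
curl -i http://YOUR_OLLAMA_HOST:11434/api/tags
# If this times out, bind Ollama to all interfaces on the Ollama machine:
#   OLLAMA_HOST=0.0.0.0 ollama serve
```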
<br/>
### Advanced: Model parameters
@@ -73,6 +83,8 @@ Then, edit the nginx configuration file `/etc/nginx/sites-enabled/default` and a
Reach out to our community if you need help with this.
<br/>
### Community and Support
Join our community to share your experiences, get help, and discuss best practices:
@@ -83,4 +95,4 @@ Join our community to share your experiences, get help, and discuss best practic
---
`big-AGI` is committed to providing a powerful, intuitive, and privacy-respecting AI experience.
We are excited for you to explore the possibilities with Ollama models. Happy creating!
We are excited for you to explore the possibilities with Ollama models. Happy creating!
+37 -20
@@ -21,33 +21,23 @@ Docker ensures faster development cycles, easier collaboration, and seamless env
```
4. Browse to [http://localhost:3000](http://localhost:3000)
## Documentation
<br/>
The big-AGI repository includes a Dockerfile and a GitHub Actions workflow for building and publishing a
Docker image of the application.
## Run Official Containers 📦
### Dockerfile
`big-AGI` is pre-built from source code and published as a Docker image on the GitHub Container Registry (ghcr).
The build process is transparent, and happens via GitHub Actions, as described in the
[`docker-image.yml`](../.github/workflows/docker-image.yml) file.
The [`Dockerfile`](../Dockerfile) describes how to create a Docker image. It establishes a Node.js environment,
installs dependencies, and creates a production-ready version of the application as a local container.
### Official Images: [ghcr.io/enricoros/big-agi](https://github.com/enricoros/big-agi/pkgs/container/big-agi)
### Official container images
#### Run using *docker* 🚀
The [`.github/workflows/docker-image.yml`](../.github/workflows/docker-image.yml) file automates the
building and publishing of the Docker images to the GitHub Container Registry (ghcr) when changes are
pushed to the `main` branch.
Official pre-built containers: [ghcr.io/enricoros/big-agi](https://github.com/enricoros/big-agi/pkgs/container/big-agi)
Run official pre-built containers:
```bash
docker run -d -p 3000:3000 ghcr.io/enricoros/big-agi
docker run -d -p 3000:3000 ghcr.io/enricoros/big-agi:latest
```
### Run official containers
In addition, the repository also includes a `docker-compose.yaml` file, configured to run the pre-built
'ghcr image'. This file is used to define the `big-agi` service, the ports to expose, and the command to run.
#### Run using *docker-compose* 🚀
If you have Docker Compose installed, you can run the Docker container with `docker-compose up`
to pull the Docker image (if it hasn't been pulled already) and start a Docker container. If you want to
@@ -57,4 +47,31 @@ update the image to the latest version, you can run `docker-compose pull` before
docker-compose up -d
```
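Combined, an update cycle is just the two commands, run from the directory containing the compose file:

```bash
# Refresh the pre-built image, then recreate the container
docker-compose pull
docker-compose up -d
```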
Leverage Docker's capabilities for a reliable and efficient big-AGI deployment.
### Make Local Services Visible to Docker 🌐
To make local services running on your host machine accessible to a Docker container, such as a
[Browseless](./config-browse.md) service or a local API, you can follow this simplified guide:
| Operating System | Steps to Make Local Services Visible to Docker |
|:------------------|:---------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------|
| Windows and macOS | Use the special DNS name `host.docker.internal` to refer to the host machine from within the Docker container. No additional network configuration is required. Access local services using `host.docker.internal:<PORT>`. |
| Linux | Two options: *A*. Use <ins>--network="host"</ins> (`docker run --network="host" -d big-agi`) when running the Docker container to merge the container within the host network stack; however, this reduces container isolation. Alternatively: *B*. Connect to local services <ins>using the host's IP address</ins> directly, as host.docker.internal is not available by default on Linux. |
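For example, to let the container reach an Ollama server running on the host, one could pass the `OLLAMA_API_HOST` variable (from environment-variables.md) pointing at the special DNS name; a sketch for Windows/macOS, with the host's IP substituted on Linux:

```bash
docker run -d -p 3000:3000 \
  -e OLLAMA_API_HOST=http://host.docker.internal:11434 \
  ghcr.io/enricoros/big-agi:latest
```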
<br/>
### More Information
The [`Dockerfile`](../Dockerfile) describes how to create a Docker image. It establishes a Node.js environment,
installs dependencies, and creates a production-ready version of the application as a local container.
The [`docker-compose.yaml`](../docker-compose.yaml) file is configured to run the
official image (big-agi:latest). This file is used to define the `big-agi` service, to expose
port 3000 on the host, and launch big-AGI within the container (startup command).
The [`.github/workflows/docker-image.yml`](../.github/workflows/docker-image.yml) file is used
to build the Official Docker images and publish them to the GitHub Container Registry (ghcr).
The build process is transparent and happens via GitHub Actions.
<br/>
Leverage Docker's capabilities for a reliable and efficient big-AGI deployment!
+1 -1
@@ -12,7 +12,7 @@ version: '3.9'
services:
big-agi:
image: ghcr.io/enricoros/big-agi:main
image: ghcr.io/enricoros/big-agi:latest
ports:
- "3000:3000"
env_file:
+6 -5
@@ -24,6 +24,8 @@ AZURE_OPENAI_API_ENDPOINT=
AZURE_OPENAI_API_KEY=
ANTHROPIC_API_KEY=
ANTHROPIC_API_HOST=
GEMINI_API_KEY=
MISTRAL_API_KEY=
OLLAMA_API_HOST=
OPENROUTER_API_KEY=
@@ -45,7 +47,7 @@ PUPPETEER_WSS_ENDPOINT=
# Backend Analytics
BACKEND_ANALYTICS=
# Backend HTTP Basic Authentication
# Backend HTTP Basic Authentication (see `deploy-authentication.md` for turning on authentication)
HTTP_BASIC_AUTH_USERNAME=
HTTP_BASIC_AUTH_PASSWORD=
```
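For instance, a minimal `.env.local` enabling the two vendors added in this release could contain just the new keys (values are placeholders, not real keys):

```bash
GEMINI_API_KEY=your-gemini-api-key
MISTRAL_API_KEY=your-mistral-api-key
```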
@@ -79,6 +81,8 @@ requiring the user to enter an API key
| `AZURE_OPENAI_API_KEY` | Azure OpenAI API key, see [config-azure-openai.md](config-azure-openai.md) | Optional, but if set `AZURE_OPENAI_API_ENDPOINT` must also be set |
| `ANTHROPIC_API_KEY` | The API key for Anthropic | Optional |
| `ANTHROPIC_API_HOST` | Changes the backend host for the Anthropic vendor, to enable platforms such as [config-aws-bedrock.md](config-aws-bedrock.md) | Optional |
| `GEMINI_API_KEY` | The API key for Google AI's Gemini | Optional |
| `MISTRAL_API_KEY` | The API key for Mistral | Optional |
| `OLLAMA_API_HOST` | Changes the backend host for the Ollama vendor. See [config-ollama.md](config-ollama.md) | |
| `OPENROUTER_API_KEY` | The API key for OpenRouter | Optional |
@@ -113,10 +117,7 @@ Enable the app to Talk, Draw, and Google things up.
| `PUPPETEER_WSS_ENDPOINT` | Puppeteer WebSocket endpoint - used for browsing, etc. |
| **Backend** | |
| `BACKEND_ANALYTICS` | Semicolon-separated list of analytics flags (see backend.analytics.ts). Flags: `domain` logs the responding domain. |
| `HTTP_BASIC_AUTH_USERNAME` | Username for HTTP Basic Authentication. See the [Authentication](deploy-authentication.md) guide. |
| `HTTP_BASIC_AUTH_USERNAME` | See the [Authentication](deploy-authentication.md) guide. Username for HTTP Basic Authentication. |
| `HTTP_BASIC_AUTH_PASSWORD` | Password for HTTP Basic Authentication. |
---
Binary file not shown (image · 79 KiB).
+494 -256
File diff suppressed because it is too large
+17 -17
@@ -1,6 +1,6 @@
{
"name": "big-agi",
"version": "1.7.1",
"version": "1.8.0",
"private": true,
"scripts": {
"dev": "next dev",
@@ -18,13 +18,13 @@
"@emotion/react": "^11.11.1",
"@emotion/server": "^11.11.0",
"@emotion/styled": "^11.11.0",
"@mui/icons-material": "^5.14.18",
"@mui/joy": "^5.0.0-beta.15",
"@next/bundle-analyzer": "^14.0.3",
"@prisma/client": "^5.6.0",
"@mui/icons-material": "^5.15.0",
"@mui/joy": "^5.0.0-beta.18",
"@next/bundle-analyzer": "^14.0.4",
"@prisma/client": "^5.7.0",
"@sanity/diff-match-patch": "^3.1.1",
"@t3-oss/env-nextjs": "^0.7.1",
"@tanstack/react-query": "^4.36.1",
"@tanstack/react-query": "~4.36.1",
"@trpc/client": "^10.44.1",
"@trpc/next": "^10.44.1",
"@trpc/react-query": "^10.44.1",
@@ -33,8 +33,8 @@
"browser-fs-access": "^0.35.0",
"eventsource-parser": "^1.1.1",
"idb-keyval": "^6.2.1",
"next": "^14.0.3",
"pdfjs-dist": "4.0.189",
"next": "^14.0.4",
"pdfjs-dist": "4.0.269",
"plantuml-encoder": "^1.4.0",
"prismjs": "^1.29.0",
"react": "^18.2.0",
@@ -47,23 +47,23 @@
"tesseract.js": "^5.0.3",
"uuid": "^9.0.1",
"zod": "^3.22.4",
"zustand": "~4.3.9"
"zustand": "^4.4.7"
},
"devDependencies": {
"@cloudflare/puppeteer": "^0.0.5",
"@types/node": "^20.10.0",
"@types/node": "^20.10.4",
"@types/plantuml-encoder": "^1.4.2",
"@types/prismjs": "^1.26.3",
"@types/react": "^18.2.38",
"@types/react": "^18.2.45",
"@types/react-dom": "^18.2.17",
"@types/react-katex": "^3.0.3",
"@types/react-katex": "^3.0.4",
"@types/react-timeago": "^4.1.6",
"@types/uuid": "^9.0.7",
"eslint": "^8.54.0",
"eslint-config-next": "^14.0.3",
"prettier": "^3.1.0",
"prisma": "^5.6.0",
"typescript": "^5.3.2"
"eslint": "^8.55.0",
"eslint-config-next": "^14.0.4",
"prettier": "^3.1.1",
"prisma": "^5.7.0",
"typescript": "^5.3.3"
},
"engines": {
"node": "^20.0.0 || ^18.0.0"
+10 -7
@@ -11,6 +11,7 @@ import '~/common/styles/CodePrism.css';
import '~/common/styles/GithubMarkdown.css';
import { ProviderBackend } from '~/common/state/ProviderBackend';
import { ProviderSingleTab } from '~/common/state/ProviderSingleTab';
import { ProviderSnacks } from '~/common/state/ProviderSnacks';
import { ProviderTRPCQueryClient } from '~/common/state/ProviderTRPCQueryClient';
import { ProviderTheming } from '~/common/state/ProviderTheming';
@@ -25,13 +26,15 @@ const MyApp = ({ Component, emotionCache, pageProps }: MyAppProps) =>
</Head>
<ProviderTheming emotionCache={emotionCache}>
<ProviderTRPCQueryClient>
<ProviderSnacks>
<ProviderBackend>
<Component {...pageProps} />
</ProviderBackend>
</ProviderSnacks>
</ProviderTRPCQueryClient>
<ProviderSingleTab>
<ProviderTRPCQueryClient>
<ProviderSnacks>
<ProviderBackend>
<Component {...pageProps} />
</ProviderBackend>
</ProviderSnacks>
</ProviderTRPCQueryClient>
</ProviderSingleTab>
</ProviderTheming>
<VercelAnalytics debug={false} />
+98
@@ -0,0 +1,98 @@
import * as React from 'react';
import { useRouter } from 'next/router';
import { Box, Typography } from '@mui/joy';
import { useModelsStore } from '~/modules/llms/store-llms';
import { AppLayout } from '~/common/layout/AppLayout';
import { InlineError } from '~/common/components/InlineError';
import { apiQuery } from '~/common/util/trpc.client';
import { navigateToIndex } from '~/common/app.routes';
import { openLayoutModelsSetup } from '~/common/layout/store-applayout';
function CallbackOpenRouterPage(props: { openRouterCode: string | undefined }) {
// external state
const { data, isError, error, isLoading } = apiQuery.backend.exchangeOpenRouterKey.useQuery({ code: props.openRouterCode || '' }, {
enabled: !!props.openRouterCode,
refetchOnWindowFocus: false,
staleTime: Infinity,
});
// derived state
const isErrorInput = !props.openRouterCode;
const openRouterKey = data?.key ?? undefined;
const isSuccess = !!openRouterKey;
// Success: save the key and redirect to the chat app
React.useEffect(() => {
if (!isSuccess)
return;
// 1. Save the key as the client key
useModelsStore.getState().setOpenRoutersKey(openRouterKey);
// 2. Navigate to the chat app
navigateToIndex(true).then(() => openLayoutModelsSetup());
}, [isSuccess, openRouterKey]);
return (
<Box sx={{
flexGrow: 1,
backgroundColor: 'background.level1',
overflowY: 'auto',
display: 'flex', justifyContent: 'center',
p: { xs: 3, md: 6 },
}}>
<Box sx={{
// my: 'auto',
display: 'flex', flexDirection: 'column', alignItems: 'center',
gap: 4,
}}>
<Typography level='title-lg'>
Welcome Back
</Typography>
{isLoading && <Typography level='body-sm'>Loading...</Typography>}
{isErrorInput && <InlineError error='There was an issue retrieving the code from OpenRouter.' />}
{isError && <InlineError error={error} />}
{data && (
<Typography level='body-md'>
Success! You can now close this window.
</Typography>
)}
</Box>
</Box>
);
}
/**
* This page will be invoked by OpenRouter as a Callback
*
* Docs: https://openrouter.ai/docs#oauth
* Example URL: https://localhost:3000/link/callback_openrouter?code=SomeCode
*/
export default function Page() {
// get the 'code=...' from the URL
const { query } = useRouter();
const { code: openRouterCode } = query;
return (
<AppLayout suspendAutoModelsSetup>
<CallbackOpenRouterPage openRouterCode={openRouterCode as (string | undefined)} />
</AppLayout>
);
}
File diff suppressed because one or more lines are too long
File diff suppressed because one or more lines are too long
+2 -3
@@ -15,8 +15,7 @@ import { useChatLLMDropdown } from '../chat/components/applayout/useLLMDropdown'
import { EXPERIMENTAL_speakTextStream } from '~/modules/elevenlabs/elevenlabs.client';
import { SystemPurposeId, SystemPurposes } from '../../data';
import { VChatMessageIn } from '~/modules/llms/transports/chatGenerate';
import { streamChat } from '~/modules/llms/transports/streamChat';
import { llmStreamingChatGenerate, VChatMessageIn } from '~/modules/llms/llm.client';
import { useElevenLabsVoiceDropdown } from '~/modules/elevenlabs/useElevenLabsVoiceDropdown';
import { Link } from '~/common/components/Link';
@@ -216,7 +215,7 @@ export function CallUI(props: {
responseAbortController.current = new AbortController();
let finalText = '';
let error: any | null = null;
streamChat(chatLLMId, callPrompt, responseAbortController.current.signal, (updatedMessage: Partial<DMessage>) => {
llmStreamingChatGenerate(chatLLMId, callPrompt, null, null, responseAbortController.current.signal, (updatedMessage: Partial<DMessage>) => {
const text = updatedMessage.text?.trim();
if (text) {
finalText = text;
+1 -1
@@ -3,7 +3,7 @@ import * as React from 'react';
import { Chip, ColorPaletteProp, VariantProp } from '@mui/joy';
import { SxProps } from '@mui/joy/styles/types';
import { VChatMessageIn } from '~/modules/llms/transports/chatGenerate';
import type { VChatMessageIn } from '~/modules/llms/llm.client';
export function CallMessage(props: {
@@ -23,14 +23,26 @@ function AppBarLLMDropdown(props: {
const llmItems: DropdownItems = {};
let prevSourceId: DModelSourceId | null = null;
for (const llm of props.llms) {
if (!llm.hidden || llm.id === props.chatLlmId) {
if (!prevSourceId || llm.sId !== prevSourceId) {
if (prevSourceId)
llmItems[`sep-${llm.id}`] = { type: 'separator', title: llm.sId };
prevSourceId = llm.sId;
}
llmItems[llm.id] = { title: llm.label };
// filter-out hidden models
if (!(!llm.hidden || llm.id === props.chatLlmId))
continue;
// add separators when changing sources
if (!prevSourceId || llm.sId !== prevSourceId) {
if (prevSourceId)
llmItems[`sep-${llm.id}`] = {
type: 'separator',
title: llm.sId,
};
prevSourceId = llm.sId;
}
// add the model item
llmItems[llm.id] = {
title: llm.label,
// icon: llm.id.startsWith('some vendor') ? <VendorIcon /> : undefined,
};
}
const handleChatLLMChange = (_event: any, value: DLLMId | null) => value && props.setChatLlmId(value);
@@ -331,7 +331,8 @@ export function Composer(props: {
const handleOverlayDragOver = React.useCallback((e: React.DragEvent) => {
eatDragEvent(e);
// e.dataTransfer.dropEffect = 'copy';
// this makes sure we don't "transfer" (or move) the attachment, but we tell the sender we'll copy it
e.dataTransfer.dropEffect = 'copy';
}, [eatDragEvent]);
const handleOverlayDrop = React.useCallback(async (event: React.DragEvent) => {
@@ -254,7 +254,7 @@ export async function attachmentPerformConversion(attachment: Readonly<Attachmen
case 'rich-text-table':
let mdTable: string;
try {
mdTable = htmlTableToMarkdown(input.altData!);
mdTable = htmlTableToMarkdown(input.altData!, false);
} catch (error) {
// fallback to text/plain
mdTable = inputDataToString(input.data);
@@ -167,6 +167,8 @@ function explainErrorInMessage(text: string, isAssistant: boolean, modelId?: str
make sure the usage is under <Link noLinkStyle href='https://platform.openai.com/account/billing/limits' target='_blank'>the limits</Link>.
</>;
}
// else
// errorMessage = <>{text || 'Unknown error'}</>;
return { errorMessage, isAssistantError };
}
+2 -2
@@ -2,8 +2,8 @@ import { DLLMId } from '~/modules/llms/store-llms';
import { SystemPurposeId } from '../../../data';
import { autoSuggestions } from '~/modules/aifn/autosuggestions/autoSuggestions';
import { autoTitle } from '~/modules/aifn/autotitle/autoTitle';
import { llmStreamingChatGenerate } from '~/modules/llms/llm.client';
import { speakText } from '~/modules/elevenlabs/elevenlabs.client';
import { streamChat } from '~/modules/llms/transports/streamChat';
import { DMessage, useChatStore } from '~/common/state/store-chats';
@@ -63,7 +63,7 @@ async function streamAssistantMessage(
const messages = history.map(({ role, text }) => ({ role, content: text }));
try {
await streamChat(llmId, messages, abortSignal,
await llmStreamingChatGenerate(llmId, messages, null, null, abortSignal,
(updatedMessage: Partial<DMessage>) => {
// update the message in the store (and thus schedule a re-render)
editMessage(updatedMessage);
+3 -3
@@ -78,14 +78,14 @@ export function AppNews() {
{!!news && <Container disableGutters maxWidth='sm'>
{news?.map((ni, idx) => {
const firstCard = idx === 0;
// const firstCard = idx === 0;
const hasCardAfter = news.length < NewsItems.length;
const showExpander = hasCardAfter && (idx === news.length - 1);
const addPadding = false; //!firstCard; // || showExpander;
return <Card key={'news-' + idx} sx={{ mb: 2, minHeight: 32 }}>
<CardContent sx={{ position: 'relative', pr: addPadding ? 4 : 0 }}>
<Box sx={{ display: 'flex', alignItems: 'center', justifyContent: 'space-between', gap: 1 }}>
<GoodTooltip title={ni.versionName || null} placement='top-start'>
<Box sx={{ display: 'flex', alignItems: 'center', justifyContent: 'space-between', gap: 0 }}>
<GoodTooltip title={ni.versionName ? `${ni.versionName} ${ni.versionMoji || ''}` : null} placement='top-start'>
<Typography level='title-sm' component='div' sx={{ flexGrow: 1 }}>
{ni.text ? ni.text : ni.versionName ? `${ni.versionCode} · ${ni.versionName}` : `Version ${ni.versionCode}:`}
</Typography>
+28 -10
@@ -10,10 +10,10 @@ import { platformAwareKeystrokes } from '~/common/components/KeyStroke';
// update this variable every time you want to broadcast a new version to clients
export const incrementalVersion: number = 8;
export const incrementalVersion: number = 9;
const B = (props: { href?: string, children: React.ReactNode }) => {
const boldText = <Typography color={!!props.href ? 'primary' : 'warning'} sx={{ fontWeight: 600 }}>{props.children}</Typography>;
const boldText = <Typography color={!!props.href ? 'primary' : 'neutral'} sx={{ fontWeight: 600 }}>{props.children}</Typography>;
return props.href ?
<Link href={props.href + clientUtmSource()} target='_blank' sx={{ /*textDecoration: 'underline'*/ }}>{boldText} <LaunchIcon sx={{ ml: 1 }} /></Link> :
boldText;
@@ -27,11 +27,12 @@ const RIssues = `${OpenRepo}/issues`;
export const newsCallout =
<Card>
<CardContent sx={{ gap: 2 }}>
<Typography level='h4'>
<Typography level='title-lg'>
Open Roadmap
</Typography>
<Typography>
The roadmap is officially out. For the first time you get a look at what&apos;s brewing, up and coming, and get a chance to pick up cool features!
<Typography level='body-md'>
Take a peek at our roadmap to see what&apos;s in the pipeline.
Discover upcoming features and let us know what excites you the most!
</Typography>
<Grid container spacing={1}>
<Grid xs={12} sm={7}>
@@ -39,7 +40,7 @@ export const newsCallout =
fullWidth variant='soft' color='primary' endDecorator={<LaunchIcon />}
component={Link} href={OpenProject} noLinkStyle target='_blank'
>
Explore the Roadmap
Explore
</Button>
</Grid>
<Grid xs={12} sm={5} sx={{ display: 'flex', flexAlign: 'center', justifyContent: 'center' }}>
@@ -67,10 +68,27 @@ export const NewsItems: NewsItem[] = [
],
},*/
{
versionCode: '1.7.1',
versionCode: '1.8.0',
versionName: 'To The Moon And Back',
versionMoji: '🚀🌕🔙❤️',
versionDate: new Date('2023-12-20T09:30:00Z'),
items: [
{ text: <><B href={RIssues + '/275'}>Google Gemini</B> models support</> },
{ text: <><B href={RIssues + '/273'}>Mistral Platform</B> support</> },
{ text: <><B href={RIssues + '/270'}>Ollama chats</B> perfection</> },
{ text: <>Custom <B href={RIssues + '/280'}>diagrams instructions</B> (@joriskalz)</> },
{ text: <><B>Single-Tab</B> mode, enhances data integrity and prevents DB corruption</> },
{ text: <>Updated Ollama (v0.1.17) and OpenRouter models</> },
{ text: <>More: fixed shortcuts on Mac</> },
{ text: <><Link href='https://big-agi.com'>Website</Link>: official downloads</> },
{ text: <>Easier Vercel deployment, documented <Link href='https://github.com/enricoros/big-AGI/issues/276#issuecomment-1858591483'>network troubleshooting</Link></>, dev: true },
],
},
{
versionCode: '1.7.0',
versionName: 'Attachment Theory',
versionDate: new Date('2023-12-11T06:00:00Z'), // new Date().toISOString()
// versionDate: new Date('2023-12-10T12:00:00Z'), // 1.7.0
// versionDate: new Date('2023-12-11T06:00:00Z'), // 1.7.3
versionDate: new Date('2023-12-10T12:00:00Z'), // 1.7.0
items: [
{ text: <>Redesigned <B href={RIssues + '/251'}>attachments system</B>: drag, paste, link, snap, images, text, pdfs</> },
{ text: <>Desktop <B href={RIssues + '/253'}>webcam access</B> for direct image capture (Labs option)</> },
@@ -80,7 +98,6 @@ export const NewsItems: NewsItem[] = [
{ text: <>{platformAwareKeystrokes('Ctrl+Shift+O')}: quick access to model options</> },
{ text: <>Optimized voice input and performance</> },
{ text: <>Latest Ollama and Oobabooga models</> },
{ text: <>1.7.1: Improved <B href={RIssues + '/270'}>Ollama chats</B></> },
],
},
{
@@ -160,6 +177,7 @@ export const NewsItems: NewsItem[] = [
interface NewsItem {
versionCode: string;
versionName?: string;
versionMoji?: string;
versionDate?: Date;
text?: string | React.JSX.Element;
items?: {
+3 -4
@@ -1,14 +1,13 @@
import * as React from 'react';
import { shallow } from 'zustand/shallow';
import { useRouter } from 'next/router';
import { navigateToNews } from '~/common/app.routes';
import { useAppStateStore } from '~/common/state/store-appstate';
import { incrementalVersion } from './news.data';
export function useShowNewsOnUpdate() {
const { push: routerPush } = useRouter();
const { usageCount, lastSeenNewsVersion } = useAppStateStore(state => ({
usageCount: state.usageCount,
lastSeenNewsVersion: state.lastSeenNewsVersion,
@@ -17,9 +16,9 @@ export function useShowNewsOnUpdate() {
const isNewsOutdated = (lastSeenNewsVersion || 0) < incrementalVersion;
if (isNewsOutdated && usageCount > 2) {
// Disable for now
void routerPush('/news');
void navigateToNews();
}
}, [lastSeenNewsVersion, routerPush, usageCount]);
}, [lastSeenNewsVersion, usageCount]);
}
export function useMarkNewsAsSeen() {
+2 -2
@@ -1,7 +1,7 @@
import * as React from 'react';
import { DLLMId, useModelsStore } from '~/modules/llms/store-llms';
import { callChatGenerate, VChatMessageIn } from '~/modules/llms/transports/chatGenerate';
import { llmChatGenerateOrThrow, VChatMessageIn } from '~/modules/llms/llm.client';
export interface LLMChainStep {
@@ -80,7 +80,7 @@ export function useLLMChain(steps: LLMChainStep[], llmId: DLLMId | undefined, ch
_chainAbortController.signal.addEventListener('abort', globalToStepListener);
// LLM call
callChatGenerate(llmId, llmChatInput, chain.overrideResponseTokens)
llmChatGenerateOrThrow(llmId, llmChatInput, null, null, chain.overrideResponseTokens)
.then(({ content }) => {
stepDone = true;
if (!stepAbortController.signal.aborted)
+28 -3
@@ -7,21 +7,37 @@
import Router from 'next/router';
import type { DConversationId } from '~/common/state/store-chats';
import { isBrowser } from './util/pwaUtils';
export const ROUTE_INDEX = '/';
export const ROUTE_APP_CHAT = '/';
export const ROUTE_APP_LINK_CHAT = '/link/chat/:linkId';
export const ROUTE_APP_NEWS = '/news';
const ROUTE_CALLBACK_OPENROUTER = '/link/callback_openrouter';
export const getIndexLink = () => ROUTE_INDEX;
// Get Paths
export const getCallbackUrl = (source: 'openrouter') => {
const callbackUrl = new URL(window.location.href);
switch (source) {
case 'openrouter':
callbackUrl.pathname = ROUTE_CALLBACK_OPENROUTER;
break;
default:
throw new Error(`Unknown source: ${source}`);
}
return callbackUrl.toString();
};
export const getChatLinkRelativePath = (chatLinkId: string) => ROUTE_APP_LINK_CHAT.replace(':linkId', chatLinkId);
const navigateFn = (path: string) => (replace?: boolean): Promise<boolean> =>
Router[replace ? 'replace' : 'push'](path);
/// Simple Navigation
export const navigateToIndex = navigateFn(ROUTE_INDEX);
export const navigateToChat = async (conversationId?: DConversationId) => {
if (conversationId) {
await Router.push(
@@ -41,6 +57,15 @@ export const navigateToNews = navigateFn(ROUTE_APP_NEWS);
export const navigateBack = Router.back;
export const reloadPage = () => isBrowser && window.location.reload();
function navigateFn(path: string) {
return (replace?: boolean): Promise<boolean> => Router[replace ? 'replace' : 'push'](path);
}
/// Launch Apps
export interface AppCallQueryParams {
conversationId: string;
personaId: string;
+1 -1
@@ -23,7 +23,7 @@ export function GoodModal(props: {
const showBottomClose = !!props.onClose && props.hideBottomClose !== true;
return (
<Modal open={props.open} onClose={props.onClose}>
<ModalOverflow>
<ModalOverflow sx={{p:1}}>
<ModalDialog
sx={{
minWidth: { xs: 360, sm: 500, md: 600, lg: 700 },
@@ -0,0 +1,10 @@
import * as React from 'react';
import { SvgIcon } from '@mui/joy';
import { SxProps } from '@mui/joy/styles/types';
export function MistralIcon(props: { sx?: SxProps }) {
return <SvgIcon viewBox='0 0 24 24' width='24' height='24' strokeWidth={0} stroke='none' fill='currentColor' strokeLinecap='butt' strokeLinejoin='miter' {...props}>
<path d='m 2,2 v 4 4 V 14 v 4 4 h 4 v -4 -4 h 4 v 4 h 4 v -4 h 4 v 4 4 h 4 v -4 -4 -4 -4 V 2 h -4 v 4 h -4 v 4 h -4 v -4 H 6 V 2 Z' />
</SvgIcon>;
}
+9 -3
@@ -21,8 +21,13 @@ export const useGlobalShortcut = (shortcutKey: string | false, useCtrl: boolean,
if (!shortcutKey) return;
const lcShortcut = shortcutKey.toLowerCase();
const handleKeyDown = (event: KeyboardEvent) => {
if ((useCtrl === event.ctrlKey) && (useShift === event.shiftKey) && (useAlt === event.altKey)
&& event.key.toLowerCase() === lcShortcut) {
const isCtrlOrCmd = (event.ctrlKey && !event.metaKey) || (event.metaKey && !event.ctrlKey);
if (
(useCtrl === isCtrlOrCmd) &&
(useShift === event.shiftKey) &&
(useAlt === event.altKey) &&
event.key.toLowerCase() === lcShortcut
) {
event.preventDefault();
event.stopPropagation();
callback();
@@ -46,9 +51,10 @@ export const useGlobalShortcuts = (shortcuts: GlobalShortcutItem[]) => {
React.useEffect(() => {
const handleKeyDown = (event: KeyboardEvent) => {
for (const [key, useCtrl, useShift, useAlt, action] of shortcuts) {
const isCtrlOrCmd = (event.ctrlKey && !event.metaKey) || (event.metaKey && !event.ctrlKey);
if (
key &&
(useCtrl === event.ctrlKey) &&
(useCtrl === isCtrlOrCmd) &&
(useShift === event.shiftKey) &&
(useAlt === event.altKey) &&
event.key.toLowerCase() === key.toLowerCase()
@@ -0,0 +1,95 @@
import * as React from 'react';
/**
* The AloneDetector class checks if the current client is the only one present for a given app. It uses
* BroadcastChannel to talk to other clients. If no other clients reply within a short time, it assumes it's
* the only one and tells the caller.
*/
class AloneDetector {
private readonly clientId: string;
private readonly broadcastChannel: BroadcastChannel;
private aloneCallback: ((isAlone: boolean) => void) | null;
private aloneTimerId: number | undefined;
constructor(channelName: string, onAlone: (isAlone: boolean) => void) {
this.clientId = Math.random().toString(36).substring(2, 10);
this.aloneCallback = onAlone;
this.broadcastChannel = new BroadcastChannel(channelName);
this.broadcastChannel.onmessage = this.handleIncomingMessage;
}
public onUnmount(): void {
// close channel
this.broadcastChannel.onmessage = null;
this.broadcastChannel.close();
// clear timeout
if (this.aloneTimerId)
clearTimeout(this.aloneTimerId);
this.aloneTimerId = undefined;
this.aloneCallback = null;
}
public checkIfAlone(): void {
// triggers other clients
this.broadcastChannel.postMessage({ type: 'CHECK', sender: this.clientId });
// if no response within 500ms, assume this client is alone
this.aloneTimerId = window.setTimeout(() => {
this.aloneTimerId = undefined;
this.aloneCallback?.(true);
}, 500);
}
private handleIncomingMessage = (event: MessageEvent): void => {
// ignore self messages
if (event.data.sender === this.clientId) return;
switch (event.data.type) {
case 'CHECK':
this.broadcastChannel.postMessage({ type: 'ALIVE', sender: this.clientId });
break;
case 'ALIVE':
// received an ALIVE message, tell the client they're not alone
if (this.aloneTimerId) {
clearTimeout(this.aloneTimerId);
this.aloneTimerId = undefined;
}
this.aloneCallback?.(false);
this.aloneCallback = null;
break;
}
};
}
/**
* React hook that checks whether the current tab is the only one open for a specific channel.
*
* @param {string} channelName - The name of the BroadcastChannel to communicate on.
* @returns {boolean | null} - True if the current tab is alone, false if not, or null before the check completes.
*/
export function useSingleTabEnforcer(channelName: string): boolean | null {
const [isAlone, setIsAlone] = React.useState<boolean | null>(null);
React.useEffect(() => {
const tabManager = new AloneDetector(channelName, setIsAlone);
tabManager.checkIfAlone();
return () => {
tabManager.onUnmount();
};
}, [channelName]);
return isAlone;
}
+14 -8
@@ -9,6 +9,7 @@ export type DropdownItems = Record<string, {
title: string,
symbol?: string,
type?: 'separator'
icon?: React.ReactNode,
}>;
@@ -71,20 +72,25 @@ export function AppBarDropdown<TValue extends string>(props: {
{!!props.prependOption && Object.keys(props.items).length >= 1 && <Divider />}
<Box sx={{ overflowY: 'auto' }}>
{Object.keys(props.items).map((key: string, idx: number) => <React.Fragment key={'key-' + idx}>
{props.items[key].type === 'separator'
? <ListDivider />
: <Option value={key} sx={{ whiteSpace: 'nowrap' }}>
{props.showSymbols && <ListItemDecorator sx={{ fontSize: 'xl' }}>{props.items[key]?.symbol + ' '}</ListItemDecorator>}
{props.items[key].title}
{Object.keys(props.items).map((key: string, idx: number) => {
const item = props.items[key];
if (item.type === 'separator')
return <ListDivider key={'key-' + idx} />;
return (
<Option key={'key-' + idx} value={key} sx={{ whiteSpace: 'nowrap' }}>
{props.showSymbols && <ListItemDecorator sx={{ fontSize: 'xl' }}>{item?.symbol + ' '}</ListItemDecorator>}
{props.showSymbols && !!item.icon && <ListItemDecorator>{item?.icon}</ListItemDecorator>}
{item.title}
{/*{key === props.value && (*/}
{/* <IconButton variant='soft' onClick={() => alert('aa')} sx={{ ml: 'auto' }}>*/}
{/* <SettingsIcon color='success' />*/}
{/* </IconButton>*/}
{/*)}*/}
</Option>
}
</React.Fragment>)}
);
})}
</Box>
{!!props.appendOption && Object.keys(props.items).length >= 1 && <ListDivider />}
+1 -1
@@ -3,7 +3,7 @@ import { shallow } from 'zustand/shallow';
import { Box, Container } from '@mui/joy';
import { ModelsModal } from '../../apps/models-modal/ModelsModal';
import { ModelsModal } from '~/modules/llms/models-modal/ModelsModal';
import { SettingsModal } from '../../apps/settings-modal/SettingsModal';
import { ShortcutsModal } from '../../apps/settings-modal/ShortcutsModal';
+42
@@ -0,0 +1,42 @@
import * as React from 'react';
import { Button, Sheet, Typography } from '@mui/joy';
import { Brand } from '../app.config';
import { reloadPage } from '../app.routes';
import { useSingleTabEnforcer } from '../components/useSingleTabEnforcer';
export const ProviderSingleTab = (props: { children: React.ReactNode }) => {
// state
const isSingleTab = useSingleTabEnforcer('big-agi-tabs');
// pass-through until we know for sure that other tabs are open
if (isSingleTab === null || isSingleTab)
return props.children;
return (
<Sheet
variant='solid'
invertedColors
sx={{
flexGrow: 1,
display: 'flex', flexDirection: { xs: 'column', md: 'row' }, justifyContent: 'center', alignItems: 'center', gap: 2,
p: 3,
}}
>
<Typography>
It looks like {Brand.Title.Base} is already running in another tab or window.
To continue here, please close the other instance first.
</Typography>
<Button onClick={reloadPage}>
Reload
</Button>
</Sheet>
);
};
+40 -5
@@ -2,11 +2,13 @@
* @fileoverview Utility functions for Markdown.
*/
import { isBrowser } from '~/common/util/pwaUtils';
/**
* Quick and dirty conversion of HTML tables to Markdown tables.
* Big plus: doesn't require any dependencies.
*/
export function htmlTableToMarkdown(html: string): string {
export function htmlTableToMarkdown(html: string, includeInvisible: boolean): string {
const parser = new DOMParser();
const doc = parser.parseFromString(html, 'text/html');
const table = doc.querySelector('table');
@@ -16,20 +18,53 @@ export function htmlTableToMarkdown(html: string): string {
const headerCells = table.querySelectorAll('thead th');
if (headerCells.length > 0) {
const headerRow = '| ' + Array.from(headerCells)
.map(cell => cell.textContent?.trim() || '')
.join(' | ') + '| ';
.map(cell => getTextWithSpaces(cell, includeInvisible).trim())
.join(' | ') + ' |';
markdownRows.push(headerRow);
markdownRows.push('|:' + Array(headerCells.length).fill('-').join('|:') + '|');
markdownRows.push('|:' + Array(headerCells.length).fill('---').join('|:') + '|');
}
const bodyRows = table.querySelectorAll('tbody tr');
for (const row of Array.from(bodyRows)) {
const rowCells = row.querySelectorAll('td');
const markdownRow = '| ' + Array.from(rowCells)
.map(cell => cell.textContent?.trim() || '')
.map(cell => getTextWithSpaces(cell, includeInvisible).trim())
.join(' | ') + ' |';
markdownRows.push(markdownRow);
}
return markdownRows.join('\n');
}
// Helper function to get text with spaces, ignoring hidden elements
function getTextWithSpaces(node: Node, includeInvisible: boolean): string {
let text = '';
node.childNodes.forEach(child => {
if (child.nodeType === Node.TEXT_NODE)
text += child.textContent;
else if (child.nodeType === Node.ELEMENT_NODE)
if (includeInvisible || isVisible(child as Element))
text += ' ' + getTextWithSpaces(child, includeInvisible) + ' ';
});
return text;
}
// Helper function to determine if an element is visible
function isVisible(element: Element): boolean {
if (!isBrowser) return true;
// if the cell is hidden, don't include it
const style = window.getComputedStyle(element);
if (style.display === 'none' || style.visibility === 'hidden')
return false;
// Check for common classes used to hide content or indicate tooltip/popover content.
// You may need to add more classes here based on your actual HTML/CSS.
const ignoredClasses = ['hidden', 'group-hover', 'tooltip', 'pointer-events-none', 'opacity-0'];
for (const ignoredClass of ignoredClasses)
if (element.classList.contains(ignoredClass))
return false;
// Otherwise, the element is considered visible
return true;
}
+1 -1
@@ -14,7 +14,7 @@ export async function pdfToText(pdfBuffer: ArrayBuffer): Promise<string> {
const { getDocument, GlobalWorkerOptions } = await import('pdfjs-dist');
// Set the worker script path
GlobalWorkerOptions.workerSrc = '/workers/pdf.worker.min.js';
GlobalWorkerOptions.workerSrc = '/workers/pdf.worker.min.mjs';
const pdf = await getDocument(pdfBuffer).promise;
const textPages: string[] = []; // Initialize an array to hold text from all pages
@@ -1,4 +1,4 @@
import { callChatGenerateWithFunctions, VChatFunctionIn } from '~/modules/llms/transports/chatGenerate';
import { llmChatGenerateOrThrow, VChatFunctionIn } from '~/modules/llms/llm.client';
import { useModelsStore } from '~/modules/llms/store-llms';
import { useChatStore } from '~/common/state/store-chats';
@@ -71,7 +71,7 @@ export function autoSuggestions(conversationId: string, assistantMessageId: stri
// Follow-up: Question
if (suggestQuestions) {
// callChatGenerateWithFunctions(funcLLMId, [
// llmChatGenerateOrThrow(funcLLMId, [
// { role: 'system', content: systemMessage.text },
// { role: 'user', content: userMessage.text },
// { role: 'assistant', content: assistantMessageText },
@@ -83,15 +83,18 @@ export function autoSuggestions(conversationId: string, assistantMessageId: stri
// Follow-up: Auto-Diagrams
if (suggestDiagrams) {
void callChatGenerateWithFunctions(funcLLMId, [
void llmChatGenerateOrThrow(funcLLMId, [
{ role: 'system', content: systemMessage.text },
{ role: 'user', content: userMessage.text },
{ role: 'assistant', content: assistantMessageText },
], [suggestPlantUMLFn], 'draw_plantuml_diagram',
).then(chatResponse => {
if (!('function_arguments' in chatResponse))
return;
// parse the output PlantUML string, if any
const functionArguments = chatResponse?.function_arguments ?? null;
const functionArguments = chatResponse.function_arguments ?? null;
if (functionArguments) {
const { code, type }: { code: string, type: string } = functionArguments as any;
if (code && type) {
@@ -105,6 +108,8 @@ export function autoSuggestions(conversationId: string, assistantMessageId: stri
editMessage(conversationId, assistantMessageId, { text: assistantMessageText }, false);
}
}
}).catch(err => {
console.error('autoSuggestions::diagram:', err);
});
}
+3 -3
@@ -1,4 +1,4 @@
import { callChatGenerate } from '~/modules/llms/transports/chatGenerate';
import { llmChatGenerateOrThrow } from '~/modules/llms/llm.client';
import { useModelsStore } from '~/modules/llms/store-llms';
import { useChatStore } from '~/common/state/store-chats';
@@ -27,7 +27,7 @@ export function autoTitle(conversationId: string) {
});
// LLM
void callChatGenerate(fastLLMId, [
void llmChatGenerateOrThrow(fastLLMId, [
{ role: 'system', content: `You are an AI conversation titles assistant who specializes in creating expressive yet few-words chat titles.` },
{
role: 'user', content:
@@ -39,7 +39,7 @@ export function autoTitle(conversationId: string) {
historyLines.join('\n') +
'```\n',
},
]).then(chatResponse => {
], null, null).then(chatResponse => {
const title = chatResponse?.content
?.trim()
+13 -5
@@ -1,6 +1,6 @@
import * as React from 'react';
import { Box, Button, ButtonGroup, CircularProgress, Divider, Grid, IconButton } from '@mui/joy';
import { Box, Button, ButtonGroup, CircularProgress, Divider, Grid, IconButton, Input, FormControl, FormLabel } from '@mui/joy';
import AccountTreeIcon from '@mui/icons-material/AccountTree';
import ExpandLessIcon from '@mui/icons-material/ExpandLess';
import ExpandMoreIcon from '@mui/icons-material/ExpandMore';
@@ -8,8 +8,9 @@ import ReplayIcon from '@mui/icons-material/Replay';
import StopOutlinedIcon from '@mui/icons-material/StopOutlined';
import TelegramIcon from '@mui/icons-material/Telegram';
import { llmStreamingChatGenerate } from '~/modules/llms/llm.client';
import { ChatMessage } from '../../../apps/chat/components/message/ChatMessage';
import { streamChat } from '~/modules/llms/transports/streamChat';
import { GoodModal } from '~/common/components/GoodModal';
import { InlineError } from '~/common/components/InlineError';
@@ -48,6 +49,7 @@ export function DiagramsModal(props: { config: DiagramConfig, onClose: () => voi
const [message, setMessage] = React.useState<DMessage | null>(null);
const [diagramType, diagramComponent] = useFormRadio<DiagramType>('auto', diagramTypes, 'Visualize');
const [diagramLanguage, languageComponent] = useFormRadio<DiagramLanguage>('plantuml', diagramLanguages, 'Style');
const [customInstruction, setCustomInstruction] = React.useState<string>('');
const [errorMessage, setErrorMessage] = React.useState<string | null>(null);
const [abortController, setAbortController] = React.useState<AbortController | null>(null);
@@ -81,10 +83,10 @@ export function DiagramsModal(props: { config: DiagramConfig, onClose: () => voi
const stepAbortController = new AbortController();
setAbortController(stepAbortController);
const diagramPrompt = bigDiagramPrompt(diagramType, diagramLanguage, systemMessage.text, subject);
const diagramPrompt = bigDiagramPrompt(diagramType, diagramLanguage, systemMessage.text, subject, customInstruction);
try {
await streamChat(diagramLlm.id, diagramPrompt, stepAbortController.signal,
await llmStreamingChatGenerate(diagramLlm.id, diagramPrompt, null, null, stepAbortController.signal,
(update: Partial<{ text: string, typing: boolean, originLLM: string }>) => {
assistantMessage = { ...assistantMessage, ...update };
setMessage(assistantMessage);
@@ -103,7 +105,7 @@ export function DiagramsModal(props: { config: DiagramConfig, onClose: () => voi
setAbortController(null);
}
}, [abortController, conversationId, diagramLanguage, diagramLlm, diagramType, subject]);
}, [abortController, conversationId, diagramLanguage, diagramLlm, diagramType, subject, customInstruction]);
// [Effect] Auto-abort on unmount
@@ -149,6 +151,12 @@ export function DiagramsModal(props: { config: DiagramConfig, onClose: () => voi
<Grid xs={12} xl={6}>
{llmComponent}
</Grid>
<Grid xs={12} md={6}>
<FormControl>
<FormLabel>Custom Instruction</FormLabel>
<Input title="Custom Instruction" placeholder='e.g. visualize as state' value={customInstruction} onChange={(e) => setCustomInstruction(e.target.value)} />
</FormControl>
</Grid>
</Grid>
)}
+6 -4
@@ -1,6 +1,5 @@
import type { VChatMessageIn } from '~/modules/llms/transports/chatGenerate';
import type { FormRadioOption } from '~/common/components/forms/FormRadioControl';
import type { VChatMessageIn } from '~/modules/llms/llm.client';
export type DiagramType = 'auto' | 'mind';
@@ -60,12 +59,15 @@ function plantumlDiagramPrompt(diagramType: DiagramType): { sys: string, usr: st
}
}
export function bigDiagramPrompt(diagramType: DiagramType, diagramLanguage: DiagramLanguage, chatSystemPrompt: string, subject: string): VChatMessageIn[] {
export function bigDiagramPrompt(diagramType: DiagramType, diagramLanguage: DiagramLanguage, chatSystemPrompt: string, subject: string, customInstruction: string): VChatMessageIn[] {
const { sys, usr } = diagramLanguage === 'mermaid' ? mermaidDiagramPrompt(diagramType) : plantumlDiagramPrompt(diagramType);
if (customInstruction) {
customInstruction = 'Also consider the following instructions: ' + customInstruction;
}
return [
{ role: 'system', content: sys },
{ role: 'system', content: chatSystemPrompt },
{ role: 'assistant', content: subject },
{ role: 'user', content: usr },
{ role: 'user', content: `${usr} ${customInstruction}` },
];
}
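// Sketch (illustrative, not part of these commits): what bigDiagramPrompt now returns
// when a custom instruction is set (`sys` and `usr` come from the language-specific prompt):
//   bigDiagramPrompt('auto', 'plantuml', chatSystemPrompt, subject, 'visualize as state')
//   -> [
//        { role: 'system',    content: sys },
//        { role: 'system',    content: chatSystemPrompt },
//        { role: 'assistant', content: subject },
//        { role: 'user',      content: `${usr} Also consider the following instructions: visualize as state` },
//      ]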
@@ -1,4 +1,4 @@
import { callChatGenerate } from '~/modules/llms/transports/chatGenerate';
import { llmChatGenerateOrThrow } from '~/modules/llms/llm.client';
import { useModelsStore } from '~/modules/llms/store-llms';
@@ -14,10 +14,10 @@ export async function imaginePromptFromText(messageText: string): Promise<string
const { fastLLMId } = useModelsStore.getState();
if (!fastLLMId) return null;
try {
const chatResponse = await callChatGenerate(fastLLMId, [
const chatResponse = await llmChatGenerateOrThrow(fastLLMId, [
{ role: 'system', content: simpleImagineSystemPrompt },
{ role: 'user', content: 'Write a prompt, based on the following input.\n\n```\n' + messageText.slice(0, 1000) + '\n```\n' },
]);
], null, null);
return chatResponse.content?.trim() ?? null;
} catch (error: any) {
console.error('imaginePromptFromText: fetch request error:', error);
+2 -2
@@ -5,7 +5,7 @@
import { DLLMId } from '~/modules/llms/store-llms';
import { callApiSearchGoogle } from '~/modules/google/search.client';
import { callBrowseFetchPage } from '~/modules/browse/browse.client';
import { callChatGenerate, VChatMessageIn } from '~/modules/llms/transports/chatGenerate';
import { llmChatGenerateOrThrow, VChatMessageIn } from '~/modules/llms/llm.client';
// prompt to implement the ReAct paradigm: https://arxiv.org/abs/2210.03629
@@ -128,7 +128,7 @@ export class Agent {
S.messages.push({ role: 'user', content: prompt });
let content: string;
try {
content = (await callChatGenerate(llmId, S.messages, 500)).content;
content = (await llmChatGenerateOrThrow(llmId, S.messages, null, null, 500)).content;
} catch (error: any) {
content = `Error in callChat: ${error}`;
}
+3 -3
@@ -1,5 +1,5 @@
import { DLLMId, findLLMOrThrow } from '~/modules/llms/store-llms';
import { callChatGenerate } from '~/modules/llms/transports/chatGenerate';
import { llmChatGenerateOrThrow } from '~/modules/llms/llm.client';
// prompt to be tried when doing recursive summarization.
@@ -80,10 +80,10 @@ async function cleanUpContent(chunk: string, llmId: DLLMId, _ignored_was_targetW
const autoResponseTokensSize = Math.floor(contextTokens * outputTokenShare);
try {
const chatResponse = await callChatGenerate(llmId, [
const chatResponse = await llmChatGenerateOrThrow(llmId, [
{ role: 'system', content: cleanupPrompt },
{ role: 'user', content: chunk },
], autoResponseTokensSize);
], null, null, autoResponseTokensSize);
return chatResponse?.content ?? '';
} catch (error: any) {
return '';
+2 -3
@@ -1,8 +1,7 @@
import * as React from 'react';
import type { DLLMId } from '~/modules/llms/store-llms';
import type { VChatMessageIn } from '~/modules/llms/transports/chatGenerate';
import { streamChat } from '~/modules/llms/transports/streamChat';
import { llmStreamingChatGenerate, VChatMessageIn } from '~/modules/llms/llm.client';
export function useStreamChatText() {
@@ -25,7 +24,7 @@ export function useStreamChatText() {
try {
let lastText = '';
await streamChat(llmId, prompt, abortControllerRef.current.signal, (update) => {
await llmStreamingChatGenerate(llmId, prompt, null, null, abortControllerRef.current.signal, (update) => {
if (update.text) {
lastText = update.text;
setPartialText(lastText);
+21 -1
@@ -1,5 +1,10 @@
import { z } from 'zod';
import type { BackendCapabilities } from '~/modules/backend/state-backend';
import { createTRPCRouter, publicProcedure } from '~/server/api/trpc.server';
import { env } from '~/server/env.mjs';
import { fetchJsonOrTRPCError } from '~/server/api/trpc.serverutils';
import { analyticsListCapabilities } from './backend.analytics';
@@ -23,11 +28,26 @@ export const backendRouter = createTRPCRouter({
hasImagingProdia: !!env.PRODIA_API_KEY,
hasLlmAnthropic: !!env.ANTHROPIC_API_KEY,
hasLlmAzureOpenAI: !!env.AZURE_OPENAI_API_KEY && !!env.AZURE_OPENAI_API_ENDPOINT,
hasLlmGemini: !!env.GEMINI_API_KEY,
hasLlmMistral: !!env.MISTRAL_API_KEY,
hasLlmOllama: !!env.OLLAMA_API_HOST,
hasLlmOpenAI: !!env.OPENAI_API_KEY || !!env.OPENAI_API_HOST,
hasLlmOpenRouter: !!env.OPENROUTER_API_KEY,
hasVoiceElevenLabs: !!env.ELEVENLABS_API_KEY,
};
} satisfies BackendCapabilities;
}),
// The following are used for various OAuth integrations
/* Exchange the OpenRouter 'code' (from PKCE) for an OpenRouter API Key */
exchangeOpenRouterKey: publicProcedure
.input(z.object({ code: z.string() }))
.query(async ({ input }) => {
// Documented here: https://openrouter.ai/docs#oauth
return await fetchJsonOrTRPCError<{ key: string }, { code: string }>('https://openrouter.ai/api/v1/auth/keys', 'POST', {}, {
code: input.code,
}, 'Backend.exchangeOpenRouterKey');
}),
});
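// Sketch (illustrative, not part of these commits): the raw exchange the procedure
// above wraps, per https://openrouter.ai/docs#oauth - a plain-fetch equivalent; the
// JSON content-type header is an assumption (fetchJsonOrTRPCError handles this internally).
async function exchangeOpenRouterCode(code: string): Promise<string> {
  const response = await fetch('https://openrouter.ai/api/v1/auth/keys', {
    method: 'POST',
    headers: { 'Content-Type': 'application/json' },
    body: JSON.stringify({ code }),
  });
  if (!response.ok)
    throw new Error(`OpenRouter key exchange failed: ${response.status}`);
  // the endpoint answers with the newly minted API key
  const { key } = await response.json() as { key: string };
  return key;
}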
+4
@@ -9,6 +9,8 @@ export interface BackendCapabilities {
hasImagingProdia: boolean;
hasLlmAnthropic: boolean;
hasLlmAzureOpenAI: boolean;
hasLlmGemini: boolean;
hasLlmMistral: boolean;
hasLlmOllama: boolean;
hasLlmOpenAI: boolean;
hasLlmOpenRouter: boolean;
@@ -30,6 +32,8 @@ const useBackendStore = create<BackendStore>()(
hasImagingProdia: false,
hasLlmAnthropic: false,
hasLlmAzureOpenAI: false,
hasLlmGemini: false,
hasLlmMistral: false,
hasLlmOllama: false,
hasLlmOpenAI: false,
hasLlmOpenRouter: false,
+1 -1
@@ -1,4 +1,4 @@
import create from 'zustand';
import { create } from 'zustand';
import { persist } from 'zustand/middleware';
import { CapabilityBrowsing } from '~/common/components/useCapabilities';
+74
@@ -0,0 +1,74 @@
import type { DLLMId } from './store-llms';
import type { OpenAIWire } from './server/openai/openai.wiretypes';
import { findVendorForLlmOrThrow } from './vendors/vendors.registry';
// LLM Client Types
// NOTE: Model List types in '../server/llm.server.types';
export interface VChatMessageIn {
role: 'assistant' | 'system' | 'user'; // | 'function';
content: string;
//name?: string; // when role: 'function'
}
export type VChatFunctionIn = OpenAIWire.ChatCompletion.RequestFunctionDef;
export interface VChatMessageOut {
role: 'assistant' | 'system' | 'user';
content: string;
finish_reason: 'stop' | 'length' | null;
}
export interface VChatMessageOrFunctionCallOut extends VChatMessageOut {
function_name: string;
function_arguments: object | null;
}
// LLM Client Functions
export async function llmChatGenerateOrThrow<TSourceSetup = unknown, TAccess = unknown, TLLMOptions = unknown>(
llmId: DLLMId,
messages: VChatMessageIn[],
functions: VChatFunctionIn[] | null, forceFunctionName: string | null,
maxTokens?: number,
): Promise<VChatMessageOut | VChatMessageOrFunctionCallOut> {
// id to DLLM and vendor
const { llm, vendor } = findVendorForLlmOrThrow<TSourceSetup, TAccess, TLLMOptions>(llmId);
// FIXME: relax the forced cast
const options = llm.options as TLLMOptions;
// get the access
const partialSourceSetup = llm._source.setup;
const access = vendor.getTransportAccess(partialSourceSetup);
// execute via the vendor
return await vendor.rpcChatGenerateOrThrow(access, options, messages, functions, forceFunctionName, maxTokens);
}
export async function llmStreamingChatGenerate<TSourceSetup = unknown, TAccess = unknown, TLLMOptions = unknown>(
llmId: DLLMId,
messages: VChatMessageIn[],
functions: VChatFunctionIn[] | null,
forceFunctionName: string | null,
abortSignal: AbortSignal,
onUpdate: (update: Partial<{ text: string, typing: boolean, originLLM: string }>, done: boolean) => void,
): Promise<void> {
// id to DLLM and vendor
const { llm, vendor } = findVendorForLlmOrThrow<TSourceSetup, TAccess, TLLMOptions>(llmId);
// FIXME: relax the forced cast
const llmOptions = llm.options as TLLMOptions;
// get the access
const partialSourceSetup = llm._source.setup;
const access = vendor.getTransportAccess(partialSourceSetup); // as ChatStreamInputSchema['access'];
// execute via the vendor
return await vendor.streamingChatGenerateOrThrow(access, llmId, llmOptions, messages, functions, forceFunctionName, abortSignal, onUpdate);
}
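// Sketch (illustrative, not part of these commits): driving the two client entry
// points above; the llmId and prompt are placeholders.
async function demoLlmClient(llmId: DLLMId) {
  // one-shot generation: no functions, no forced function name
  const reply = await llmChatGenerateOrThrow(llmId, [{ role: 'user', content: 'Say hello' }], null, null);
  console.log(reply.content);
  // streaming generation: collect the partial text as it arrives
  const abortController = new AbortController();
  let text = '';
  await llmStreamingChatGenerate(llmId, [{ role: 'user', content: 'Say hello' }], null, null, abortController.signal,
    (update) => {
      if (update.text) text = update.text;
    });
  console.log(text);
}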
@@ -7,7 +7,7 @@ import VisibilityIcon from '@mui/icons-material/Visibility';
import VisibilityOffIcon from '@mui/icons-material/VisibilityOff';
import { DLLMId, useModelsStore } from '~/modules/llms/store-llms';
import { findVendorById } from '~/modules/llms/vendors/vendor.registry';
import { findVendorById } from '~/modules/llms/vendors/vendors.registry';
import { FormLabelStart } from '~/common/components/forms/FormLabelStart';
import { GoodModal } from '~/common/components/GoodModal';
@@ -117,9 +117,9 @@ export function LLMOptionsModal(props: { id: DLLMId }) {
<FormLabelStart title='Details' sx={{ minWidth: 80 }} onClick={() => setShowDetails(!showDetails)} />
{showDetails && <Typography level='body-sm' sx={{ display: 'block' }}>
[{llm.id}]: {llm.options.llmRef && `${llm.options.llmRef} · `}
{llm.contextTokens && `context tokens: ${llm.contextTokens.toLocaleString()} · `}
{llm.maxOutputTokens && `max output tokens: ${llm.maxOutputTokens.toLocaleString()} · `}
{llm.created && `created: ${(new Date(llm.created * 1000)).toLocaleString()} · `}
{!!llm.contextTokens && `context tokens: ${llm.contextTokens.toLocaleString()} · `}
{!!llm.maxOutputTokens && `max output tokens: ${llm.maxOutputTokens.toLocaleString()} · `}
{!!llm.created && `created: ${(new Date(llm.created * 1000)).toLocaleString()} · `}
description: {llm.description}
{/*· tags: {llm.tags.join(', ')}*/}
</Typography>}
@@ -7,7 +7,7 @@ import VisibilityOffOutlinedIcon from '@mui/icons-material/VisibilityOffOutlined
import { DLLM, DModelSourceId, useModelsStore } from '~/modules/llms/store-llms';
import { IModelVendor } from '~/modules/llms/vendors/IModelVendor';
import { findVendorById } from '~/modules/llms/vendors/vendor.registry';
import { findVendorById } from '~/modules/llms/vendors/vendors.registry';
import { GoodTooltip } from '~/common/components/GoodTooltip';
import { openLayoutLLMOptions } from '~/common/layout/store-applayout';
@@ -109,8 +109,15 @@ export function ModelsList(props: {
<List variant='soft' size='sm' sx={{
borderRadius: 'sm',
pl: { xs: 0, md: 1 },
overflowY: 'auto',
}}>
{items}
{items.length > 0 ? items : (
<ListItem>
<Typography level='body-sm'>
Please configure the service and update the list of models.
</Typography>
</ListItem>
)}
</List>
);
}
@@ -4,7 +4,7 @@ import { shallow } from 'zustand/shallow';
import { Box, Checkbox, Divider } from '@mui/joy';
import { DModelSource, DModelSourceId, useModelsStore } from '~/modules/llms/store-llms';
import { createModelSourceForDefaultVendor, findVendorById } from '~/modules/llms/vendors/vendor.registry';
import { createModelSourceForDefaultVendor, findVendorById } from '~/modules/llms/vendors/vendors.registry';
import { GoodModal } from '~/common/components/GoodModal';
import { closeLayoutModelsSetup, openLayoutModelsSetup, useLayoutModelsSetup } from '~/common/layout/store-applayout';
@@ -65,7 +65,7 @@ export function ModelsModal(props: { suspendAutoModelsSetup?: boolean }) {
title={<>Configure <b>AI Models</b></>}
startButton={
multiSource ? <Checkbox
label='all vendors' sx={{ my: 'auto' }}
label='All Services' sx={{ my: 'auto' }}
checked={showAllSources} onChange={() => setShowAllSources(all => !all)}
/> : undefined
}
@@ -5,9 +5,9 @@ import { Avatar, Badge, Box, Button, IconButton, ListItemDecorator, MenuItem, Op
import AddIcon from '@mui/icons-material/Add';
import DeleteOutlineIcon from '@mui/icons-material/DeleteOutline';
import { type DModelSourceId, useModelsStore } from '~/modules/llms/store-llms';
import { type IModelVendor, type ModelVendorId } from '~/modules/llms/vendors/IModelVendor';
import { createModelSourceForVendor, findAllVendors, findVendorById } from '~/modules/llms/vendors/vendor.registry';
import type { IModelVendor } from '~/modules/llms/vendors/IModelVendor';
import { DModelSourceId, useModelsStore } from '~/modules/llms/store-llms';
import { createModelSourceForVendor, findAllVendors, findVendorById, ModelVendorId } from '~/modules/llms/vendors/vendors.registry';
import { CloseableMenu } from '~/common/components/CloseableMenu';
import { ConfirmationModal } from '~/common/components/ConfirmationModal';
@@ -29,7 +29,7 @@ function vendorIcon(vendor: IModelVendor | null, greenMark: boolean) {
icon = <vendor.Icon />;
}
return (greenMark && icon)
? <Badge color='primary' size='sm' badgeContent=''>{icon}</Badge>
? <Badge color='success' size='sm' badgeContent=''>{icon}</Badge>
: icon;
}
@@ -92,7 +92,11 @@ export function ModelsSourceSelector(props: {
<ListItemDecorator>
{vendorIcon(vendor, !!vendor.hasBackendCap && vendor.hasBackendCap())}
</ListItemDecorator>
{vendor.name}{/*{sourceCount > 0 && ` (added)`}*/}
{vendor.name}
{/*{sourceCount > 0 && ` (added)`}*/}
{!!vendor.hasFreeModels && ` 🎁`}
{/*{!!vendor.instanceLimit && ` (${sourceCount}/${vendor.instanceLimit})`}*/}
{vendor.location === 'local' && <span style={{ opacity: 0.5 }}>local</span>}
</MenuItem>
),
};
@@ -1,6 +1,6 @@
import type { ModelDescriptionSchema } from '../server.schemas';
import type { ModelDescriptionSchema } from '../llm.server.types';
import { LLM_IF_OAI_Chat } from '../../../store-llms';
import { LLM_IF_OAI_Chat } from '../../store-llms';
const roundTime = (date: string) => Math.round(new Date(date).getTime() / 1000);
@@ -6,7 +6,7 @@ import { env } from '~/server/env.mjs';
import { fetchJsonOrTRPCError } from '~/server/api/trpc.serverutils';
import { fixupHost, openAIChatGenerateOutputSchema, OpenAIHistorySchema, openAIHistorySchema, OpenAIModelSchema, openAIModelSchema } from '../openai/openai.router';
import { listModelsOutputSchema } from '../server.schemas';
import { listModelsOutputSchema } from '../llm.server.types';
import { AnthropicWire } from './anthropic.wiretypes';
import { hardcodedAnthropicModels } from './anthropic.models';
@@ -0,0 +1,216 @@
import { z } from 'zod';
import { TRPCError } from '@trpc/server';
import { env } from '~/server/env.mjs';
import packageJson from '../../../../../package.json';
import { createTRPCRouter, publicProcedure } from '~/server/api/trpc.server';
import { fetchJsonOrTRPCError } from '~/server/api/trpc.serverutils';
import { LLM_IF_OAI_Chat, LLM_IF_OAI_Vision } from '../../store-llms';
import { listModelsOutputSchema, ModelDescriptionSchema } from '../llm.server.types';
import { fixupHost, openAIChatGenerateOutputSchema, OpenAIHistorySchema, openAIHistorySchema, OpenAIModelSchema, openAIModelSchema } from '../openai/openai.router';
import { GeminiBlockSafetyLevel, geminiBlockSafetyLevelSchema, GeminiContentSchema, GeminiGenerateContentRequest, geminiGeneratedContentResponseSchema, geminiModelsGenerateContentPath, geminiModelsListOutputSchema, geminiModelsListPath } from './gemini.wiretypes';
// Default hosts
const DEFAULT_GEMINI_HOST = 'https://generativelanguage.googleapis.com';
// Mappers
export function geminiAccess(access: GeminiAccessSchema, modelRefId: string | null, apiPath: string): { headers: HeadersInit, url: string } {
const geminiKey = access.geminiKey || env.GEMINI_API_KEY || '';
const geminiHost = fixupHost(DEFAULT_GEMINI_HOST, apiPath);
// update model-dependent paths
if (apiPath.includes('{model=models/*}')) {
if (!modelRefId)
throw new Error(`geminiAccess: modelRefId is required for ${apiPath}`);
apiPath = apiPath.replace('{model=models/*}', modelRefId);
}
return {
headers: {
'Content-Type': 'application/json',
'x-goog-api-client': `big-agi/${packageJson['version'] || '1.0.0'}`,
'x-goog-api-key': geminiKey,
},
url: geminiHost + apiPath,
};
}
/**
* We specially encode the history to match the Gemini API requirements.
* Gemini does not want 2 consecutive messages from the same role, so we alternate.
* - System messages = [User, Model 'Ok']
* - User and Assistant messages are coalesced into a single message (e.g. [User, User, Assistant, Assistant, User] -> [User[2], Assistant[2], User[1]])
*/
export const geminiGenerateContentTextPayload = (model: OpenAIModelSchema, history: OpenAIHistorySchema, safety: GeminiBlockSafetyLevel, n: number): GeminiGenerateContentRequest => {
// convert the history to a Gemini format
const contents: GeminiContentSchema[] = [];
for (const _historyElement of history) {
const { role: msgRole, content: msgContent } = _historyElement;
// System message - we treat it as per the example in https://ai.google.dev/tutorials/ai-studio_quickstart#chat_example
if (msgRole === 'system') {
contents.push({ role: 'user', parts: [{ text: msgContent }] });
contents.push({ role: 'model', parts: [{ text: 'Ok' }] });
continue;
}
// User or Assistant message
const nextRole: GeminiContentSchema['role'] = msgRole === 'assistant' ? 'model' : 'user';
if (contents.length && contents[contents.length - 1].role === nextRole) {
// coalesce with the previous message
contents[contents.length - 1].parts.push({ text: msgContent });
} else {
// create a new message
contents.push({ role: nextRole, parts: [{ text: msgContent }] });
}
}
return {
contents,
generationConfig: {
...(n >= 2 && { candidateCount: n }),
...(model.maxTokens && { maxOutputTokens: model.maxTokens }),
temperature: model.temperature,
},
safetySettings: safety !== 'HARM_BLOCK_THRESHOLD_UNSPECIFIED' ? [
{ category: 'HARM_CATEGORY_SEXUALLY_EXPLICIT', threshold: safety },
{ category: 'HARM_CATEGORY_HATE_SPEECH', threshold: safety },
{ category: 'HARM_CATEGORY_HARASSMENT', threshold: safety },
{ category: 'HARM_CATEGORY_DANGEROUS_CONTENT', threshold: safety },
] : undefined,
};
};
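// Sketch (illustrative, not part of these commits): the re-encoding above on a tiny history.
//   input  (OpenAI-style): [system S, user A, user B, assistant C]
//   output (Gemini contents):
//     { role: 'user',  parts: [{ text: S }] },
//     { role: 'model', parts: [{ text: 'Ok' }] },            // synthetic ack for the system message
//     { role: 'user',  parts: [{ text: A }, { text: B }] },  // consecutive user turns coalesced
//     { role: 'model', parts: [{ text: C }] },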
async function geminiGET<TOut extends object>(access: GeminiAccessSchema, modelRefId: string | null, apiPath: string /*, signal?: AbortSignal*/): Promise<TOut> {
const { headers, url } = geminiAccess(access, modelRefId, apiPath);
return await fetchJsonOrTRPCError<TOut>(url, 'GET', headers, undefined, 'Gemini');
}
async function geminiPOST<TOut extends object, TPostBody extends object>(access: GeminiAccessSchema, modelRefId: string | null, body: TPostBody, apiPath: string /*, signal?: AbortSignal*/): Promise<TOut> {
const { headers, url } = geminiAccess(access, modelRefId, apiPath);
return await fetchJsonOrTRPCError<TOut, TPostBody>(url, 'POST', headers, body, 'Gemini');
}
// Input/Output Schemas
export const geminiAccessSchema = z.object({
dialect: z.enum(['gemini']),
geminiKey: z.string(),
minSafetyLevel: geminiBlockSafetyLevelSchema,
});
export type GeminiAccessSchema = z.infer<typeof geminiAccessSchema>;
const accessOnlySchema = z.object({
access: geminiAccessSchema,
});
const chatGenerateInputSchema = z.object({
access: geminiAccessSchema,
model: openAIModelSchema, history: openAIHistorySchema,
// functions: openAIFunctionsSchema.optional(), forceFunctionName: z.string().optional(),
});
/**
* See https://github.com/google/generative-ai-js/tree/main/packages/main/src for
* the official Google implementation.
*/
export const llmGeminiRouter = createTRPCRouter({
/* [Gemini] models.list = /v1beta/models */
listModels: publicProcedure
.input(accessOnlySchema)
.output(listModelsOutputSchema)
.query(async ({ input }) => {
// get the models
const wireModels = await geminiGET(input.access, null, geminiModelsListPath);
const detailedModels = geminiModelsListOutputSchema.parse(wireModels).models;
// NOTE: no need to retrieve info for each of the models (e.g. /v1beta/models/gemini-pro),
// as the List API already returns all the info on all the models
// map to our output schema
return {
models: detailedModels.map((geminiModel) => {
const { description, displayName, inputTokenLimit, name, outputTokenLimit, supportedGenerationMethods } = geminiModel;
const contextWindow = inputTokenLimit + outputTokenLimit;
const hidden = !supportedGenerationMethods.includes('generateContent');
const { version, topK, topP, temperature } = geminiModel;
const descriptionLong = description + ` (Version: ${version}, Defaults: temperature=${temperature}, topP=${topP}, topK=${topK}, interfaces=[${supportedGenerationMethods.join(',')}])`;
// const isGeminiPro = name.includes('gemini-pro');
const isGeminiProVision = name.includes('gemini-pro-vision');
const interfaces: ModelDescriptionSchema['interfaces'] = [];
if (supportedGenerationMethods.includes('generateContent')) {
interfaces.push(LLM_IF_OAI_Chat);
if (isGeminiProVision)
interfaces.push(LLM_IF_OAI_Vision);
}
return {
id: name,
label: displayName,
// created: ...
// updated: ...
description: descriptionLong,
contextWindow: contextWindow,
maxCompletionTokens: outputTokenLimit,
// pricing: isGeminiPro ? { needs per-character and per-image pricing } : undefined,
// rateLimits: isGeminiPro ? { reqPerMinute: 60 } : undefined,
interfaces, // use the interfaces computed above (adds Vision for gemini-pro-vision)
hidden,
} satisfies ModelDescriptionSchema;
}),
};
}),
/* [Gemini] models.generateContent = /v1/{model=models/*}:generateContent */
chatGenerate: publicProcedure
.input(chatGenerateInputSchema)
.output(openAIChatGenerateOutputSchema)
.mutation(async ({ input: { access, history, model } }) => {
// generate the content
const wireGeneration = await geminiPOST(access, model.id, geminiGenerateContentTextPayload(model, history, access.minSafetyLevel, 1), geminiModelsGenerateContentPath);
const generation = geminiGeneratedContentResponseSchema.parse(wireGeneration);
// only use the first result (and there should be only one)
const singleCandidate = generation.candidates?.[0] ?? null;
if (!singleCandidate || !singleCandidate.content?.parts.length)
throw new TRPCError({
code: 'INTERNAL_SERVER_ERROR',
message: `Gemini chat-generation API issue: ${JSON.stringify(wireGeneration)}`,
});
if (!('text' in singleCandidate.content.parts[0]))
throw new TRPCError({
code: 'INTERNAL_SERVER_ERROR',
message: `Gemini non-text chat-generation API issue: ${JSON.stringify(wireGeneration)}`,
});
return {
role: 'assistant',
content: singleCandidate.content.parts[0].text || '',
finish_reason: singleCandidate.finishReason === 'STOP' ? 'stop' : null,
};
}),
});
@@ -0,0 +1,188 @@
import { z } from 'zod';
// PATHS
export const geminiModelsListPath = '/v1beta/models?pageSize=1000';
export const geminiModelsGenerateContentPath = '/v1beta/{model=models/*}:generateContent';
// see alt=sse on https://cloud.google.com/apis/docs/system-parameters#definitions
export const geminiModelsStreamGenerateContentPath = '/v1beta/{model=models/*}:streamGenerateContent?alt=sse';
// models.list = /v1beta/models
export const geminiModelsListOutputSchema = z.object({
models: z.array(z.object({
name: z.string(),
version: z.string(),
displayName: z.string(),
description: z.string(),
inputTokenLimit: z.number().int().min(1),
outputTokenLimit: z.number().int().min(1),
supportedGenerationMethods: z.array(z.enum([
'countMessageTokens',
'countTextTokens',
'countTokens',
'createTunedTextModel',
'embedContent',
'embedText',
'generateAnswer',
'generateContent',
'generateMessage',
'generateText',
])),
temperature: z.number().optional(),
topP: z.number().optional(),
topK: z.number().optional(),
})),
});
// /v1/{model=models/*}:generateContent, /v1beta/{model=models/*}:streamGenerateContent
// Request
const geminiContentPartSchema = z.union([
// TextPart
z.object({
text: z.string().optional(),
}),
// InlineDataPart
z.object({
inlineData: z.object({
mimeType: z.string(),
data: z.string(), // base64-encoded string
}),
}),
// A predicted FunctionCall returned from the model
z.object({
functionCall: z.object({
name: z.string(),
args: z.record(z.any()), // JSON object format
}),
}),
// The result output of a FunctionCall
z.object({
functionResponse: z.object({
name: z.string(),
response: z.record(z.any()), // JSON object format
}),
}),
]);
const geminiToolSchema = z.object({
functionDeclarations: z.array(z.object({
name: z.string(),
description: z.string(),
parameters: z.record(z.any()).optional(), // Schema object format
})).optional(),
});
const geminiHarmCategorySchema = z.enum([
'HARM_CATEGORY_UNSPECIFIED',
'HARM_CATEGORY_DEROGATORY',
'HARM_CATEGORY_TOXICITY',
'HARM_CATEGORY_VIOLENCE',
'HARM_CATEGORY_SEXUAL',
'HARM_CATEGORY_MEDICAL',
'HARM_CATEGORY_DANGEROUS',
'HARM_CATEGORY_HARASSMENT',
'HARM_CATEGORY_HATE_SPEECH',
'HARM_CATEGORY_SEXUALLY_EXPLICIT',
'HARM_CATEGORY_DANGEROUS_CONTENT',
]);
export const geminiBlockSafetyLevelSchema = z.enum([
'HARM_BLOCK_THRESHOLD_UNSPECIFIED',
'BLOCK_LOW_AND_ABOVE',
'BLOCK_MEDIUM_AND_ABOVE',
'BLOCK_ONLY_HIGH',
'BLOCK_NONE',
]);
export type GeminiBlockSafetyLevel = z.infer<typeof geminiBlockSafetyLevelSchema>;
const geminiSafetySettingSchema = z.object({
category: geminiHarmCategorySchema,
threshold: geminiBlockSafetyLevelSchema,
});
const geminiGenerationConfigSchema = z.object({
stopSequences: z.array(z.string()).optional(),
candidateCount: z.number().int().optional(),
maxOutputTokens: z.number().int().optional(),
temperature: z.number().optional(),
topP: z.number().optional(),
topK: z.number().int().optional(),
});
const geminiContentSchema = z.object({
// Must be either 'user' or 'model'. Optional but must be set if there are multiple "Content" objects in the parent array.
role: z.enum(['user', 'model']).optional(),
// Ordered Parts that constitute a single message. Parts may have different MIME types.
parts: z.array(geminiContentPartSchema),
});
export type GeminiContentSchema = z.infer<typeof geminiContentSchema>;
export const geminiGenerateContentRequest = z.object({
contents: z.array(geminiContentSchema),
tools: z.array(geminiToolSchema).optional(),
safetySettings: z.array(geminiSafetySettingSchema).optional(),
generationConfig: geminiGenerationConfigSchema.optional(),
});
export type GeminiGenerateContentRequest = z.infer<typeof geminiGenerateContentRequest>;
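// Sketch (illustrative, not part of these commits): a minimal value satisfying
// geminiGenerateContentRequest, with field names and enum values from the schemas above.
const exampleRequest: GeminiGenerateContentRequest = {
  contents: [{ role: 'user', parts: [{ text: 'Hello, Gemini' }] }],
  generationConfig: { temperature: 0.7, maxOutputTokens: 256 },
  safetySettings: [{ category: 'HARM_CATEGORY_HARASSMENT', threshold: 'BLOCK_ONLY_HIGH' }],
};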
// Response
const geminiHarmProbabilitySchema = z.enum([
'HARM_PROBABILITY_UNSPECIFIED',
'NEGLIGIBLE',
'LOW',
'MEDIUM',
'HIGH',
]);
const geminiSafetyRatingSchema = z.object({
'category': geminiHarmCategorySchema,
'probability': geminiHarmProbabilitySchema,
'blocked': z.boolean().optional(),
});
const geminiFinishReasonSchema = z.enum([
'FINISH_REASON_UNSPECIFIED',
'STOP',
'MAX_TOKENS',
'SAFETY',
'RECITATION',
'OTHER',
]);
export const geminiGeneratedContentResponseSchema = z.object({
// either all requested candidates are returned or no candidates at all
// no candidates are returned only if there was something wrong with the prompt (see promptFeedback)
candidates: z.array(z.object({
index: z.number(),
content: geminiContentSchema,
finishReason: geminiFinishReasonSchema.optional(),
safetyRatings: z.array(geminiSafetyRatingSchema),
citationMetadata: z.object({
startIndex: z.number().optional(),
endIndex: z.number().optional(),
uri: z.string().optional(),
license: z.string().optional(),
}).optional(),
tokenCount: z.number().optional(),
// groundingAttributions: z.array(GroundingAttribution).optional(), // This field is populated for GenerateAnswer calls.
})).optional(),
// NOTE: promptFeedback is only sent in the first chunk of a streaming response
promptFeedback: z.object({
blockReason: z.enum(['BLOCK_REASON_UNSPECIFIED', 'SAFETY', 'OTHER']).optional(),
safetyRatings: z.array(geminiSafetyRatingSchema).optional(),
}).optional(),
});
@@ -4,12 +4,30 @@ import { createParser as createEventsourceParser, EventSourceParseCallback, Even
import { createEmptyReadableStream, debugGenerateCurlCommand, safeErrorString, SERVER_DEBUG_WIRE, serverFetchOrThrow } from '~/server/wire';
import type { AnthropicWire } from '../anthropic/anthropic.wiretypes';
import type { OpenAIWire } from './openai.wiretypes';
import { OLLAMA_PATH_CHAT, ollamaAccess, ollamaAccessSchema, ollamaChatCompletionPayload } from '../ollama/ollama.router';
import { anthropicAccess, anthropicAccessSchema, anthropicChatCompletionPayload } from '../anthropic/anthropic.router';
import { openAIAccess, openAIAccessSchema, openAIChatCompletionPayload, openAIHistorySchema, openAIModelSchema } from './openai.router';
import { wireOllamaChunkedOutputSchema } from '../ollama/ollama.wiretypes';
// Anthropic server imports
import type { AnthropicWire } from './anthropic/anthropic.wiretypes';
import { anthropicAccess, anthropicAccessSchema, anthropicChatCompletionPayload } from './anthropic/anthropic.router';
// Gemini server imports
import { geminiAccess, geminiAccessSchema, geminiGenerateContentTextPayload } from './gemini/gemini.router';
import { geminiGeneratedContentResponseSchema, geminiModelsStreamGenerateContentPath } from './gemini/gemini.wiretypes';
// Ollama server imports
import { wireOllamaChunkedOutputSchema } from './ollama/ollama.wiretypes';
import { OLLAMA_PATH_CHAT, ollamaAccess, ollamaAccessSchema, ollamaChatCompletionPayload } from './ollama/ollama.router';
// OpenAI server imports
import type { OpenAIWire } from './openai/openai.wiretypes';
import { openAIAccess, openAIAccessSchema, openAIChatCompletionPayload, openAIHistorySchema, openAIModelSchema } from './openai/openai.router';
/**
* Event stream formats
* - 'sse' is the default format, and is used by all vendors except Ollama
* - 'json-nl' is used by Ollama
*/
type MuxingFormat = 'sse' | 'json-nl';
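// Sketch (illustrative, not part of these commits): one upstream chunk in each muxing format.
//   'sse'     : data: {"candidates":[...]}\n\n                  - one JSON payload per 'data:' event
//   'json-nl' : {"message":{"content":"Hi"},"done":false}\n     - one JSON object per line (Ollama)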
/**
@@ -20,77 +38,87 @@ import { wireOllamaChunkedOutputSchema } from '../ollama/ollama.wiretypes';
* The peculiarity of our parser is the injection of a JSON structure at the beginning of the stream, to
* communicate parameters before the text starts flowing to the client.
*/
export type AIStreamParser = (data: string) => { text: string, close: boolean };
type EventStreamFormat = 'sse' | 'json-nl';
type AIStreamParser = (data: string) => { text: string, close: boolean };
const chatStreamInputSchema = z.object({
access: z.union([anthropicAccessSchema, ollamaAccessSchema, openAIAccessSchema]),
model: openAIModelSchema, history: openAIHistorySchema,
const chatStreamingInputSchema = z.object({
access: z.union([anthropicAccessSchema, geminiAccessSchema, ollamaAccessSchema, openAIAccessSchema]),
model: openAIModelSchema,
history: openAIHistorySchema,
});
export type ChatStreamInputSchema = z.infer<typeof chatStreamInputSchema>;
export type ChatStreamingInputSchema = z.infer<typeof chatStreamingInputSchema>;
const chatStreamFirstPacketSchema = z.object({
const chatStreamingFirstOutputPacketSchema = z.object({
model: z.string(),
});
export type ChatStreamFirstPacketSchema = z.infer<typeof chatStreamFirstPacketSchema>;
export type ChatStreamingFirstOutputPacketSchema = z.infer<typeof chatStreamingFirstOutputPacketSchema>;
export async function openaiStreamingRelayHandler(req: NextRequest): Promise<Response> {
export async function llmStreamingRelayHandler(req: NextRequest): Promise<Response> {
// inputs - reuse the tRPC schema
const { access, model, history } = chatStreamInputSchema.parse(await req.json());
const body = await req.json();
const { access, model, history } = chatStreamingInputSchema.parse(body);
// begin event streaming from the OpenAI API
let headersUrl: { headers: HeadersInit, url: string } = { headers: {}, url: '' };
// access/dialect dependent setup:
// - requestAccess: the headers and URL to use for the upstream API call
// - muxingFormat: the format of the event stream (sse or json-nl)
// - vendorStreamParser: the parser to use for the event stream
let upstreamResponse: Response;
let requestAccess: { headers: HeadersInit, url: string } = { headers: {}, url: '' };
let muxingFormat: MuxingFormat = 'sse';
let vendorStreamParser: AIStreamParser;
let eventStreamFormat: EventStreamFormat = 'sse';
try {
// prepare the API request data
let body: object;
switch (access.dialect) {
case 'anthropic':
headersUrl = anthropicAccess(access, '/v1/complete');
requestAccess = anthropicAccess(access, '/v1/complete');
body = anthropicChatCompletionPayload(model, history, true);
vendorStreamParser = createAnthropicStreamParser();
vendorStreamParser = createStreamParserAnthropic();
break;
case 'gemini':
requestAccess = geminiAccess(access, model.id, geminiModelsStreamGenerateContentPath);
body = geminiGenerateContentTextPayload(model, history, access.minSafetyLevel, 1);
vendorStreamParser = createStreamParserGemini(model.id.replace('models/', ''));
break;
case 'ollama':
headersUrl = ollamaAccess(access, OLLAMA_PATH_CHAT);
requestAccess = ollamaAccess(access, OLLAMA_PATH_CHAT);
body = ollamaChatCompletionPayload(model, history, true);
eventStreamFormat = 'json-nl';
vendorStreamParser = createOllamaChatCompletionStreamParser();
muxingFormat = 'json-nl';
vendorStreamParser = createStreamParserOllama();
break;
case 'azure':
case 'localai':
case 'mistral':
case 'oobabooga':
case 'openai':
case 'openrouter':
headersUrl = openAIAccess(access, model.id, '/v1/chat/completions');
requestAccess = openAIAccess(access, model.id, '/v1/chat/completions');
body = openAIChatCompletionPayload(model, history, null, null, 1, true);
vendorStreamParser = createOpenAIStreamParser();
vendorStreamParser = createStreamParserOpenAI();
break;
}
if (SERVER_DEBUG_WIRE)
console.log('-> streaming:', debugGenerateCurlCommand('POST', headersUrl.url, headersUrl.headers, body));
console.log('-> streaming:', debugGenerateCurlCommand('POST', requestAccess.url, requestAccess.headers, body));
// POST to our API route
upstreamResponse = await serverFetchOrThrow(headersUrl.url, 'POST', headersUrl.headers, body);
upstreamResponse = await serverFetchOrThrow(requestAccess.url, 'POST', requestAccess.headers, body);
} catch (error: any) {
const fetchOrVendorError = safeErrorString(error) + (error?.cause ? ' · ' + error.cause : '');
// server-side admins message
console.error(`/api/llms/stream: fetch issue:`, access.dialect, fetchOrVendorError, headersUrl?.url);
console.error(`/api/llms/stream: fetch issue:`, access.dialect, fetchOrVendorError, requestAccess?.url);
// client-side users visible message
return new NextResponse(`[Issue] ${access.dialect}: ${fetchOrVendorError}`
+ (process.env.NODE_ENV === 'development' ? ` · [URL: ${headersUrl?.url}]` : ''), { status: 500 });
+ (process.env.NODE_ENV === 'development' ? ` · [URL: ${requestAccess?.url}]` : ''), { status: 500 });
}
/* The following code is heavily inspired by the Vercel AI SDK, but simplified to our needs and in full control.
@@ -102,8 +130,12 @@ export async function openaiStreamingRelayHandler(req: NextRequest): Promise<Res
* NOTE: we have not benchmarked to see if there is performance impact by using this approach - we do want to have
* a 'healthy' level of inventory (i.e., pre-buffering) on the pipe to the client.
*/
const chatResponseStream = (upstreamResponse.body || createEmptyReadableStream())
.pipeThrough(createEventStreamTransformer(vendorStreamParser, eventStreamFormat, access.dialect));
const transformUpstreamToBigAgiClient = createEventStreamTransformer(
muxingFormat, vendorStreamParser, access.dialect,
);
const chatResponseStream =
(upstreamResponse.body || createEmptyReadableStream())
.pipeThrough(transformUpstreamToBigAgiClient);
return new NextResponse(chatResponseStream, {
status: 200,
@@ -114,110 +146,44 @@ export async function openaiStreamingRelayHandler(req: NextRequest): Promise<Res
}
/// Event Parsers
function createAnthropicStreamParser(): AIStreamParser {
let hasBegun = false;
return (data: string) => {
const json: AnthropicWire.Complete.Response = JSON.parse(data);
let text = json.completion;
// hack: prepend the model name to the first packet
if (!hasBegun) {
hasBegun = true;
const firstPacket: ChatStreamFirstPacketSchema = { model: json.model };
text = JSON.stringify(firstPacket) + text;
}
return { text, close: false };
};
}
function createOllamaChatCompletionStreamParser(): AIStreamParser {
let hasBegun = false;
return (data: string) => {
// parse the JSON chunk
let wireJsonChunk: any;
try {
wireJsonChunk = JSON.parse(data);
} catch (error: any) {
// log the malformed data to the console, and rethrow to transmit as 'error'
console.log(`/api/llms/stream: Ollama parsing issue: ${error?.message || error}`, data);
throw error;
}
// validate chunk
const chunk = wireOllamaChunkedOutputSchema.parse(wireJsonChunk);
// process output
let text = chunk.message?.content || /*chunk.response ||*/ '';
// hack: prepend the model name to the first packet
if (!hasBegun && chunk.model) {
hasBegun = true;
const firstPacket: ChatStreamFirstPacketSchema = { model: chunk.model };
text = JSON.stringify(firstPacket) + text;
}
return { text, close: chunk.done };
};
}
function createOpenAIStreamParser(): AIStreamParser {
let hasBegun = false;
let hasWarned = false;
return (data: string) => {
const json: OpenAIWire.ChatCompletion.ResponseStreamingChunk = JSON.parse(data);
// [OpenAI] an upstream error will be handled gracefully and transmitted as text (throw to transmit as 'error')
if (json.error)
return { text: `[OpenAI Issue] ${safeErrorString(json.error)}`, close: true };
// [OpenAI] if there's a warning, log it once
if (json.warning && !hasWarned) {
hasWarned = true;
console.log('/api/llms/stream: OpenAI upstream warning:', json.warning);
}
if (json.choices.length !== 1) {
// [Azure] we seem to receive 'prompt_annotations' or 'prompt_filter_results' objects - which we ignore to suppress the error
if (json.id === '' && json.object === '' && json.model === '')
return { text: '', close: false };
throw new Error(`Expected 1 completion, got ${json.choices.length}`);
}
const index = json.choices[0].index;
if (index !== 0 && index !== undefined /* LocalAI hack/workaround until https://github.com/go-skynet/LocalAI/issues/788 */)
throw new Error(`Expected completion index 0, got ${index}`);
let text = json.choices[0].delta?.content /*|| json.choices[0]?.text*/ || '';
// hack: prepend the model name to the first packet
if (!hasBegun) {
hasBegun = true;
const firstPacket: ChatStreamFirstPacketSchema = { model: json.model };
text = JSON.stringify(firstPacket) + text;
}
// [LocalAI] workaround: LocalAI doesn't send the [DONE] event, but similarly to OpenAI, it sends a "finish_reason" delta update
const close = !!json.choices[0].finish_reason;
return { text, close };
};
}
// Event Stream Transformers
/**
* Creates a parser for a 'JSON\n' non-event stream, to be swapped with an EventSource parser.
* Ollama is the only vendor that uses this format.
*/
function createDemuxerJsonNewline(onParse: EventSourceParseCallback): EventSourceParser {
let accumulator: string = '';
return {
// feeds a new chunk to the parser - we accumulate in case of partial data, and only execute on full lines
feed: (chunk: string): void => {
accumulator += chunk;
if (accumulator.endsWith('\n')) {
for (const jsonString of accumulator.split('\n').filter(line => !!line)) {
const mimicEvent: ParsedEvent = {
type: 'event',
id: undefined,
event: undefined,
data: jsonString,
};
onParse(mimicEvent);
}
accumulator = '';
}
},
// resets the parser state - not useful with our driving of the parser
reset: (): void => {
console.error('createDemuxerJsonNewline.reset() not implemented');
},
};
}
/**
* Creates a TransformStream that parses events from an EventSource stream using a custom parser.
* @returns {TransformStream<Uint8Array, string>} TransformStream parsing events.
*/
function createEventStreamTransformer(vendorTextParser: AIStreamParser, inputFormat: EventStreamFormat, dialectLabel: string): TransformStream<Uint8Array, Uint8Array> {
function createEventStreamTransformer(muxingFormat: MuxingFormat, vendorTextParser: AIStreamParser, dialectLabel: string): TransformStream<Uint8Array, Uint8Array> {
const textDecoder = new TextDecoder();
const textEncoder = new TextEncoder();
let eventSourceParser: EventSourceParser;
@@ -255,15 +221,15 @@ function createEventStreamTransformer(vendorTextParser: AIStreamParser, inputFor
} catch (error: any) {
if (SERVER_DEBUG_WIRE)
console.log(' - E: parse issue:', event.data, error?.message || error);
controller.enqueue(textEncoder.encode(`[Stream Issue] ${dialectLabel}: ${safeErrorString(error) || 'Unknown stream parsing error'}`));
controller.enqueue(textEncoder.encode(` **[Stream Issue] ${dialectLabel}: ${safeErrorString(error) || 'Unknown stream parsing error'}**`));
controller.terminate();
}
};
if (inputFormat === 'sse')
if (muxingFormat === 'sse')
eventSourceParser = createEventsourceParser(onNewEvent);
else if (inputFormat === 'json-nl')
eventSourceParser = createJsonNewlineParser(onNewEvent);
else if (muxingFormat === 'json-nl')
eventSourceParser = createDemuxerJsonNewline(onNewEvent);
},
// stream=true is set because the data is not guaranteed to be final and un-chunked
@@ -273,33 +239,142 @@ function createEventStreamTransformer(vendorTextParser: AIStreamParser, inputFor
});
}
/**
* Creates a parser for a 'JSON\n' non-event stream, to be swapped with an EventSource parser.
* Ollama is the only vendor that uses this format.
*/
function createJsonNewlineParser(onParse: EventSourceParseCallback): EventSourceParser {
let accumulator: string = '';
return {
// feeds a new chunk to the parser - we accumulate in case of partial data, and only execute on full lines
feed: (chunk: string): void => {
accumulator += chunk;
if (accumulator.endsWith('\n')) {
for (const jsonString of accumulator.split('\n').filter(line => !!line)) {
const mimicEvent: ParsedEvent = {
type: 'event',
id: undefined,
event: undefined,
data: jsonString,
};
onParse(mimicEvent);
}
accumulator = '';
}
},
// resets the parser state - not useful with our driving of the parser
reset: (): void => {
console.error('createJsonNewlineParser.reset() not implemented');
},
};
}
/// Stream Parsers
function createStreamParserAnthropic(): AIStreamParser {
let hasBegun = false;
return (data: string) => {
const json: AnthropicWire.Complete.Response = JSON.parse(data);
let text = json.completion;
// hack: prepend the model name to the first packet
if (!hasBegun) {
hasBegun = true;
const firstPacket: ChatStreamingFirstOutputPacketSchema = { model: json.model };
text = JSON.stringify(firstPacket) + text;
}
return { text, close: false };
};
}
function createStreamParserGemini(modelName: string): AIStreamParser {
let hasBegun = false;
// this can throw; it's caught upstream
return (data: string) => {
// parse the JSON chunk
const wireGenerationChunk = JSON.parse(data);
const generationChunk = geminiGeneratedContentResponseSchema.parse(wireGenerationChunk);
// Prompt Safety Errors: pass through errors from Gemini
if (generationChunk.promptFeedback?.blockReason) {
const { blockReason, safetyRatings } = generationChunk.promptFeedback;
return { text: `[Gemini Prompt Blocked] ${blockReason}: ${JSON.stringify(safetyRatings || 'Unknown Safety Ratings', null, 2)}`, close: true };
}
// expect a single completion
const singleCandidate = generationChunk.candidates?.[0] ?? null;
if (!singleCandidate || !singleCandidate.content?.parts.length)
throw new Error(`Gemini: expected 1 completion, got ${generationChunk.candidates?.length}`);
// expect a single part
if (singleCandidate.content.parts.length !== 1 || !('text' in singleCandidate.content.parts[0]))
throw new Error(`Gemini: expected 1 text part, got ${singleCandidate.content.parts.length}`);
// expect a single text in the part
let text = singleCandidate.content.parts[0].text || '';
// hack: prepend the model name to the first packet
if (!hasBegun) {
hasBegun = true;
const firstPacket: ChatStreamingFirstOutputPacketSchema = { model: modelName };
text = JSON.stringify(firstPacket) + text;
}
return { text, close: false };
};
}
function createStreamParserOllama(): AIStreamParser {
let hasBegun = false;
return (data: string) => {
// parse the JSON chunk
let wireJsonChunk: any;
try {
wireJsonChunk = JSON.parse(data);
} catch (error: any) {
// log the malformed data to the console, and rethrow to transmit as 'error'
console.log(`/api/llms/stream: Ollama parsing issue: ${error?.message || error}`, data);
throw error;
}
// validate chunk
const chunk = wireOllamaChunkedOutputSchema.parse(wireJsonChunk);
// pass through errors from Ollama
if ('error' in chunk)
throw new Error(chunk.error);
// process output
let text = chunk.message?.content || /*chunk.response ||*/ '';
// hack: prepend the model name to the first packet
if (!hasBegun && chunk.model) {
hasBegun = true;
const firstPacket: ChatStreamingFirstOutputPacketSchema = { model: chunk.model };
text = JSON.stringify(firstPacket) + text;
}
return { text, close: chunk.done };
};
}
function createStreamParserOpenAI(): AIStreamParser {
let hasBegun = false;
let hasWarned = false;
return (data: string) => {
const json: OpenAIWire.ChatCompletion.ResponseStreamingChunk = JSON.parse(data);
// [OpenAI] an upstream error will be handled gracefully and transmitted as text (throw to transmit as 'error')
if (json.error)
return { text: `[OpenAI Issue] ${safeErrorString(json.error)}`, close: true };
// [OpenAI] if there's a warning, log it once
if (json.warning && !hasWarned) {
hasWarned = true;
console.log('/api/llms/stream: OpenAI upstream warning:', json.warning);
}
if (json.choices.length !== 1) {
// [Azure] we seem to receive 'prompt_annotations' or 'prompt_filter_results' objects - which we ignore to suppress the error
if (json.id === '' && json.object === '' && json.model === '')
return { text: '', close: false };
throw new Error(`Expected 1 completion, got ${json.choices.length}`);
}
const index = json.choices[0].index;
if (index !== 0 && index !== undefined /* LocalAI hack/workaround until https://github.com/go-skynet/LocalAI/issues/788 */)
throw new Error(`Expected completion index 0, got ${index}`);
let text = json.choices[0].delta?.content /*|| json.choices[0]?.text*/ || '';
// hack: prepend the model name to the first packet
if (!hasBegun) {
hasBegun = true;
const firstPacket: ChatStreamingFirstOutputPacketSchema = { model: json.model };
text = JSON.stringify(firstPacket) + text;
}
// [LocalAI] workaround: LocalAI doesn't send the [DONE] event, but similarly to OpenAI, it sends a "finish_reason" delta update
const close = !!json.choices[0].finish_reason;
return { text, close };
};
}
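// Sketch (illustrative, not part of these commits): how a client could peel the
// injected first packet (the small JSON object with the model name that every
// parser above prepends) off the text stream; assumes the packet is a flat
// object with no nested braces.
function peelFirstPacket(raw: string): { model: string | null, text: string } {
  if (raw.startsWith('{')) {
    const end = raw.indexOf('}');
    if (end > 0) {
      try {
        const { model } = JSON.parse(raw.slice(0, end + 1)) as ChatStreamingFirstOutputPacketSchema;
        return { model, text: raw.slice(end + 1) };
      } catch {
        // not the injected packet: return the text untouched below
      }
    }
  }
  return { model: null, text: raw };
}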
@@ -1,11 +1,18 @@
import { z } from 'zod';
import { LLM_IF_OAI_Chat, LLM_IF_OAI_Complete, LLM_IF_OAI_Fn, LLM_IF_OAI_Vision } from '../../store-llms';
import { LLM_IF_OAI_Chat, LLM_IF_OAI_Complete, LLM_IF_OAI_Fn, LLM_IF_OAI_Vision } from '../store-llms';
// Model Description: a superset of LLM model descriptors
const pricingSchema = z.object({
cpmPrompt: z.number().optional(), // Cost per thousand prompt tokens
cpmCompletion: z.number().optional(), // Cost per thousand completion tokens
});
// const rateLimitsSchema = z.object({
// reqPerMinute: z.number().optional(),
// });
const modelDescriptionSchema = z.object({
id: z.string(),
label: z.string(),
@@ -15,9 +22,12 @@ const modelDescriptionSchema = z.object({
contextWindow: z.number(),
maxCompletionTokens: z.number().optional(),
pricing: pricingSchema.optional(),
// rateLimits: rateLimitsSchema.optional(),
interfaces: z.array(z.enum([LLM_IF_OAI_Chat, LLM_IF_OAI_Fn, LLM_IF_OAI_Complete, LLM_IF_OAI_Vision])),
hidden: z.boolean().optional(),
});
// this is also used by the Client
export type ModelDescriptionSchema = z.infer<typeof modelDescriptionSchema>;
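// Sketch (illustrative, not part of these commits): a value that satisfies
// modelDescriptionSchema, using the fields visible above; the Gemini router's
// listModels returns objects of this shape.
const exampleDescription: ModelDescriptionSchema = {
  id: 'models/gemini-pro',
  label: 'Gemini Pro',
  description: 'A sample description',
  contextWindow: 32768,
  maxCompletionTokens: 2048,
  interfaces: [LLM_IF_OAI_Chat],
  hidden: false,
};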
export const listModelsOutputSchema = z.object({
@@ -6,54 +6,59 @@
* from: https://ollama.ai/library?sort=featured
*/
export const OLLAMA_BASE_MODELS: { [key: string]: { description: string, pulls: number, added?: string } } = {
'starling-lm': { description: 'Starling is a large language model trained by reinforcement learning from AI feedback focused on improving chatbot helpfulness.', pulls: 2353, added: '20231129' },
'neural-chat': { description: 'A fine-tuned model based on Mistral with good coverage of domain and language.', pulls: 3089, added: '20231129' },
'mistral': { description: 'The Mistral 7B model released by Mistral AI', pulls: 70300 },
'yi': { description: 'A high-performing, bilingual base model.', pulls: 2673 },
'llama2': { description: 'The most popular model for general use.', pulls: 141000 },
'codellama': { description: 'A large language model that can use text prompts to generate and discuss code.', pulls: 71400 },
'llama2-uncensored': { description: 'Uncensored Llama 2 model by George Sung and Jarrad Hope.', pulls: 30900 },
'orca-mini': { description: 'A general-purpose model ranging from 3 billion parameters to 70 billion, suitable for entry-level hardware.', pulls: 26000 },
'vicuna': { description: 'General use chat model based on Llama and Llama 2 with 2K to 16K context sizes.', pulls: 21800 },
'wizard-vicuna-uncensored': { description: 'Wizard Vicuna Uncensored is a 7B, 13B, and 30B parameter model based on Llama 2 uncensored by Eric Hartford.', pulls: 13700 },
'phind-codellama': { description: 'Code generation model based on CodeLlama.', pulls: 10600 },
'zephyr': { description: 'Zephyr beta is a fine-tuned 7B version of mistral that was trained on a mix of publicly available, synthetic datasets.', pulls: 10200 },
'wizardcoder': { description: 'Llama based code generation model focused on Python.', pulls: 9895 },
'mistral-openorca': { description: 'Mistral OpenOrca is a 7 billion parameter model, fine-tuned on top of the Mistral 7B model using the OpenOrca dataset.', pulls: 9256 },
'nous-hermes': { description: 'General use models based on Llama and Llama 2 from Nous Research.', pulls: 8827 },
'wizard-math': { description: 'Model focused on math and logic problems', pulls: 7849 },
'llama2-chinese': { description: 'Llama 2 based model fine tuned to improve Chinese dialogue ability.', pulls: 7375 },
'deepseek-coder': { description: 'DeepSeek Coder is trained from scratch on both 87% code and 13% natural language in English and Chinese. Each of the models are pre-trained on 2 trillion tokens.', pulls: 7335, added: '20231129' },
'falcon': { description: 'A large language model built by the Technology Innovation Institute (TII) for use in summarization, text generation, and chat bots.', pulls: 6726 },
'stable-beluga': { description: 'Llama 2 based model fine tuned on an Orca-style dataset. Originally called Free Willy.', pulls: 6272 },
'codeup': { description: 'Great code generation model based on Llama2.', pulls: 5978 },
'orca2': { description: 'Orca 2 is built by Microsoft research, and are a fine-tuned version of Meta\'s Llama 2 models. The model is designed to excel particularly in reasoning.', pulls: 5854, added: '20231129' },
'everythinglm': { description: 'Uncensored Llama2 based model with 16k context size.', pulls: 5040 },
'medllama2': { description: 'Fine-tuned Llama 2 model to answer medical questions based on an open source medical dataset.', pulls: 4648 },
'wizardlm-uncensored': { description: 'Uncensored version of Wizard LM model.', pulls: 4536 },
'dolphin2.2-mistral': { description: 'An instruct-tuned model based on Mistral. Version 2.2 is fine-tuned for improved conversation and empathy.', pulls: 3638 },
'starcoder': { description: 'StarCoder is a code generation model trained on 80+ programming languages.', pulls: 3638 },
'wizard-vicuna': { description: 'Wizard Vicuna is a 13B parameter model based on Llama 2 trained by MelodysDreamj.', pulls: 3485 },
'openchat': { description: 'A family of open-source models trained on a wide variety of data, surpassing ChatGPT on various benchmarks.', pulls: 3438, added: '20231129' },
'open-orca-platypus2': { description: 'Merge of the Open Orca OpenChat model and the Garage-bAInd Platypus 2 model. Designed for chat and code generation.', pulls: 3145 },
'openhermes2.5-mistral': { description: 'OpenHermes 2.5 Mistral 7B is a Mistral 7B fine-tune, a continuation of OpenHermes 2 model, which trained on additional code datasets.', pulls: 3023 },
'yarn-mistral': { description: 'An extension of Mistral to support a context of up to 128k tokens.', pulls: 2775 },
'samantha-mistral': { description: 'A companion assistant trained in philosophy, psychology, and personal relationships. Based on Mistral.', pulls: 2192 },
'sqlcoder': { description: 'SQLCoder is a code completion model fined-tuned on StarCoder for SQL generation tasks', pulls: 1973 },
'yarn-llama2': { description: 'An extension of Llama 2 that supports a context of up to 128k tokens.', pulls: 1915 },
'openhermes2-mistral': { description: 'OpenHermes 2 Mistral is a 7B model fine-tuned on Mistral with 900,000 entries of primarily GPT-4 generated data from open datasets.', pulls: 1690 },
'meditron': { description: 'Open-source medical large language model adapted from Llama 2 to the medical domain.', pulls: 1667, added: '20231129' },
'wizardlm': { description: 'General use 70 billion parameter model based on Llama 2.', pulls: 1379 },
'mistrallite': { description: 'MistralLite is a fine-tuned model based on Mistral with enhanced capabilities of processing long contexts.', pulls: 1345 },
'deepseek-llm': { description: 'An advanced language model crafted with 2 trillion bilingual tokens.', pulls: 1318, added: '20231129' },
'dolphin2.1-mistral': { description: 'An instruct-tuned model based on Mistral and trained on a dataset filtered to remove alignment and bias.', pulls: 1302 },
'codebooga': { description: 'A high-performing code instruct model created by merging two existing code models.', pulls: 1254 },
'goliath': { description: 'A language model created by combining two fine-tuned Llama 2 70B models into one.', pulls: 946, added: '20231129' },
'stablelm-zephyr': { description: 'A lightweight chat model allowing accurate and responsive output without requiring high-end hardware.', pulls: 945, added: '20231210' },
'nexusraven': { description: 'Nexus Raven is a 13B instruction tuned model for function calling tasks.', pulls: 860 },
'magicoder': { description: '🎩 Magicoder is a family of 7B parameter models trained on 75K synthetic instruction data using OSS-Instruct, a novel approach to enlightening LLMs with open-source code snippets.', pulls: 816, added: '20231210' },
'alfred': { description: 'A robust conversational model designed to be used for both chat and instruct use cases.', pulls: 804, added: '20231129' },
'xwinlm': { description: 'Conversational model based on Llama 2 that performs competitively on various benchmarks.', pulls: 706 },
'llama2': { description: 'The most popular model for general use.', pulls: 165600 },
'mistral': { description: 'The 7B model released by Mistral AI, updated to version 0.2.', pulls: 92200 },
'llava': { description: '🌋 A novel end-to-end trained large multimodal model that combines a vision encoder and Vicuna for general-purpose visual and language understanding.', pulls: 3563, added: '20231215' },
'mixtral': { description: 'A high-quality Mixture of Experts (MoE) model with open weights by Mistral AI.', pulls: 8277, added: '20231215' },
'starling-lm': { description: 'Starling is a large language model trained by reinforcement learning from AI feedback focused on improving chatbot helpfulness.', pulls: 3657, added: '20231129' },
'neural-chat': { description: 'A fine-tuned model based on Mistral with good coverage of domain and language.', pulls: 4647, added: '20231129' },
'codellama': { description: 'A large language model that can use text prompts to generate and discuss code.', pulls: 79800 },
'dolphin-mixtral': { description: 'An uncensored, fine-tuned model based on the Mixtral mixture of experts model that excels at coding tasks. Created by Eric Hartford.', pulls: 48400, added: '20231215' },
'llama2-uncensored': { description: 'Uncensored Llama 2 model by George Sung and Jarrad Hope.', pulls: 36600 },
'orca-mini': { description: 'A general-purpose model ranging from 3 billion parameters to 70 billion, suitable for entry-level hardware.', pulls: 30000 },
'vicuna': { description: 'General use chat model based on Llama and Llama 2 with 2K to 16K context sizes.', pulls: 22700 },
'wizard-vicuna-uncensored': { description: 'Wizard Vicuna Uncensored is a 7B, 13B, and 30B parameter model based on Llama 2 uncensored by Eric Hartford.', pulls: 15300 },
'zephyr': { description: 'Zephyr beta is a fine-tuned 7B version of Mistral that was trained on a mix of publicly available, synthetic datasets.', pulls: 11500 },
'phind-codellama': { description: 'Code generation model based on CodeLlama.', pulls: 11200 },
'wizardcoder': { description: 'Llama based code generation model focused on Python.', pulls: 10700 },
'deepseek-coder': { description: 'DeepSeek Coder is trained from scratch on both 87% code and 13% natural language in English and Chinese. Each of the models is pre-trained on 2 trillion tokens.', pulls: 10200 },
'mistral-openorca': { description: 'Mistral OpenOrca is a 7 billion parameter model, fine-tuned on top of the Mistral 7B model using the OpenOrca dataset.', pulls: 9842 },
'nous-hermes': { description: 'General use models based on Llama and Llama 2 from Nous Research.', pulls: 9071 },
'wizard-math': { description: 'Model focused on math and logic problems.', pulls: 8328 },
'llama2-chinese': { description: 'Llama 2 based model fine tuned to improve Chinese dialogue ability.', pulls: 8111 },
'orca2': { description: 'Orca 2 is built by Microsoft Research and is a fine-tuned version of Meta\'s Llama 2 models, designed to excel particularly in reasoning.', pulls: 7492, added: '20231129' },
'falcon': { description: 'A large language model built by the Technology Innovation Institute (TII) for use in summarization, text generation, and chat bots.', pulls: 7468 },
'stable-beluga': { description: 'Llama 2 based model fine tuned on an Orca-style dataset. Originally called Free Willy.', pulls: 6468 },
'codeup': { description: 'Great code generation model based on Llama2.', pulls: 6397 },
'everythinglm': { description: 'Uncensored Llama2 based model with 16k context size.', pulls: 5347 },
'medllama2': { description: 'Fine-tuned Llama 2 model to answer medical questions based on an open source medical dataset.', pulls: 5034 },
'wizardlm-uncensored': { description: 'Uncensored version of Wizard LM model.', pulls: 4874 },
'dolphin2.2-mistral': { description: 'An instruct-tuned model based on Mistral. Version 2.2 is fine-tuned for improved conversation and empathy.', pulls: 4686 },
'openchat': { description: 'A family of open-source models trained on a wide variety of data, surpassing ChatGPT on various benchmarks. Updated to version 3.5-1210.', pulls: 4496, added: '20231129' },
'starcoder': { description: 'StarCoder is a code generation model trained on 80+ programming languages.', pulls: 4331 },
'openhermes2.5-mistral': { description: 'OpenHermes 2.5 Mistral 7B is a Mistral 7B fine-tune, a continuation of the OpenHermes 2 model, trained on additional code datasets.', pulls: 3722 },
'wizard-vicuna': { description: 'Wizard Vicuna is a 13B parameter model based on Llama 2 trained by MelodysDreamj.', pulls: 3668 },
'yi': { description: 'A high-performing, bilingual base model.', pulls: 3335 },
'open-orca-platypus2': { description: 'Merge of the Open Orca OpenChat model and the Garage-bAInd Platypus 2 model. Designed for chat and code generation.', pulls: 3219 },
'yarn-mistral': { description: 'An extension of Mistral to support a context of up to 128k tokens.', pulls: 3087 },
'samantha-mistral': { description: 'A companion assistant trained in philosophy, psychology, and personal relationships. Based on Mistral.', pulls: 2518 },
'sqlcoder': { description: 'SQLCoder is a code completion model fine-tuned on StarCoder for SQL generation tasks.', pulls: 2338 },
'meditron': { description: 'Open-source medical large language model adapted from Llama 2 to the medical domain.', pulls: 2216, added: '20231129' },
'yarn-llama2': { description: 'An extension of Llama 2 that supports a context of up to 128k tokens.', pulls: 2201 },
'stablelm-zephyr': { description: 'A lightweight chat model allowing accurate and responsive output without requiring high-end hardware.', pulls: 1983, added: '20231210' },
'openhermes2-mistral': { description: 'OpenHermes 2 Mistral is a 7B model fine-tuned on Mistral with 900,000 entries of primarily GPT-4 generated data from open datasets.', pulls: 1790 },
'deepseek-llm': { description: 'An advanced language model crafted with 2 trillion bilingual tokens.', pulls: 1732, added: '20231129' },
'dolphin2.1-mistral': { description: 'An instruct-tuned model based on Mistral and trained on a dataset filtered to remove alignment and bias.', pulls: 1598 },
'mistrallite': { description: 'MistralLite is a fine-tuned model based on Mistral with enhanced capabilities for processing long contexts.', pulls: 1534 },
'wizardlm': { description: 'General use 70 billion parameter model based on Llama 2.', pulls: 1454 },
'codebooga': { description: 'A high-performing code instruct model created by merging two existing code models.', pulls: 1418 },
'phi': { description: 'Phi-2: a 2.7B language model by Microsoft Research that demonstrates outstanding reasoning and language understanding capabilities.', pulls: 1304, added: '20231220' },
'bakllava': { description: 'BakLLaVA is a multimodal model consisting of the Mistral 7B base model augmented with the LLaVA architecture.', pulls: 1189, added: '20231215' },
'goliath': { description: 'A language model created by combining two fine-tuned Llama 2 70B models into one.', pulls: 1140, added: '20231129' },
'nexusraven': { description: 'Nexus Raven is a 13B instruction tuned model for function calling tasks.', pulls: 1060 },
'solar': { description: 'A compact, yet powerful 10.7B large language model designed for single-turn conversation.', pulls: 934 },
'alfred': { description: 'A robust conversational model designed to be used for both chat and instruct use cases.', pulls: 902, added: '20231129' },
'xwinlm': { description: 'Conversational model based on Llama 2 that performs competitively on various benchmarks.', pulls: 868 },
};
// export const OLLAMA_LAST_UPDATE: string = '20231210';
export const OLLAMA_PREV_UPDATE: string = '20231129';
// export const OLLAMA_LAST_UPDATE: string = '20231220';
export const OLLAMA_PREV_UPDATE: string = '20231210';
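// A hedged sketch (this helper is illustrative, not part of the diff): since
// the 'added' stamps above are 'YYYYMMDD' strings, a lexicographic comparison
// with OLLAMA_PREV_UPDATE suffices to flag models introduced since the last refresh.
const isAddedSincePrevUpdate = (model: { added?: string }): boolean =>
  !!model.added && model.added > OLLAMA_PREV_UPDATE;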
@@ -1,15 +1,16 @@
import { z } from 'zod';
import { TRPCError } from '@trpc/server';
import { createTRPCRouter, publicProcedure } from '~/server/api/trpc.server';
import { env } from '~/server/env.mjs';
import { fetchJsonOrTRPCError, fetchTextOrTRPCError } from '~/server/api/trpc.serverutils';
import { LLM_IF_OAI_Chat } from '../../../store-llms';
import { LLM_IF_OAI_Chat } from '../../store-llms';
import { capitalizeFirstLetter } from '~/common/util/textUtils';
import { fixupHost, openAIChatGenerateOutputSchema, OpenAIHistorySchema, openAIHistorySchema, OpenAIModelSchema, openAIModelSchema } from '../openai/openai.router';
import { listModelsOutputSchema, ModelDescriptionSchema } from '../server.schemas';
import { listModelsOutputSchema, ModelDescriptionSchema } from '../llm.server.types';
import { OLLAMA_BASE_MODELS, OLLAMA_PREV_UPDATE } from './ollama.models';
import { WireOllamaChatCompletionInput, wireOllamaChunkedOutputSchema } from './ollama.wiretypes';
@@ -246,8 +247,17 @@ export const llmOllamaRouter = createTRPCRouter({
const wireGeneration = await ollamaPOST(access, ollamaChatCompletionPayload(model, history, false), OLLAMA_PATH_CHAT);
const generation = wireOllamaChunkedOutputSchema.parse(wireGeneration);
if ('error' in generation)
throw new TRPCError({
code: 'INTERNAL_SERVER_ERROR',
message: `Ollama chat-generation issue: ${generation.error}`,
});
if (!generation.message?.content)
throw new Error('Ollama chat generation (non-stream) issue: ' + JSON.stringify(wireGeneration));
throw new TRPCError({
code: 'INTERNAL_SERVER_ERROR',
message: `Ollama chat-generation API issue: ${JSON.stringify(wireGeneration)}`,
});
return {
role: 'assistant',
@@ -43,27 +43,34 @@ export type WireOllamaChatCompletionInput = z.infer<typeof wireOllamaChatComplet
/**
* Chat Completion or Generation APIs - Streaming Response
*/
export const wireOllamaChunkedOutputSchema = z.object({
model: z.string(),
// created_at: z.string(), // commented because unused
export const wireOllamaChunkedOutputSchema = z.union([
// Chat Completion Chunk
z.object({
model: z.string(),
// created_at: z.string(), // commented because unused
// [Chat Completion] (exclusive with 'response')
message: z.object({
role: z.enum(['assistant' /*, 'system', 'user' Disabled on purpose, to validate the response */]),
content: z.string(),
}).optional(), // optional on the last message
// [Chat Completion] (exclusive with 'response')
message: z.object({
role: z.enum(['assistant' /*, 'system', 'user' Disabled on purpose, to validate the response */]),
content: z.string(),
}).optional(), // optional on the last message
// [Generation] (non-chat, exclusive with 'message')
//response: z.string().optional(),
// [Generation] (non-chat, exclusive with 'message')
//response: z.string().optional(),
done: z.boolean(),
done: z.boolean(),
// only on the last message
// context: z.array(z.number()), // non-chat endpoint
// total_duration: z.number(),
// prompt_eval_count: z.number(),
// prompt_eval_duration: z.number(),
// eval_count: z.number(),
// eval_duration: z.number(),
// only on the last message
// context: z.array(z.number()), // non-chat endpoint
// total_duration: z.number(),
// prompt_eval_count: z.number(),
// prompt_eval_duration: z.number(),
// eval_count: z.number(),
// eval_duration: z.number(),
});
}),
// Possible Error
z.object({
error: z.string(),
}),
]);
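// A minimal consumer sketch, assuming NDJSON framing (one JSON object per
// line) on the streaming endpoint: the union above yields either a chunk with
// an optional 'message', or an error object.
function parseOllamaChunk(jsonLine: string): { text: string, done: boolean } {
  const chunk = wireOllamaChunkedOutputSchema.parse(JSON.parse(jsonLine));
  if ('error' in chunk)
    throw new Error(`Ollama streaming error: ${chunk.error}`);
  return { text: chunk.message?.content ?? '', done: chunk.done }; // the last chunk may omit 'message'
}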
@@ -0,0 +1,33 @@
import { z } from 'zod';
// [Mistral] Models List API - Response
export const wireMistralModelsListOutputSchema = z.object({
id: z.string(),
object: z.literal('model'),
created: z.number(),
owned_by: z.string(),
root: z.null().optional(),
parent: z.null().optional(),
// permission: z.array(wireMistralModelsListPermissionsSchema)
});
// export type WireMistralModelsListOutput = z.infer<typeof wireMistralModelsListOutputSchema>;
/*
const wireMistralModelsListPermissionsSchema = z.object({
id: z.string(),
object: z.literal('model_permission'),
created: z.number(),
allow_create_engine: z.boolean(),
allow_sampling: z.boolean(),
allow_logprobs: z.boolean(),
allow_search_indices: z.boolean(),
allow_view: z.boolean(),
allow_fine_tuning: z.boolean(),
organization: z.string(),
group: z.null().optional(),
is_blocking: z.boolean()
});
*/
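// A quick illustration (field values assumed, not taken from the Mistral API
// docs): a single entry of the models-list response validating against the
// schema above.
const exampleEntry = wireMistralModelsListOutputSchema.parse({
  id: 'mistral-tiny',
  object: 'model',
  created: 1702997596,
  owned_by: 'mistralai',
});
// exampleEntry.id === 'mistral-tiny'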
@@ -1,5 +1,9 @@
import type { ModelDescriptionSchema } from '../server.schemas';
import { LLM_IF_OAI_Chat, LLM_IF_OAI_Complete, LLM_IF_OAI_Fn, LLM_IF_OAI_Vision } from '../../../store-llms';
import { SERVER_DEBUG_WIRE } from '~/server/wire';
import { LLM_IF_OAI_Chat, LLM_IF_OAI_Complete, LLM_IF_OAI_Fn, LLM_IF_OAI_Vision } from '../../store-llms';
import type { ModelDescriptionSchema } from '../llm.server.types';
import { wireMistralModelsListOutputSchema } from './mistral.wiretypes';
// [Azure] / [OpenAI]
@@ -203,6 +207,63 @@ export function localAIModelToModelDescription(modelId: string): ModelDescriptio
}
// [Mistral]
const _knownMistralChatModels: ManualMappings = [
{
idPrefix: 'mistral-medium',
label: 'Mistral Medium',
description: 'Mistral internal prototype model.',
contextWindow: 32768,
interfaces: [LLM_IF_OAI_Chat],
},
{
idPrefix: 'mistral-small',
label: 'Mistral Small',
description: 'Higher reasoning capabilities, with coverage of English, French, German, Italian, Spanish, and code.',
contextWindow: 32768,
interfaces: [LLM_IF_OAI_Chat],
},
{
idPrefix: 'mistral-tiny',
label: 'Mistral Tiny',
description: 'Used for large batch processing tasks where cost is a significant factor but reasoning capabilities are not crucial.',
contextWindow: 32768,
interfaces: [LLM_IF_OAI_Chat],
},
{
idPrefix: 'mistral-embed',
label: 'Mistral Embed',
description: 'Mistral embedding model.',
// output: 1024 dimensions
maxCompletionTokens: 1024, // HACK - it's 1024 dimensions, but those are not 'completion tokens'
contextWindow: 32768, // actually unknown, assumed from the other models
interfaces: [],
hidden: true,
},
];
export function mistralModelToModelDescription(_model: unknown): ModelDescriptionSchema {
const model = wireMistralModelsListOutputSchema.parse(_model);
return fromManualMapping(_knownMistralChatModels, model.id, model.created, undefined, {
idPrefix: model.id,
label: model.id.replaceAll(/[_-]/g, ' '),
description: 'New Mistral Model',
contextWindow: 32768,
interfaces: [LLM_IF_OAI_Chat], // assume..
hidden: true,
});
}
export function mistralModelsSort(a: ModelDescriptionSchema, b: ModelDescriptionSchema): number {
if (a.hidden && !b.hidden)
return 1;
if (!a.hidden && b.hidden)
return -1;
return a.id.localeCompare(b.id);
}
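// A usage sketch, mirroring the 'mistral' case added to the OpenAI router
// later in this diff: parse each raw wire entry, then sort visible models
// ahead of hidden ones.
const describeAndSortMistral = (rawModels: unknown[]): ModelDescriptionSchema[] =>
  rawModels.map(mistralModelToModelDescription).sort(mistralModelsSort);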
// [Oobabooga]
const _knownOobaboogaChatModels: ManualMappings = [];
@@ -236,8 +297,8 @@ export function oobaboogaModelToModelDescription(modelId: string, created: numbe
/**
* Created to reflect the doc page: https://openrouter.ai/docs
*
* Update prompt:
* "Please update the typescript object below (do not change the definition, just the object), based on the updated upstream documentation:"
* Update prompt (last updated 2023-12-12)
* "Please update the following typescript object (do not change the definition, just values, and do not miss any rows), based on the information provided thereafter:"
*
* fields:
* - cw: context window size (max tokens, total)
@@ -247,19 +308,24 @@ export function oobaboogaModelToModelDescription(modelId: string, created: numbe
*/
const orModelMap: { [id: string]: { name: string; cw: number; cp?: number; cc?: number; old?: boolean; unfilt?: boolean; } } = {
// 'openrouter/auto': { name: 'Auto (best for prompt)', cw: 128000, cp: undefined, cc: undefined, unfilt: undefined },
'mistralai/mistral-7b-instruct': { name: 'Mistral 7B Instruct (beta)', cw: 8192, cp: 0, cc: 0, unfilt: true },
'huggingfaceh4/zephyr-7b-beta': { name: 'Hugging Face: Zephyr 7B (beta)', cw: 4096, cp: 0, cc: 0, unfilt: true },
'openchat/openchat-7b': { name: 'OpenChat 7B (beta)', cw: 8192, cp: 0, cc: 0, unfilt: true },
'undi95/toppy-m-7b': { name: 'Toppy M 7B (beta)', cw: 32768, cp: 0, cc: 0, unfilt: true },
'gryphe/mythomist-7b': { name: 'MythoMist 7B (beta)', cw: 32768, cp: 0, cc: 0, unfilt: true },
'nousresearch/nous-hermes-llama2-13b': { name: 'Nous: Hermes 13B (beta)', cw: 4096, cp: 0.000155, cc: 0.000155, unfilt: true },
'meta-llama/codellama-34b-instruct': { name: 'Meta: CodeLlama 34B Instruct (beta)', cw: 8192, cp: 0.00045, cc: 0.00045, unfilt: true },
'phind/phind-codellama-34b': { name: 'Phind: CodeLlama 34B v2 (beta)', cw: 4096, cp: 0.00045, cc: 0.00045, unfilt: true },
'intel/neural-chat-7b': { name: 'Neural Chat 7B v3.1 (beta)', cw: 32768, cp: 0.005, cc: 0.005, unfilt: true },
'haotian-liu/llava-13b': { name: 'Llava 13B (beta)', cw: 2048, cp: 0.005, cc: 0.005, unfilt: true },
'meta-llama/llama-2-13b-chat': { name: 'Meta: Llama v2 13B Chat (beta)', cw: 4096, cp: 0.000234533, cc: 0.000234533, unfilt: true },
'alpindale/goliath-120b': { name: 'Goliath 120B (beta)', cw: 6144, cp: 0.00703125, cc: 0.00703125, unfilt: true },
'lizpreciatior/lzlv-70b-fp16-hf': { name: 'lzlv 70B (beta)', cw: 4096, cp: 0.000562, cc: 0.000762, unfilt: true },
'nousresearch/nous-capybara-7b': { name: 'Nous: Capybara 7B', cw: 4096, cp: 0, cc: 0, unfilt: true },
'mistralai/mistral-7b-instruct': { name: 'Mistral 7B Instruct', cw: 8192, cp: 0, cc: 0, unfilt: true },
'huggingfaceh4/zephyr-7b-beta': { name: 'Hugging Face: Zephyr 7B', cw: 4096, cp: 0, cc: 0, unfilt: true },
'openchat/openchat-7b': { name: 'OpenChat 3.5', cw: 8192, cp: 0, cc: 0, unfilt: true },
'gryphe/mythomist-7b': { name: 'MythoMist 7B', cw: 32768, cp: 0, cc: 0, unfilt: true },
'openrouter/cinematika-7b': { name: 'Cinematika 7B (alpha)', cw: 32768, cp: 0, cc: 0, unfilt: true },
'rwkv/rwkv-5-world-3b': { name: 'RWKV v5 World 3B (beta)', cw: 10000, cp: 0, cc: 0, unfilt: true },
'recursal/rwkv-5-3b-ai-town': { name: 'RWKV v5 3B AI Town (beta)', cw: 10000, cp: 0, cc: 0, unfilt: true },
'jebcarter/psyfighter-13b': { name: 'Psyfighter 13B', cw: 4096, cp: 0.0001, cc: 0.0001, unfilt: true },
'koboldai/psyfighter-13b-2': { name: 'Psyfighter v2 13B', cw: 4096, cp: 0.0001, cc: 0.0001, unfilt: true },
'nousresearch/nous-hermes-llama2-13b': { name: 'Nous: Hermes 13B', cw: 4096, cp: 0.000075, cc: 0.000075, unfilt: true },
'meta-llama/codellama-34b-instruct': { name: 'Meta: CodeLlama 34B Instruct', cw: 8192, cp: 0.0002, cc: 0.0002, unfilt: true },
'phind/phind-codellama-34b': { name: 'Phind: CodeLlama 34B v2', cw: 4096, cp: 0.0002, cc: 0.0002, unfilt: true },
'intel/neural-chat-7b': { name: 'Neural Chat 7B v3.1', cw: 4096, cp: 0.0025, cc: 0.0025, unfilt: true },
'mistralai/mixtral-8x7b-instruct': { name: 'Mistral: Mixtral 8x7B Instruct (beta)', cw: 32768, cp: 0.0003, cc: 0.0003, unfilt: true },
'haotian-liu/llava-13b': { name: 'Llava 13B', cw: 2048, cp: 0.0025, cc: 0.0025, unfilt: true },
'nousresearch/nous-hermes-2-vision-7b': { name: 'Nous: Hermes 2 Vision 7B (alpha)', cw: 4096, cp: 0.0025, cc: 0.0025, unfilt: true },
'meta-llama/llama-2-13b-chat': { name: 'Meta: Llama v2 13B Chat', cw: 4096, cp: 0.000156755, cc: 0.000156755, unfilt: true },
'openai/gpt-3.5-turbo': { name: 'OpenAI: GPT-3.5 Turbo', cw: 4095, cp: 0.001, cc: 0.002, unfilt: false },
'openai/gpt-3.5-turbo-1106': { name: 'OpenAI: GPT-3.5 Turbo 16k (preview)', cw: 16385, cp: 0.001, cc: 0.002, unfilt: false },
'openai/gpt-3.5-turbo-16k': { name: 'OpenAI: GPT-3.5 Turbo 16k', cw: 16385, cp: 0.003, cc: 0.004, unfilt: false },
@@ -268,28 +334,44 @@ const orModelMap: { [id: string]: { name: string; cw: number; cp?: number; cc?:
'openai/gpt-4-32k': { name: 'OpenAI: GPT-4 32k', cw: 32767, cp: 0.06, cc: 0.12, unfilt: false },
'openai/gpt-4-vision-preview': { name: 'OpenAI: GPT-4 Vision (preview)', cw: 128000, cp: 0.01, cc: 0.03, unfilt: false },
'openai/gpt-3.5-turbo-instruct': { name: 'OpenAI: GPT-3.5 Turbo Instruct', cw: 4095, cp: 0.0015, cc: 0.002, unfilt: false },
'google/palm-2-chat-bison': { name: 'Google: PaLM 2 Chat', cw: 9216, cp: 0.0005, cc: 0.0005, unfilt: true },
'google/palm-2-codechat-bison': { name: 'Google: PaLM 2 Code Chat', cw: 7168, cp: 0.0005, cc: 0.0005, unfilt: true },
'google/palm-2-chat-bison-32k': { name: 'Google: PaLM 2 Chat 32k', cw: 32000, cp: 0.0005, cc: 0.0005, unfilt: true },
'google/palm-2-codechat-bison-32k': { name: 'Google: PaLM 2 Code Chat 32k', cw: 32000, cp: 0.0005, cc: 0.0005, unfilt: true },
'meta-llama/llama-2-70b-chat': { name: 'Meta: Llama v2 70B Chat (beta)', cw: 4096, cp: 0.0007, cc: 0.00095, unfilt: true },
'nousresearch/nous-hermes-llama2-70b': { name: 'Nous: Hermes 70B (beta)', cw: 4096, cp: 0.0009, cc: 0.0009, unfilt: true },
'nousresearch/nous-capybara-34b': { name: 'Nous: Capybara 34B (beta)', cw: 32000, cp: 0.02, cc: 0.02, unfilt: true },
'jondurbin/airoboros-l2-70b': { name: 'Airoboros 70B (beta)', cw: 4096, cp: 0.0007, cc: 0.00095, unfilt: true },
'migtissera/synthia-70b': { name: 'Synthia 70B (beta)', cw: 8192, cp: 0.009375, cc: 0.009375, unfilt: true },
'open-orca/mistral-7b-openorca': { name: 'Mistral OpenOrca 7B (beta)', cw: 8192, cp: 0.0002, cc: 0.0002, unfilt: true },
'teknium/openhermes-2-mistral-7b': { name: 'OpenHermes 2 Mistral 7B (beta)', cw: 4096, cp: 0.0002, cc: 0.0002, unfilt: true },
'teknium/openhermes-2.5-mistral-7b': { name: 'OpenHermes 2.5 Mistral 7B (beta)', cw: 4096, cp: 0.0002, cc: 0.0002, unfilt: true },
'pygmalionai/mythalion-13b': { name: 'Pygmalion: Mythalion 13B (beta)', cw: 8192, cp: 0.001125, cc: 0.001125, unfilt: true },
'undi95/remm-slerp-l2-13b': { name: 'ReMM SLERP 13B (beta)', cw: 6144, cp: 0.001125, cc: 0.001125, unfilt: true },
'xwin-lm/xwin-lm-70b': { name: 'Xwin 70B (beta)', cw: 8192, cp: 0.009375, cc: 0.009375, unfilt: true },
'gryphe/mythomax-l2-13b-8k': { name: 'MythoMax 13B 8k (beta)', cw: 8192, cp: 0.001125, cc: 0.001125, unfilt: true },
'neversleep/noromaid-20b': { name: 'Noromaid 20B (beta)', cw: 8192, cp: 0.00225, cc: 0.00225, unfilt: true },
'google/palm-2-chat-bison': { name: 'Google: PaLM 2 Chat', cw: 36864, cp: 0.00025, cc: 0.0005, unfilt: true },
'google/palm-2-codechat-bison': { name: 'Google: PaLM 2 Code Chat', cw: 28672, cp: 0.00025, cc: 0.0005, unfilt: true },
'google/palm-2-chat-bison-32k': { name: 'Google: PaLM 2 Chat 32k', cw: 131072, cp: 0.00025, cc: 0.0005, unfilt: true },
'google/palm-2-codechat-bison-32k': { name: 'Google: PaLM 2 Code Chat 32k', cw: 131072, cp: 0.00025, cc: 0.0005, unfilt: true },
'google/gemini-pro': { name: 'Google: Gemini Pro (preview)', cw: 131040, cp: 0.00025, cc: 0.0005, unfilt: true },
'google/gemini-pro-vision': { name: 'Google: Gemini Pro Vision (preview)', cw: 65536, cp: 0.00025, cc: 0.0005, unfilt: true },
'perplexity/pplx-70b-online': { name: 'Perplexity: PPLX 70B Online', cw: 4096, cp: 0, cc: 0.0028, unfilt: true },
'perplexity/pplx-7b-online': { name: 'Perplexity: PPLX 7B Online', cw: 4096, cp: 0, cc: 0.00028, unfilt: true },
'perplexity/pplx-7b-chat': { name: 'Perplexity: PPLX 7B Chat', cw: 8192, cp: 0.00007, cc: 0.00028, unfilt: true },
'perplexity/pplx-70b-chat': { name: 'Perplexity: PPLX 70B Chat', cw: 4096, cp: 0.0007, cc: 0.0028, unfilt: true },
'meta-llama/llama-2-70b-chat': { name: 'Meta: Llama v2 70B Chat', cw: 4096, cp: 0.0007, cc: 0.00095, unfilt: true },
'nousresearch/nous-hermes-llama2-70b': { name: 'Nous: Hermes 70B', cw: 4096, cp: 0.0009, cc: 0.0009, unfilt: true },
'nousresearch/nous-capybara-34b': { name: 'Nous: Capybara 34B', cw: 32000, cp: 0.0007, cc: 0.0028, unfilt: true },
'jondurbin/airoboros-l2-70b': { name: 'Airoboros 70B', cw: 4096, cp: 0.0007, cc: 0.00095, unfilt: true },
'migtissera/synthia-70b': { name: 'Synthia 70B', cw: 8192, cp: 0.00375, cc: 0.00375, unfilt: true },
'open-orca/mistral-7b-openorca': { name: 'Mistral OpenOrca 7B', cw: 8192, cp: 0.0001425006, cc: 0.0001425006, unfilt: true },
'teknium/openhermes-2-mistral-7b': { name: 'OpenHermes 2 Mistral 7B', cw: 4096, cp: 0.0002, cc: 0.0002, unfilt: true },
'teknium/openhermes-2.5-mistral-7b': { name: 'OpenHermes 2.5 Mistral 7B', cw: 4096, cp: 0.0002, cc: 0.0002, unfilt: true },
'pygmalionai/mythalion-13b': { name: 'Pygmalion: Mythalion 13B', cw: 8192, cp: 0.001125, cc: 0.001125, unfilt: true },
'undi95/remm-slerp-l2-13b': { name: 'ReMM SLERP 13B', cw: 6144, cp: 0.001125, cc: 0.001125, unfilt: true },
'xwin-lm/xwin-lm-70b': { name: 'Xwin 70B', cw: 8192, cp: 0.00375, cc: 0.00375, unfilt: true },
'gryphe/mythomax-l2-13b-8k': { name: 'MythoMax 13B 8k', cw: 8192, cp: 0.001125, cc: 0.001125, unfilt: true },
'undi95/toppy-m-7b': { name: 'Toppy M 7B', cw: 32768, cp: 0.000375, cc: 0.000375, unfilt: true },
'alpindale/goliath-120b': { name: 'Goliath 120B', cw: 6144, cp: 0.009375, cc: 0.009375, unfilt: true },
'lizpreciatior/lzlv-70b-fp16-hf': { name: 'lzlv 70B', cw: 4096, cp: 0.0007, cc: 0.00095, unfilt: true },
'neversleep/noromaid-20b': { name: 'Noromaid 20B', cw: 8192, cp: 0.00225, cc: 0.00225, unfilt: true },
'01-ai/yi-34b-chat': { name: 'Yi 34B Chat', cw: 4096, cp: 0.0008, cc: 0.0008, unfilt: true },
'01-ai/yi-34b': { name: 'Yi 34B (base)', cw: 4096, cp: 0.0008, cc: 0.0008, unfilt: true },
'01-ai/yi-6b': { name: 'Yi 6B (base)', cw: 4096, cp: 0.00014, cc: 0.00014, unfilt: true },
'togethercomputer/stripedhyena-nous-7b': { name: 'StripedHyena Nous 7B', cw: 32768, cp: 0.0002, cc: 0.0002, unfilt: true },
'togethercomputer/stripedhyena-hessian-7b': { name: 'StripedHyena Hessian 7B (base)', cw: 32768, cp: 0.0002, cc: 0.0002, unfilt: true },
'mistralai/mixtral-8x7b': { name: 'Mistral: Mixtral 8x7B (base) (beta)', cw: 32768, cp: 0.0006, cc: 0.0006, unfilt: true },
'anthropic/claude-2': { name: 'Anthropic: Claude v2.1', cw: 200000, cp: 0.008, cc: 0.024, unfilt: false },
'anthropic/claude-2.0': { name: 'Anthropic: Claude v2.0', cw: 100000, cp: 0.008, cc: 0.024, unfilt: false },
'anthropic/claude-instant-v1': { name: 'Anthropic: Claude Instant v1', cw: 100000, cp: 0.00163, cc: 0.00551, unfilt: false },
'mancer/weaver': { name: 'Mancer: Weaver (alpha)', cw: 8000, cp: 0.0045, cc: 0.0045, unfilt: true },
'mancer/weaver': { name: 'Mancer: Weaver (alpha)', cw: 8000, cp: 0.003375, cc: 0.003375, unfilt: true },
'gryphe/mythomax-l2-13b': { name: 'MythoMax 13B', cw: 4096, cp: 0.0006, cc: 0.0006, unfilt: true },
// Old models (maintained for reference)
'openai/gpt-3.5-turbo-0301': { name: 'OpenAI: GPT-3.5 Turbo (older v0301)', cw: 4095, cp: 0.001, cc: 0.002, old: true },
'openai/gpt-4-0314': { name: 'OpenAI: GPT-4 (older v0314)', cw: 8191, cp: 0.03, cc: 0.06, old: true },
'openai/gpt-4-32k-0314': { name: 'OpenAI: GPT-4 32k (older v0314)', cw: 32767, cp: 0.06, cc: 0.12, old: true },
@@ -301,7 +383,12 @@ const orModelMap: { [id: string]: { name: string; cw: number; cp?: number; cc?:
'anthropic/claude-instant-1.0': { name: 'Anthropic: Claude Instant (older v1)', cw: 9000, cp: 0.00163, cc: 0.00551, old: true },
};
const orModelFamilyOrder = ['mistralai/', 'huggingfaceh4/', 'undi95/', 'openchat/', 'anthropic/', 'google/', 'openai/', 'meta-llama/', 'phind/', 'openrouter/'];
const orModelFamilyOrder = [
// great models (picked by hand, they're free)
'mistralai/mistral-7b-instruct', 'nousresearch/nous-capybara-7b',
// great orgs
'huggingfaceh4/', 'openchat/', 'anthropic/', 'google/', 'mistralai/', 'openai/', 'meta-llama/', 'phind/',
];
export function openRouterModelFamilySortFn(a: { id: string }, b: { id: string }): number {
const aPrefixIndex = orModelFamilyOrder.findIndex(prefix => a.id.startsWith(prefix));
@@ -321,10 +408,10 @@ export function openRouterModelToModelDescription(modelId: string, created: numb
const orModel = orModelMap[modelId] ?? null;
let label = orModel?.name || modelId.replace('/', ' · ');
if (orModel?.cp === 0 && orModel?.cc === 0)
label += ' - 🎁 Free';
label += ' · 🎁'; // Free? Discounted?
// if (!orModel)
// console.log('openRouterModelToModelDescription: unknown model id:', modelId);
if (SERVER_DEBUG_WIRE && !orModel)
console.log(' - openRouterModelToModelDescription: non-mapped model id:', modelId);
// context: use the known size if available, otherwise fall back to the (undocumented) provided length, or fall back again to 4096
const contextWindow = orModel?.cw || context_length || 4096;
@@ -8,13 +8,13 @@ import { fetchJsonOrTRPCError } from '~/server/api/trpc.serverutils';
import { Brand } from '~/common/app.config';
import type { OpenAIWire } from './openai.wiretypes';
import { listModelsOutputSchema, ModelDescriptionSchema } from '../server.schemas';
import { localAIModelToModelDescription, oobaboogaModelToModelDescription, openAIModelToModelDescription, openRouterModelFamilySortFn, openRouterModelToModelDescription } from './models.data';
import { listModelsOutputSchema, ModelDescriptionSchema } from '../llm.server.types';
import { localAIModelToModelDescription, mistralModelsSort, mistralModelToModelDescription, oobaboogaModelToModelDescription, openAIModelToModelDescription, openRouterModelFamilySortFn, openRouterModelToModelDescription } from './models.data';
// Input Schemas
const openAIDialects = z.enum(['azure', 'localai', 'oobabooga', 'openai', 'openrouter']);
const openAIDialects = z.enum(['azure', 'localai', 'mistral', 'oobabooga', 'openai', 'openrouter']);
export const openAIAccessSchema = z.object({
dialect: openAIDialects,
@@ -186,12 +186,18 @@ export const llmOpenAIRouter = createTRPCRouter({
.map((model): ModelDescriptionSchema => openAIModelToModelDescription(model.id, model.created));
break;
case 'mistral':
models = openAIModels
.map(mistralModelToModelDescription)
.sort(mistralModelsSort);
break;
case 'openrouter':
models = openAIModels
.sort(openRouterModelFamilySortFn)
.map(model => openRouterModelToModelDescription(model.id, model.created, (model as any)?.['context_length']));
break;
}
return { models };
@@ -267,9 +273,10 @@ async function openaiPOST<TOut extends object, TPostBody extends object>(access:
}
const DEFAULT_HELICONE_OPENAI_HOST = 'oai.hconeai.com';
const DEFAULT_MISTRAL_HOST = 'https://api.mistral.ai';
const DEFAULT_OPENAI_HOST = 'api.openai.com';
const DEFAULT_OPENROUTER_HOST = 'https://openrouter.ai/api';
const DEFAULT_HELICONE_OPENAI_HOST = 'oai.hconeai.com';
export function fixupHost(host: string, apiPath: string): string {
if (!host.startsWith('http'))
@@ -361,6 +368,20 @@ export function openAIAccess(access: OpenAIAccessSchema, modelRefId: string | nu
};
case 'mistral':
// https://docs.mistral.ai/platform/client
const mistralKey = access.oaiKey || env.MISTRAL_API_KEY || '';
const mistralHost = fixupHost(access.oaiHost || DEFAULT_MISTRAL_HOST, apiPath);
return {
headers: {
'Content-Type': 'application/json',
'Accept': 'application/json',
'Authorization': `Bearer ${mistralKey}`,
},
url: mistralHost + apiPath,
};
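// A hedged trace of the branch above with defaults (the apiPath and key value
// are illustrative): without a user host override, the official endpoint is used.
const demoAccess = { dialect: 'mistral' as const, oaiKey: 'sk-demo', oaiOrg: '', oaiHost: '', heliKey: '', moderationCheck: false };
const { url } = openAIAccess(demoAccess, null, '/v1/models');
// url === 'https://api.mistral.ai/v1/models' (DEFAULT_MISTRAL_HOST + apiPath)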
case 'openrouter':
const orKey = access.oaiKey || env.OPENROUTER_API_KEY || '';
const orHost = fixupHost(access.oaiHost || DEFAULT_OPENROUTER_HOST, apiPath);
@@ -2,7 +2,8 @@ import { create } from 'zustand';
import { shallow } from 'zustand/shallow';
import { persist } from 'zustand/middleware';
import { ModelVendorId } from './vendors/IModelVendor';
import type { ModelVendorId } from './vendors/vendors.registry';
import type { SourceSetupOpenRouter } from './vendors/openrouter/openrouter.vendor';
/**
@@ -15,6 +16,7 @@ export interface DLLM<TSourceSetup = unknown, TLLMOptions = unknown> {
updated?: number | 0;
description: string;
tags: string[]; // UNUSED for now
// modelcaps: DModelCapability[];
contextTokens: number;
maxOutputTokens: number;
hidden: boolean;
@@ -29,6 +31,17 @@ export interface DLLM<TSourceSetup = unknown, TLLMOptions = unknown> {
export type DLLMId = string;
// export type DModelCapability =
// | 'input-text'
// | 'input-image-data'
// | 'input-multipart'
// | 'output-text'
// | 'output-function'
// | 'output-image-data'
// | 'if-chat'
// | 'if-fast-chat'
// ;
// Model interfaces (chat, and function calls) - here as a preview, will be used more broadly in the future
export const LLM_IF_OAI_Chat = 'oai-chat';
export const LLM_IF_OAI_Vision = 'oai-vision';
@@ -76,6 +89,9 @@ interface ModelsActions {
setChatLLMId: (id: DLLMId | null) => void;
setFastLLMId: (id: DLLMId | null) => void;
setFuncLLMId: (id: DLLMId | null) => void;
// special
setOpenRoutersKey: (key: string) => void;
}
type LlmsStore = ModelsData & ModelsActions;
@@ -162,13 +178,22 @@ export const useModelsStore = create<LlmsStore>()(
set(state => ({
sources: state.sources.map((source: DModelSource): DModelSource =>
source.id === id
? {
...source,
setup: { ...source.setup, ...partialSetup },
} : source,
? { ...source, setup: { ...source.setup, ...partialSetup } }
: source,
),
})),
setOpenRoutersKey: (key: string) =>
set(state => {
const openRouterSource = state.sources.find(source => source.vId === 'openrouter');
if (!openRouterSource) return state;
return {
sources: state.sources.map(source => source.id === openRouterSource.id
? { ...source, setup: { ...source.setup, oaiKey: key satisfies SourceSetupOpenRouter['oaiKey'] } }
: source),
};
}),
}),
{
name: 'app-models',
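// Usage sketch for the new action above (key value illustrative): install an
// OpenRouter API key from outside React, e.g. after an OAuth key exchange;
// it is a no-op when no OpenRouter source is configured.
useModelsStore.getState().setOpenRoutersKey('sk-or-v1-demo');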
@@ -256,24 +281,3 @@ export function useChatLLM() {
}, shallow);
}
/**
* Source-specific read/write - great time saver
*/
export function useSourceSetup<TSourceSetup, TAccess>(sourceId: DModelSourceId, getAccess: (partialSetup?: Partial<TSourceSetup>) => TAccess) {
// invalidate when the setup changes
const { updateSourceSetup, ...rest } = useModelsStore(state => {
const source: DModelSource<TSourceSetup> | null = state.sources.find(source => source.id === sourceId) ?? null;
const sourceLLMs = source ? state.llms.filter(llm => llm._source === source) : [];
return {
source,
sourceLLMs,
sourceHasLLMs: !!sourceLLMs.length,
access: getAccess(source?.setup),
updateSourceSetup: state.updateSourceSetup,
};
}, shallow);
// convenience function for this source
const updateSetup = (partialSetup: Partial<TSourceSetup>) => updateSourceSetup<TSourceSetup>(sourceId, partialSetup);
return { ...rest, updateSetup };
}
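// A hedged sketch of the relocated hook (imported below from '../useSourceSetup'):
// same shape as the removed code above, but it now takes the whole vendor and
// derives access via getTransportAccess. Details inferred from the call sites.
export function useSourceSetup<TSourceSetup, TAccess>(sourceId: DModelSourceId, vendor: IModelVendor<TSourceSetup, TAccess>) {
  const { updateSourceSetup, ...rest } = useModelsStore(state => {
    const source: DModelSource<TSourceSetup> | null = state.sources.find(source => source.id === sourceId) ?? null;
    const sourceLLMs = source ? state.llms.filter(llm => llm._source === source) : [];
    return {
      source,
      sourceLLMs,
      sourceHasLLMs: !!sourceLLMs.length,
      sourceSetupValid: (source?.setup && vendor.validateSetup) ? vendor.validateSetup(source.setup) : false,
      access: vendor.getTransportAccess(source?.setup),
      updateSourceSetup: state.updateSourceSetup,
    };
  }, shallow);
  // convenience function for this source
  const updateSetup = (partialSetup: Partial<TSourceSetup>) => updateSourceSetup<TSourceSetup>(sourceId, partialSetup);
  return { ...rest, updateSetup };
}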
@@ -1,34 +0,0 @@
import type { DLLMId } from '../store-llms';
import type { OpenAIWire } from './server/openai/openai.wiretypes';
import { findVendorForLlmOrThrow } from '../vendors/vendor.registry';
export interface VChatMessageIn {
role: 'assistant' | 'system' | 'user'; // | 'function';
content: string;
//name?: string; // when role: 'function'
}
export type VChatFunctionIn = OpenAIWire.ChatCompletion.RequestFunctionDef;
export interface VChatMessageOut {
role: 'assistant' | 'system' | 'user';
content: string;
finish_reason: 'stop' | 'length' | null;
}
export interface VChatMessageOrFunctionCallOut extends VChatMessageOut {
function_name: string;
function_arguments: object | null;
}
export async function callChatGenerate(llmId: DLLMId, messages: VChatMessageIn[], maxTokens?: number): Promise<VChatMessageOut> {
const { llm, vendor } = findVendorForLlmOrThrow(llmId);
return await vendor.callChatGenerate(llm, messages, maxTokens);
}
export async function callChatGenerateWithFunctions(llmId: DLLMId, messages: VChatMessageIn[], functions: VChatFunctionIn[], forceFunctionName: string | null, maxTokens?: number): Promise<VChatMessageOrFunctionCallOut> {
const { llm, vendor } = findVendorForLlmOrThrow(llmId);
return await vendor.callChatGenerateWF(llm, messages, functions, forceFunctionName, maxTokens);
}
@@ -1,18 +1,19 @@
import type React from 'react';
import type { TRPCClientErrorBase } from '@trpc/client';
import type { DLLM, DModelSourceId } from '../store-llms';
import { VChatFunctionIn, VChatMessageIn, VChatMessageOrFunctionCallOut, VChatMessageOut } from '../transports/chatGenerate';
import type { DLLM, DLLMId, DModelSourceId } from '../store-llms';
import type { ModelDescriptionSchema } from '../server/llm.server.types';
import type { ModelVendorId } from './vendors.registry';
import type { VChatFunctionIn, VChatMessageIn, VChatMessageOrFunctionCallOut, VChatMessageOut } from '~/modules/llms/llm.client';
export type ModelVendorId = 'anthropic' | 'azure' | 'localai' | 'ollama' | 'oobabooga' | 'openai' | 'openrouter';
export interface IModelVendor<TSourceSetup = unknown, TLLMOptions = unknown, TAccess = unknown, TDLLM = DLLM<TSourceSetup, TLLMOptions>> {
export interface IModelVendor<TSourceSetup = unknown, TAccess = unknown, TLLMOptions = unknown, TDLLM = DLLM<TSourceSetup, TLLMOptions>> {
readonly id: ModelVendorId;
readonly name: string;
readonly rank: number;
readonly location: 'local' | 'cloud';
readonly instanceLimit: number;
readonly hasFreeModels?: boolean;
readonly hasBackendCap?: () => boolean;
// components
@@ -20,12 +21,36 @@ export interface IModelVendor<TSourceSetup = unknown, TLLMOptions = unknown, TAc
readonly SourceSetupComponent: React.ComponentType<{ sourceId: DModelSourceId }>;
readonly LLMOptionsComponent: React.ComponentType<{ llm: TDLLM }>;
// functions
readonly initializeSetup?: () => TSourceSetup;
/// abstraction interface ///
getAccess(setup?: Partial<TSourceSetup>): TAccess;
initializeSetup?(): TSourceSetup;
callChatGenerate(llm: TDLLM, messages: VChatMessageIn[], maxTokens?: number): Promise<VChatMessageOut>;
validateSetup?(setup: TSourceSetup): boolean;
callChatGenerateWF(llm: TDLLM, messages: VChatMessageIn[], functions: null | VChatFunctionIn[], forceFunctionName: null | string, maxTokens?: number): Promise<VChatMessageOrFunctionCallOut>;
}
getTransportAccess(setup?: Partial<TSourceSetup>): TAccess;
rpcUpdateModelsQuery: (
access: TAccess,
enabled: boolean,
onSuccess: (data: { models: ModelDescriptionSchema[] }) => void,
) => { isFetching: boolean, refetch: () => void, isError: boolean, error: TRPCClientErrorBase<any> | null };
rpcChatGenerateOrThrow: (
access: TAccess,
llmOptions: TLLMOptions,
messages: VChatMessageIn[],
functions: VChatFunctionIn[] | null, forceFunctionName: string | null,
maxTokens?: number,
) => Promise<VChatMessageOut | VChatMessageOrFunctionCallOut>;
streamingChatGenerateOrThrow: (
access: TAccess,
llmId: DLLMId,
llmOptions: TLLMOptions,
messages: VChatMessageIn[],
functions: VChatFunctionIn[] | null, forceFunctionName: string | null,
abortSignal: AbortSignal,
onUpdate: (update: Partial<{ text: string, typing: boolean, originLLM: string }>, done: boolean) => void,
) => Promise<void>;
}
@@ -7,11 +7,11 @@ import { FormTextField } from '~/common/components/forms/FormTextField';
import { InlineError } from '~/common/components/InlineError';
import { Link } from '~/common/components/Link';
import { SetupFormRefetchButton } from '~/common/components/forms/SetupFormRefetchButton';
import { apiQuery } from '~/common/util/trpc.client';
import { useToggleableBoolean } from '~/common/util/useToggleableBoolean';
import { DModelSourceId, useModelsStore, useSourceSetup } from '../../store-llms';
import { modelDescriptionToDLLM } from '../openai/OpenAISourceSetup';
import { DModelSourceId } from '../../store-llms';
import { useLlmUpdateModels } from '../useLlmUpdateModels';
import { useSourceSetup } from '../useSourceSetup';
import { isValidAnthropicApiKey, ModelVendorAnthropic } from './anthropic.vendor';
@@ -23,7 +23,7 @@ export function AnthropicSourceSetup(props: { sourceId: DModelSourceId }) {
// external state
const { source, sourceHasLLMs, access, updateSetup } =
useSourceSetup(props.sourceId, ModelVendorAnthropic.getAccess);
useSourceSetup(props.sourceId, ModelVendorAnthropic);
// derived state
const { anthropicKey, anthropicHost, heliconeKey } = access;
@@ -34,14 +34,8 @@ export function AnthropicSourceSetup(props: { sourceId: DModelSourceId }) {
const shallFetchSucceed = anthropicKey ? keyValid : (!needsUserKey || !!anthropicHost);
// fetch models
const { isFetching, refetch, isError, error } = apiQuery.llmAnthropic.listModels.useQuery({ access }, {
enabled: !sourceHasLLMs && shallFetchSucceed,
onSuccess: models => source && useModelsStore.getState().setLLMs(
models.models.map(model => modelDescriptionToDLLM(model, source)),
props.sourceId,
),
staleTime: Infinity,
});
const { isFetching, refetch, isError, error } =
useLlmUpdateModels(ModelVendorAnthropic, access, !sourceHasLLMs && shallFetchSucceed, source);
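// A hedged sketch of useLlmUpdateModels (its source file is not shown in this
// diff): it funnels the query through the vendor's rpcUpdateModelsQuery and,
// on success, stores the parsed models, replacing the inlined useQuery blocks
// this changeset removes.
function useLlmUpdateModels<TAccess>(vendor: IModelVendor<unknown, TAccess>, access: TAccess, enabled: boolean, source: DModelSource | null) {
  return vendor.rpcUpdateModelsQuery(access, enabled, ({ models }) =>
    source && useModelsStore.getState().setLLMs(
      models.map(model => modelDescriptionToDLLM(model, source)), source.id));
}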
return <>
@@ -1,11 +1,12 @@
import { backendCaps } from '~/modules/backend/state-backend';
import { AnthropicIcon } from '~/common/components/icons/AnthropicIcon';
import { apiAsync } from '~/common/util/trpc.client';
import { apiAsync, apiQuery } from '~/common/util/trpc.client';
import type { AnthropicAccessSchema } from '../../server/anthropic/anthropic.router';
import type { IModelVendor } from '../IModelVendor';
import type { AnthropicAccessSchema } from '../../transports/server/anthropic/anthropic.router';
import type { VChatMessageIn, VChatMessageOrFunctionCallOut, VChatMessageOut } from '../../transports/chatGenerate';
import type { VChatMessageOut } from '../../llm.client';
import { unifiedStreamingClient } from '../unifiedStreamingClient';
import { LLMOptionsOpenAI } from '../openai/openai.vendor';
import { OpenAILLMOptions } from '../openai/OpenAILLMOptions';
@@ -14,7 +15,7 @@ import { AnthropicSourceSetup } from './AnthropicSourceSetup';
// special symbols
export const isValidAnthropicApiKey = (apiKey?: string) => !!apiKey && (apiKey.startsWith('sk-') ? apiKey.length > 40 : apiKey.length >= 40);
export const isValidAnthropicApiKey = (apiKey?: string) => !!apiKey && (apiKey.startsWith('sk-') ? apiKey.length >= 39 : apiKey.length >= 40);
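// Effect of the relaxed check above (illustrative value, not a real secret):
// a 39-character 'sk-' key now validates, where the previous '> 40' length
// requirement rejected it.
const ok39chars = isValidAnthropicApiKey('sk-' + 'a'.repeat(36)); // true: length 39 >= 39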
export interface SourceSetupAnthropic {
anthropicKey: string;
@@ -22,7 +23,7 @@ export interface SourceSetupAnthropic {
heliconeKey: string;
}
export const ModelVendorAnthropic: IModelVendor<SourceSetupAnthropic, LLMOptionsOpenAI, AnthropicAccessSchema> = {
export const ModelVendorAnthropic: IModelVendor<SourceSetupAnthropic, AnthropicAccessSchema, LLMOptionsOpenAI> = {
id: 'anthropic',
name: 'Anthropic',
rank: 13,
@@ -36,43 +37,48 @@ export const ModelVendorAnthropic: IModelVendor<SourceSetupAnthropic, LLMOptions
LLMOptionsComponent: OpenAILLMOptions,
// functions
getAccess: (partialSetup): AnthropicAccessSchema => ({
getTransportAccess: (partialSetup): AnthropicAccessSchema => ({
dialect: 'anthropic',
anthropicKey: partialSetup?.anthropicKey || '',
anthropicHost: partialSetup?.anthropicHost || null,
heliconeKey: partialSetup?.heliconeKey || null,
}),
callChatGenerate(llm, messages: VChatMessageIn[], maxTokens?: number): Promise<VChatMessageOut> {
return anthropicCallChatGenerate(this.getAccess(llm._source.setup), llm.options, messages, /*null, null,*/ maxTokens);
// List Models
rpcUpdateModelsQuery: (access, enabled, onSuccess) => {
return apiQuery.llmAnthropic.listModels.useQuery({ access }, {
enabled: enabled,
onSuccess: onSuccess,
refetchOnWindowFocus: false,
staleTime: Infinity,
});
},
callChatGenerateWF(): Promise<VChatMessageOrFunctionCallOut> {
throw new Error('Anthropic does not support "Functions" yet');
// Chat Generate (non-streaming) with Functions
rpcChatGenerateOrThrow: async (access, llmOptions, messages, functions, forceFunctionName, maxTokens) => {
if (functions?.length || forceFunctionName)
throw new Error('Anthropic does not support functions');
const { llmRef, llmTemperature = 0.5, llmResponseTokens } = llmOptions;
try {
return await apiAsync.llmAnthropic.chatGenerate.mutate({
access,
model: {
id: llmRef!,
temperature: llmTemperature,
maxTokens: maxTokens || llmResponseTokens || 1024,
},
history: messages,
}) as VChatMessageOut;
} catch (error: any) {
const errorMessage = error?.message || error?.toString() || 'Anthropic Chat Generate Error';
console.error(`anthropic.rpcChatGenerateOrThrow: ${errorMessage}`);
throw new Error(errorMessage);
}
},
// Chat Generate (streaming) with Functions
streamingChatGenerateOrThrow: unifiedStreamingClient,
};
/**
* This function either returns the LLM message, or function calls, or throws a descriptive error string
*/
async function anthropicCallChatGenerate<TOut = VChatMessageOut>(
access: AnthropicAccessSchema, llmOptions: Partial<LLMOptionsOpenAI>, messages: VChatMessageIn[],
// functions: VChatFunctionIn[] | null, forceFunctionName: string | null,
maxTokens?: number,
): Promise<TOut> {
const { llmRef, llmTemperature = 0.5, llmResponseTokens } = llmOptions;
try {
return await apiAsync.llmAnthropic.chatGenerate.mutate({
access,
model: {
id: llmRef!,
temperature: llmTemperature,
maxTokens: maxTokens || llmResponseTokens || 1024,
},
history: messages,
}) as TOut;
} catch (error: any) {
const errorMessage = error?.message || error?.toString() || 'Anthropic Chat Generate Error';
console.error(`anthropicCallChatGenerate: ${errorMessage}`);
throw new Error(errorMessage);
}
}
@@ -5,11 +5,11 @@ import { FormTextField } from '~/common/components/forms/FormTextField';
import { InlineError } from '~/common/components/InlineError';
import { Link } from '~/common/components/Link';
import { SetupFormRefetchButton } from '~/common/components/forms/SetupFormRefetchButton';
import { apiQuery } from '~/common/util/trpc.client';
import { asValidURL } from '~/common/util/urlUtils';
import { DModelSourceId, useModelsStore, useSourceSetup } from '../../store-llms';
import { modelDescriptionToDLLM } from '../openai/OpenAISourceSetup';
import { DModelSourceId } from '../../store-llms';
import { useLlmUpdateModels } from '../useLlmUpdateModels';
import { useSourceSetup } from '../useSourceSetup';
import { isValidAzureApiKey, ModelVendorAzure } from './azure.vendor';
@@ -18,7 +18,7 @@ export function AzureSourceSetup(props: { sourceId: DModelSourceId }) {
// external state
const { source, sourceHasLLMs, access, updateSetup } =
useSourceSetup(props.sourceId, ModelVendorAzure.getAccess);
useSourceSetup(props.sourceId, ModelVendorAzure);
// derived state
const { oaiKey: azureKey, oaiHost: azureEndpoint } = access;
@@ -31,14 +31,8 @@ export function AzureSourceSetup(props: { sourceId: DModelSourceId }) {
const shallFetchSucceed = azureKey ? keyValid : !needsUserKey;
// fetch models
const { isFetching, refetch, isError, error } = apiQuery.llmOpenAI.listModels.useQuery({ access }, {
enabled: !sourceHasLLMs && shallFetchSucceed,
onSuccess: models => source && useModelsStore.getState().setLLMs(
models.models.map(model => modelDescriptionToDLLM(model, source)),
props.sourceId,
),
staleTime: Infinity,
});
const { isFetching, refetch, isError, error } =
useLlmUpdateModels(ModelVendorAzure, access, !sourceHasLLMs && shallFetchSucceed, source);
return <>
@@ -3,10 +3,9 @@ import { backendCaps } from '~/modules/backend/state-backend';
import { AzureIcon } from '~/common/components/icons/AzureIcon';
import type { IModelVendor } from '../IModelVendor';
import type { OpenAIAccessSchema } from '../../transports/server/openai/openai.router';
import type { VChatFunctionIn, VChatMessageIn, VChatMessageOrFunctionCallOut, VChatMessageOut } from '../../transports/chatGenerate';
import type { OpenAIAccessSchema } from '../../server/openai/openai.router';
import { LLMOptionsOpenAI, openAICallChatGenerate } from '../openai/openai.vendor';
import { LLMOptionsOpenAI, ModelVendorOpenAI } from '../openai/openai.vendor';
import { OpenAILLMOptions } from '../openai/OpenAILLMOptions';
import { AzureSourceSetup } from './AzureSourceSetup';
@@ -36,7 +35,7 @@ export interface SourceSetupAzure {
*
* Work in progress...
*/
export const ModelVendorAzure: IModelVendor<SourceSetupAzure, LLMOptionsOpenAI, OpenAIAccessSchema> = {
export const ModelVendorAzure: IModelVendor<SourceSetupAzure, OpenAIAccessSchema, LLMOptionsOpenAI> = {
id: 'azure',
name: 'Azure',
rank: 14,
@@ -50,7 +49,7 @@ export const ModelVendorAzure: IModelVendor<SourceSetupAzure, LLMOptionsOpenAI,
LLMOptionsComponent: OpenAILLMOptions,
// functions
getAccess: (partialSetup): OpenAIAccessSchema => ({
getTransportAccess: (partialSetup): OpenAIAccessSchema => ({
dialect: 'azure',
oaiKey: partialSetup?.azureKey || '',
oaiOrg: '',
@@ -58,10 +57,9 @@ export const ModelVendorAzure: IModelVendor<SourceSetupAzure, LLMOptionsOpenAI,
heliKey: '',
moderationCheck: false,
}),
callChatGenerate(llm, messages: VChatMessageIn[], maxTokens?: number): Promise<VChatMessageOut> {
return openAICallChatGenerate(this.getAccess(llm._source.setup), llm.options, messages, null, null, maxTokens);
},
callChatGenerateWF(llm, messages: VChatMessageIn[], functions: VChatFunctionIn[] | null, forceFunctionName: string | null, maxTokens?: number): Promise<VChatMessageOrFunctionCallOut> {
return openAICallChatGenerate(this.getAccess(llm._source.setup), llm.options, messages, functions, forceFunctionName, maxTokens);
},
// OpenAI transport ('azure' dialect in 'access')
rpcUpdateModelsQuery: ModelVendorOpenAI.rpcUpdateModelsQuery,
rpcChatGenerateOrThrow: ModelVendorOpenAI.rpcChatGenerateOrThrow,
streamingChatGenerateOrThrow: ModelVendorOpenAI.streamingChatGenerateOrThrow,
};
@@ -0,0 +1,96 @@
import * as React from 'react';
import { FormControl, FormHelperText, Option, Select } from '@mui/joy';
import HealthAndSafetyIcon from '@mui/icons-material/HealthAndSafety';
import { FormInputKey } from '~/common/components/forms/FormInputKey';
import { FormLabelStart } from '~/common/components/forms/FormLabelStart';
import { InlineError } from '~/common/components/InlineError';
import { Link } from '~/common/components/Link';
import { SetupFormRefetchButton } from '~/common/components/forms/SetupFormRefetchButton';
import type { DModelSourceId } from '../../store-llms';
import type { GeminiBlockSafetyLevel } from '../../server/gemini/gemini.wiretypes';
import { useLlmUpdateModels } from '../useLlmUpdateModels';
import { useSourceSetup } from '../useSourceSetup';
import { ModelVendorGemini } from './gemini.vendor';
const GEMINI_API_KEY_LINK = 'https://makersuite.google.com/app/apikey';
const SAFETY_OPTIONS: { value: GeminiBlockSafetyLevel, label: string }[] = [
{ value: 'HARM_BLOCK_THRESHOLD_UNSPECIFIED', label: 'Default' },
{ value: 'BLOCK_LOW_AND_ABOVE', label: 'Low and above' },
{ value: 'BLOCK_MEDIUM_AND_ABOVE', label: 'Medium and above' },
{ value: 'BLOCK_ONLY_HIGH', label: 'Only high' },
{ value: 'BLOCK_NONE', label: 'None' },
];
export function GeminiSourceSetup(props: { sourceId: DModelSourceId }) {
// external state
const { source, sourceSetupValid, access, updateSetup } =
useSourceSetup(props.sourceId, ModelVendorGemini);
// derived state
const { geminiKey, minSafetyLevel } = access;
const needsUserKey = !ModelVendorGemini.hasBackendCap?.();
const shallFetchSucceed = !needsUserKey || (!!geminiKey && sourceSetupValid);
const showKeyError = !!geminiKey && !sourceSetupValid;
// fetch models
const { isFetching, refetch, isError, error } =
useLlmUpdateModels(ModelVendorGemini, access, shallFetchSucceed, source);
return <>
<FormInputKey
id='gemini-key' label='Gemini API Key'
rightLabel={<>{needsUserKey
? !geminiKey && <Link level='body-sm' href={GEMINI_API_KEY_LINK} target='_blank'>request Key</Link>
: '✔️ already set in server'}
</>}
value={geminiKey} onChange={value => updateSetup({ geminiKey: value.trim() })}
required={needsUserKey} isError={showKeyError}
placeholder='...'
/>
<FormControl orientation='horizontal' sx={{ justifyContent: 'space-between', alignItems: 'center' }}>
<FormLabelStart title='Safety Settings'
description='Threshold' />
<Select
variant='outlined'
value={minSafetyLevel} onChange={(_event, value) => value && updateSetup({ minSafetyLevel: value })}
startDecorator={<HealthAndSafetyIcon sx={{ display: { xs: 'none', sm: 'inherit' } }} />}
// indicator={<KeyboardArrowDownIcon />}
slotProps={{
root: { sx: { width: '100%' } },
indicator: { sx: { opacity: 0.5 } },
button: { sx: { whiteSpace: 'inherit' } },
}}
>
{SAFETY_OPTIONS.map(option => (
<Option key={'gemini-safety-' + option.value} value={option.value}>{option.label}</Option>
))}
</Select>
</FormControl>
<FormHelperText sx={{ display: 'block' }}>
Gemini has <Link href='https://ai.google.dev/docs/safety_setting_gemini' target='_blank' noLinkStyle>
adjustable safety settings</Link> on four categories: Harassment, Hate speech,
Sexually explicit, and Dangerous content, in addition to non-adjustable built-in filters.
By default, the model will block content with <em>medium and above</em> probability
of being unsafe.
</FormHelperText>
<SetupFormRefetchButton
refetch={refetch} disabled={!shallFetchSucceed || isFetching} error={isError}
/>
{isError && <InlineError error={error} />}
</>;
}
@@ -0,0 +1,97 @@
import GoogleIcon from '@mui/icons-material/Google';
import { backendCaps } from '~/modules/backend/state-backend';
import { apiAsync, apiQuery } from '~/common/util/trpc.client';
import type { GeminiAccessSchema } from '../../server/gemini/gemini.router';
import type { GeminiBlockSafetyLevel } from '../../server/gemini/gemini.wiretypes';
import type { IModelVendor } from '../IModelVendor';
import type { VChatMessageOut } from '../../llm.client';
import { unifiedStreamingClient } from '../unifiedStreamingClient';
import { OpenAILLMOptions } from '../openai/OpenAILLMOptions';
import { GeminiSourceSetup } from './GeminiSourceSetup';
export interface SourceSetupGemini {
geminiKey: string;
minSafetyLevel: GeminiBlockSafetyLevel;
}
export interface LLMOptionsGemini {
llmRef: string;
stopSequences: string[]; // up to 5 sequences that will stop generation (optional)
candidateCount: number; // 1...8 number of generated responses to return (optional)
maxOutputTokens: number; // if unset, this will default to outputTokenLimit (optional)
temperature: number; // 0...1 Controls the randomness of the output. (optional)
topP: number; // 0...1 The maximum cumulative probability of tokens to consider when sampling (optional)
topK: number; // 1...100 The maximum number of tokens to consider when sampling (optional)
}
export const ModelVendorGemini: IModelVendor<SourceSetupGemini, GeminiAccessSchema, LLMOptionsGemini> = {
id: 'googleai',
name: 'Gemini',
rank: 11,
location: 'cloud',
instanceLimit: 1,
hasBackendCap: () => backendCaps().hasLlmGemini,
// components
Icon: GoogleIcon,
SourceSetupComponent: GeminiSourceSetup,
LLMOptionsComponent: OpenAILLMOptions,
// functions
initializeSetup: () => ({
geminiKey: '',
minSafetyLevel: 'HARM_BLOCK_THRESHOLD_UNSPECIFIED',
}),
validateSetup: (setup) => {
return setup.geminiKey?.length > 0;
},
getTransportAccess: (partialSetup): GeminiAccessSchema => ({
dialect: 'gemini',
geminiKey: partialSetup?.geminiKey || '',
minSafetyLevel: partialSetup?.minSafetyLevel || 'HARM_BLOCK_THRESHOLD_UNSPECIFIED',
}),
// List Models
rpcUpdateModelsQuery: (access, enabled, onSuccess) => {
return apiQuery.llmGemini.listModels.useQuery({ access }, {
enabled: enabled,
onSuccess: onSuccess,
refetchOnWindowFocus: false,
staleTime: Infinity,
});
},
// Chat Generate (non-streaming) with Functions
rpcChatGenerateOrThrow: async (access, llmOptions, messages, functions, forceFunctionName, maxTokens) => {
if (functions?.length || forceFunctionName)
throw new Error('Gemini does not support functions');
const { llmRef, temperature = 0.5, maxOutputTokens } = llmOptions;
try {
return await apiAsync.llmGemini.chatGenerate.mutate({
access,
model: {
id: llmRef!,
temperature: temperature,
maxTokens: maxTokens || maxOutputTokens || 1024,
},
history: messages,
}) as VChatMessageOut;
} catch (error: any) {
const errorMessage = error?.message || error?.toString() || 'Gemini Chat Generate Error';
console.error(`gemini.rpcChatGenerateOrThrow: ${errorMessage}`);
throw new Error(errorMessage);
}
},
// Chat Generate (streaming) with Functions
streamingChatGenerateOrThrow: unifiedStreamingClient,
};
@@ -7,10 +7,10 @@ import { FormInputKey } from '~/common/components/forms/FormInputKey';
import { InlineError } from '~/common/components/InlineError';
import { Link } from '~/common/components/Link';
import { SetupFormRefetchButton } from '~/common/components/forms/SetupFormRefetchButton';
import { apiQuery } from '~/common/util/trpc.client';
import { DModelSourceId, useModelsStore, useSourceSetup } from '../../store-llms';
import { modelDescriptionToDLLM } from '../openai/OpenAISourceSetup';
import { DModelSourceId } from '../../store-llms';
import { useLlmUpdateModels } from '../useLlmUpdateModels';
import { useSourceSetup } from '../useSourceSetup';
import { ModelVendorLocalAI } from './localai.vendor';
@@ -19,7 +19,7 @@ export function LocalAISourceSetup(props: { sourceId: DModelSourceId }) {
// external state
const { source, access, updateSetup } =
- useSourceSetup(props.sourceId, ModelVendorLocalAI.getAccess);
+ useSourceSetup(props.sourceId, ModelVendorLocalAI);
// derived state
const { oaiHost } = access;
@@ -30,14 +30,8 @@ export function LocalAISourceSetup(props: { sourceId: DModelSourceId }) {
const shallFetchSucceed = isValidHost;
// fetch models - the OpenAI way
- const { isFetching, refetch, isError, error } = apiQuery.llmOpenAI.listModels.useQuery({ access }, {
- enabled: false, // !sourceHasLLMs && shallFetchSucceed,
- onSuccess: models => source && useModelsStore.getState().setLLMs(
- models.models.map(model => modelDescriptionToDLLM(model, source)),
- props.sourceId,
- ),
- staleTime: Infinity,
- });
+ const { isFetching, refetch, isError, error } =
+ useLlmUpdateModels(ModelVendorLocalAI, access, false /* !sourceHasLLMs && shallFetchSucceed */, source);
return <>
+10 -12
@@ -1,10 +1,9 @@
import DevicesIcon from '@mui/icons-material/Devices';
import type { IModelVendor } from '../IModelVendor';
- import type { OpenAIAccessSchema } from '../../transports/server/openai/openai.router';
- import type { VChatFunctionIn, VChatMessageIn, VChatMessageOrFunctionCallOut, VChatMessageOut } from '../../transports/chatGenerate';
+ import type { OpenAIAccessSchema } from '../../server/openai/openai.router';
- import { LLMOptionsOpenAI, openAICallChatGenerate } from '../openai/openai.vendor';
+ import { LLMOptionsOpenAI, ModelVendorOpenAI } from '../openai/openai.vendor';
import { OpenAILLMOptions } from '../openai/OpenAILLMOptions';
import { LocalAISourceSetup } from './LocalAISourceSetup';
@@ -14,7 +13,7 @@ export interface SourceSetupLocalAI {
oaiHost: string; // use OpenAI-compatible non-default hosts (full origin path)
}
- export const ModelVendorLocalAI: IModelVendor<SourceSetupLocalAI, LLMOptionsOpenAI, OpenAIAccessSchema> = {
+ export const ModelVendorLocalAI: IModelVendor<SourceSetupLocalAI, OpenAIAccessSchema, LLMOptionsOpenAI> = {
id: 'localai',
name: 'LocalAI',
rank: 20,
@@ -30,7 +29,7 @@ export const ModelVendorLocalAI: IModelVendor<SourceSetupLocalAI, LLMOptionsOpen
initializeSetup: () => ({
oaiHost: 'http://localhost:8080',
}),
- getAccess: (partialSetup) => ({
+ getTransportAccess: (partialSetup) => ({
dialect: 'localai',
oaiKey: '',
oaiOrg: '',
@@ -38,10 +37,9 @@ export const ModelVendorLocalAI: IModelVendor<SourceSetupLocalAI, LLMOptionsOpen
heliKey: '',
moderationCheck: false,
}),
- callChatGenerate(llm, messages: VChatMessageIn[], maxTokens?: number): Promise<VChatMessageOut> {
- return openAICallChatGenerate(this.getAccess(llm._source.setup), llm.options, messages, null, null, maxTokens);
- },
- callChatGenerateWF(llm, messages: VChatMessageIn[], functions: VChatFunctionIn[] | null, forceFunctionName: string | null, maxTokens?: number): Promise<VChatMessageOrFunctionCallOut> {
- return openAICallChatGenerate(this.getAccess(llm._source.setup), llm.options, messages, functions, forceFunctionName, maxTokens);
- },
- };
+ // OpenAI transport ('localai' dialect in 'access')
+ rpcUpdateModelsQuery: ModelVendorOpenAI.rpcUpdateModelsQuery,
+ rpcChatGenerateOrThrow: ModelVendorOpenAI.rpcChatGenerateOrThrow,
+ streamingChatGenerateOrThrow: ModelVendorOpenAI.streamingChatGenerateOrThrow,
+ };
+55
@@ -0,0 +1,55 @@
import * as React from 'react';
import { FormInputKey } from '~/common/components/forms/FormInputKey';
import { InlineError } from '~/common/components/InlineError';
import { Link } from '~/common/components/Link';
import { SetupFormRefetchButton } from '~/common/components/forms/SetupFormRefetchButton';
import { DModelSourceId } from '../../store-llms';
import { useLlmUpdateModels } from '../useLlmUpdateModels';
import { useSourceSetup } from '../useSourceSetup';
import { ModelVendorMistral } from './mistral.vendor';
const MISTRAL_REG_LINK = 'https://console.mistral.ai/';
export function MistralSourceSetup(props: { sourceId: DModelSourceId }) {
// external state
const { source, sourceSetupValid, access, updateSetup } =
useSourceSetup(props.sourceId, ModelVendorMistral);
// derived state
const { oaiKey: mistralKey } = access;
const needsUserKey = !ModelVendorMistral.hasBackendCap?.();
const shallFetchSucceed = !needsUserKey || (!!mistralKey && sourceSetupValid);
const showKeyError = !!mistralKey && !sourceSetupValid;
// fetch models
const { isFetching, refetch, isError, error } =
useLlmUpdateModels(ModelVendorMistral, access, shallFetchSucceed, source);
return <>
<FormInputKey
id='mistral-key' label='Mistral Key'
rightLabel={<>{needsUserKey
? !mistralKey && <Link level='body-sm' href={MISTRAL_REG_LINK} target='_blank'>request Key</Link>
: '✔️ already set in server'}
</>}
value={mistralKey} onChange={value => updateSetup({ oaiKey: value })}
required={needsUserKey} isError={showKeyError}
placeholder='...'
/>
<SetupFormRefetchButton
refetch={refetch} disabled={/*!shallFetchSucceed ||*/ isFetching} error={isError}
/>
{isError && <InlineError error={error} />}
</>;
}
+55
@@ -0,0 +1,55 @@
import { backendCaps } from '~/modules/backend/state-backend';
import { MistralIcon } from '~/common/components/icons/MistralIcon';
import type { IModelVendor } from '../IModelVendor';
import type { OpenAIAccessSchema } from '../../server/openai/openai.router';
import { LLMOptionsOpenAI, ModelVendorOpenAI, SourceSetupOpenAI } from '../openai/openai.vendor';
import { OpenAILLMOptions } from '../openai/OpenAILLMOptions';
import { MistralSourceSetup } from './MistralSourceSetup';
// special symbols
export type SourceSetupMistral = Pick<SourceSetupOpenAI, 'oaiKey' | 'oaiHost'>;
/** Implementation Notes for the Mistral vendor
*/
export const ModelVendorMistral: IModelVendor<SourceSetupMistral, OpenAIAccessSchema, LLMOptionsOpenAI> = {
id: 'mistral',
name: 'Mistral',
rank: 15,
location: 'cloud',
instanceLimit: 1,
hasBackendCap: () => backendCaps().hasLlmMistral,
// components
Icon: MistralIcon,
SourceSetupComponent: MistralSourceSetup,
LLMOptionsComponent: OpenAILLMOptions,
// functions
initializeSetup: () => ({
oaiHost: 'https://api.mistral.ai/',
oaiKey: '',
}),
validateSetup: (setup) => {
return setup.oaiKey?.length >= 32;
},
getTransportAccess: (partialSetup): OpenAIAccessSchema => ({
dialect: 'mistral',
oaiKey: partialSetup?.oaiKey || '',
oaiOrg: '',
oaiHost: partialSetup?.oaiHost || '',
heliKey: '',
moderationCheck: false,
}),
// OpenAI transport ('mistral' dialect in 'access')
rpcUpdateModelsQuery: ModelVendorOpenAI.rpcUpdateModelsQuery,
rpcChatGenerateOrThrow: ModelVendorOpenAI.rpcChatGenerateOrThrow,
streamingChatGenerateOrThrow: ModelVendorOpenAI.streamingChatGenerateOrThrow,
};
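Note how little Mistral adds: only the 'mistral' dialect in the access object distinguishes it, and the server-side OpenAI router branches on that. A hedged illustration of the reuse (key values are placeholders):
// Illustrative only: two OpenAI-compatible vendors sharing one client transport.
const mistralAccess = ModelVendorMistral.getTransportAccess({ oaiKey: '...', oaiHost: 'https://api.mistral.ai/' });
// -> { dialect: 'mistral', ... } served by the llmOpenAI.* procedures
const localAccess = ModelVendorLocalAI.getTransportAccess({ oaiHost: 'http://localhost:8080' });
// -> { dialect: 'localai', ... } same procedures, different upstream host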
+3 -3
@@ -12,7 +12,7 @@ import { Link } from '~/common/components/Link';
import { apiQuery } from '~/common/util/trpc.client';
import { settingsGap } from '~/common/app.theme';
- import type { OllamaAccessSchema } from '../../transports/server/ollama/ollama.router';
+ import type { OllamaAccessSchema } from '../../server/ollama/ollama.router';
export function OllamaAdministration(props: { access: OllamaAccessSchema, onClose: () => void }) {
@@ -68,7 +68,7 @@ export function OllamaAdministration(props: { access: OllamaAccessSchema, onClos
>
{pullable.map(p =>
<Option key={p.id} value={p.id}>
- {p.isNew === true && <Chip size='sm' variant='outlined'>NEW</Chip>} {p.label}{sortByPulls && ` (${p.pulls.toLocaleString()})`}
+ {p.isNew === true && <Chip size='sm' variant='solid'>NEW</Chip>} {p.label}{sortByPulls && ` (${p.pulls.toLocaleString()})`}
</Option>,
)}
</Select>
@@ -118,7 +118,7 @@ export function OllamaAdministration(props: { access: OllamaAccessSchema, onClos
{pullModelDescription}
</Typography>
- <Box sx={{ display: 'flex', flexWrap: 1, gap: 1 }}>
+ <Box sx={{ display: 'flex', flexWrap: 1, gap: 1, alignItems: 'start' }}>
<Button
variant='outlined'
color={deleteStatus === 'error' ? 'danger' : deleteStatus === 'success' ? 'success' : 'primary'}
+7 -12
@@ -6,13 +6,14 @@ import { FormTextField } from '~/common/components/forms/FormTextField';
import { InlineError } from '~/common/components/InlineError';
import { Link } from '~/common/components/Link';
import { SetupFormRefetchButton } from '~/common/components/forms/SetupFormRefetchButton';
- import { apiQuery } from '~/common/util/trpc.client';
import { asValidURL } from '~/common/util/urlUtils';
- import { DModelSourceId, useModelsStore, useSourceSetup } from '../../store-llms';
+ import { DModelSourceId } from '../../store-llms';
+ import { useLlmUpdateModels } from '../useLlmUpdateModels';
+ import { useSourceSetup } from '../useSourceSetup';
import { ModelVendorOllama } from './ollama.vendor';
import { OllamaAdministration } from './OllamaAdministration';
- import { modelDescriptionToDLLM } from '../openai/OpenAISourceSetup';
export function OllamaSourceSetup(props: { sourceId: DModelSourceId }) {
@@ -22,7 +23,7 @@ export function OllamaSourceSetup(props: { sourceId: DModelSourceId }) {
// external state
const { source, access, updateSetup } =
- useSourceSetup(props.sourceId, ModelVendorOllama.getAccess);
+ useSourceSetup(props.sourceId, ModelVendorOllama);
// derived state
const { ollamaHost } = access;
@@ -32,14 +33,8 @@ export function OllamaSourceSetup(props: { sourceId: DModelSourceId }) {
const shallFetchSucceed = !hostError;
// fetch models
- const { isFetching, refetch, isError, error } = apiQuery.llmOllama.listModels.useQuery({ access }, {
- enabled: false, // !sourceHasLLMs && shallFetchSucceed,
- onSuccess: models => source && useModelsStore.getState().setLLMs(
- models.models.map(model => modelDescriptionToDLLM(model, source)),
- props.sourceId,
- ),
- staleTime: Infinity,
- });
+ const { isFetching, refetch, isError, error } =
+ useLlmUpdateModels(ModelVendorOllama, access, false /* !sourceHasLLMs && shallFetchSucceed */, source);
return <>
+42 -36
@@ -1,13 +1,14 @@
import { backendCaps } from '~/modules/backend/state-backend';
import { OllamaIcon } from '~/common/components/icons/OllamaIcon';
- import { apiAsync } from '~/common/util/trpc.client';
+ import { apiAsync, apiQuery } from '~/common/util/trpc.client';
import type { IModelVendor } from '../IModelVendor';
- import type { VChatMessageIn, VChatMessageOrFunctionCallOut, VChatMessageOut } from '../../transports/chatGenerate';
- import type { OllamaAccessSchema } from '../../transports/server/ollama/ollama.router';
+ import type { OllamaAccessSchema } from '../../server/ollama/ollama.router';
+ import type { VChatMessageOut } from '../../llm.client';
+ import { unifiedStreamingClient } from '../unifiedStreamingClient';
- import { LLMOptionsOpenAI } from '../openai/openai.vendor';
+ import type { LLMOptionsOpenAI } from '../openai/openai.vendor';
import { OpenAILLMOptions } from '../openai/OpenAILLMOptions';
import { OllamaSourceSetup } from './OllamaSourceSetup';
@@ -18,7 +19,7 @@ export interface SourceSetupOllama {
}
- export const ModelVendorOllama: IModelVendor<SourceSetupOllama, LLMOptionsOpenAI, OllamaAccessSchema> = {
+ export const ModelVendorOllama: IModelVendor<SourceSetupOllama, OllamaAccessSchema, LLMOptionsOpenAI> = {
id: 'ollama',
name: 'Ollama',
rank: 22,
@@ -32,40 +33,45 @@ export const ModelVendorOllama: IModelVendor<SourceSetupOllama, LLMOptionsOpenAI
LLMOptionsComponent: OpenAILLMOptions,
// functions
- getAccess: (partialSetup): OllamaAccessSchema => ({
+ getTransportAccess: (partialSetup): OllamaAccessSchema => ({
dialect: 'ollama',
ollamaHost: partialSetup?.ollamaHost || '',
}),
- callChatGenerate(llm, messages: VChatMessageIn[], maxTokens?: number): Promise<VChatMessageOut> {
- return ollamaCallChatGenerate(this.getAccess(llm._source.setup), llm.options, messages, maxTokens);
+ // List Models
+ rpcUpdateModelsQuery: (access, enabled, onSuccess) => {
+ return apiQuery.llmOllama.listModels.useQuery({ access }, {
+ enabled: enabled,
+ onSuccess: onSuccess,
+ refetchOnWindowFocus: false,
+ staleTime: Infinity,
+ });
},
- callChatGenerateWF(): Promise<VChatMessageOrFunctionCallOut> {
- throw new Error('Ollama does not support "Functions" yet');
+ // Chat Generate (non-streaming) with Functions
+ rpcChatGenerateOrThrow: async (access, llmOptions, messages, functions, forceFunctionName, maxTokens) => {
+ if (functions?.length || forceFunctionName)
+ throw new Error('Ollama does not support functions');
+ const { llmRef, llmTemperature = 0.5, llmResponseTokens } = llmOptions;
+ try {
+ return await apiAsync.llmOllama.chatGenerate.mutate({
+ access,
+ model: {
+ id: llmRef!,
+ temperature: llmTemperature,
+ maxTokens: maxTokens || llmResponseTokens || 1024,
+ },
+ history: messages,
+ }) as VChatMessageOut;
+ } catch (error: any) {
+ const errorMessage = error?.message || error?.toString() || 'Ollama Chat Generate Error';
+ console.error(`ollama.rpcChatGenerateOrThrow: ${errorMessage}`);
+ throw new Error(errorMessage);
+ }
},
+ // Chat Generate (streaming) with Functions
+ streamingChatGenerateOrThrow: unifiedStreamingClient,
};
- /**
- * This function either returns the LLM message, or throws a descriptive error string
- */
- async function ollamaCallChatGenerate<TOut = VChatMessageOut>(
- access: OllamaAccessSchema, llmOptions: Partial<LLMOptionsOpenAI>, messages: VChatMessageIn[],
- maxTokens?: number,
- ): Promise<TOut> {
- const { llmRef, llmTemperature = 0.5, llmResponseTokens } = llmOptions;
- try {
- return await apiAsync.llmOllama.chatGenerate.mutate({
- access,
- model: {
- id: llmRef!,
- temperature: llmTemperature,
- maxTokens: maxTokens || llmResponseTokens || 1024,
- },
- history: messages,
- }) as TOut;
- } catch (error: any) {
- const errorMessage = error?.message || error?.toString() || 'Ollama Chat Generate Error';
- console.error(`ollamaCallChatGenerate: ${errorMessage}`);
- throw new Error(errorMessage);
- }
- }
+6 -12
@@ -6,10 +6,10 @@ import { FormTextField } from '~/common/components/forms/FormTextField';
import { InlineError } from '~/common/components/InlineError';
import { Link } from '~/common/components/Link';
import { SetupFormRefetchButton } from '~/common/components/forms/SetupFormRefetchButton';
- import { apiQuery } from '~/common/util/trpc.client';
- import { DModelSourceId, useModelsStore, useSourceSetup } from '../../store-llms';
- import { modelDescriptionToDLLM } from '../openai/OpenAISourceSetup';
+ import { DModelSourceId } from '../../store-llms';
+ import { useLlmUpdateModels } from '../useLlmUpdateModels';
+ import { useSourceSetup } from '../useSourceSetup';
import { ModelVendorOoobabooga } from './oobabooga.vendor';
@@ -18,20 +18,14 @@ export function OobaboogaSourceSetup(props: { sourceId: DModelSourceId }) {
// external state
const { source, sourceHasLLMs, access, updateSetup } =
- useSourceSetup(props.sourceId, ModelVendorOoobabooga.getAccess);
+ useSourceSetup(props.sourceId, ModelVendorOoobabooga);
// derived state
const { oaiHost } = access;
// fetch models
- const { isFetching, refetch, isError, error } = apiQuery.llmOpenAI.listModels.useQuery({ access }, {
- enabled: false, // !hasModels && !!asValidURL(normSetup.oaiHost),
- onSuccess: models => source && useModelsStore.getState().setLLMs(
- models.models.map(model => modelDescriptionToDLLM(model, source)),
- props.sourceId,
- ),
- staleTime: Infinity,
- });
+ const { isFetching, refetch, isError, error } =
+ useLlmUpdateModels(ModelVendorOoobabooga, access, false /* !hasModels && !!asValidURL(normSetup.oaiHost) */, source);
return <>
+9 -11
@@ -1,10 +1,9 @@
import { OobaboogaIcon } from '~/common/components/icons/OobaboogaIcon';
import type { IModelVendor } from '../IModelVendor';
- import type { OpenAIAccessSchema } from '../../transports/server/openai/openai.router';
- import type { VChatFunctionIn, VChatMessageIn, VChatMessageOrFunctionCallOut, VChatMessageOut } from '../../transports/chatGenerate';
+ import type { OpenAIAccessSchema } from '../../server/openai/openai.router';
- import { LLMOptionsOpenAI, openAICallChatGenerate } from '../openai/openai.vendor';
+ import { LLMOptionsOpenAI, ModelVendorOpenAI } from '../openai/openai.vendor';
import { OpenAILLMOptions } from '../openai/OpenAILLMOptions';
import { OobaboogaSourceSetup } from './OobaboogaSourceSetup';
@@ -14,7 +13,7 @@ export interface SourceSetupOobabooga {
oaiHost: string; // use OpenAI-compatible non-default hosts (full origin path)
}
- export const ModelVendorOoobabooga: IModelVendor<SourceSetupOobabooga, LLMOptionsOpenAI, OpenAIAccessSchema> = {
+ export const ModelVendorOoobabooga: IModelVendor<SourceSetupOobabooga, OpenAIAccessSchema, LLMOptionsOpenAI> = {
id: 'oobabooga',
name: 'Oobabooga',
rank: 25,
@@ -30,7 +29,7 @@ export const ModelVendorOoobabooga: IModelVendor<SourceSetupOobabooga, LLMOption
initializeSetup: (): SourceSetupOobabooga => ({
oaiHost: 'http://127.0.0.1:5000',
}),
- getAccess: (partialSetup): OpenAIAccessSchema => ({
+ getTransportAccess: (partialSetup): OpenAIAccessSchema => ({
dialect: 'oobabooga',
oaiKey: '',
oaiOrg: '',
@@ -38,10 +37,9 @@ export const ModelVendorOoobabooga: IModelVendor<SourceSetupOobabooga, LLMOption
heliKey: '',
moderationCheck: false,
}),
- callChatGenerate(llm, messages: VChatMessageIn[], maxTokens?: number): Promise<VChatMessageOut> {
- return openAICallChatGenerate(this.getAccess(llm._source.setup), llm.options, messages, null, null, maxTokens);
- },
- callChatGenerateWF(llm, messages: VChatMessageIn[], functions: VChatFunctionIn[] | null, forceFunctionName: string | null, maxTokens?: number): Promise<VChatMessageOrFunctionCallOut> {
- return openAICallChatGenerate(this.getAccess(llm._source.setup), llm.options, messages, functions, forceFunctionName, maxTokens);
- },
+ // OpenAI transport (oobabooga dialect in 'access')
+ rpcUpdateModelsQuery: ModelVendorOpenAI.rpcUpdateModelsQuery,
+ rpcChatGenerateOrThrow: ModelVendorOpenAI.rpcChatGenerateOrThrow,
+ streamingChatGenerateOrThrow: ModelVendorOpenAI.streamingChatGenerateOrThrow,
};
+7 -41
@@ -9,13 +9,13 @@ import { FormTextField } from '~/common/components/forms/FormTextField';
import { InlineError } from '~/common/components/InlineError';
import { Link } from '~/common/components/Link';
import { SetupFormRefetchButton } from '~/common/components/forms/SetupFormRefetchButton';
- import { apiQuery } from '~/common/util/trpc.client';
import { useToggleableBoolean } from '~/common/util/useToggleableBoolean';
- import type { ModelDescriptionSchema } from '../../transports/server/server.schemas';
- import { DLLM, DModelSource, DModelSourceId, useModelsStore, useSourceSetup } from '../../store-llms';
+ import { DModelSourceId } from '../../store-llms';
+ import { useLlmUpdateModels } from '../useLlmUpdateModels';
+ import { useSourceSetup } from '../useSourceSetup';
- import { isValidOpenAIApiKey, LLMOptionsOpenAI, ModelVendorOpenAI } from './openai.vendor';
+ import { isValidOpenAIApiKey, ModelVendorOpenAI } from './openai.vendor';
// avoid repeating it all over
@@ -29,7 +29,7 @@ export function OpenAISourceSetup(props: { sourceId: DModelSourceId }) {
// external state
const { source, sourceHasLLMs, access, updateSetup } =
- useSourceSetup(props.sourceId, ModelVendorOpenAI.getAccess);
+ useSourceSetup(props.sourceId, ModelVendorOpenAI);
// derived state
const { oaiKey, oaiOrg, oaiHost, heliKey, moderationCheck } = access;
@@ -40,15 +40,8 @@ export function OpenAISourceSetup(props: { sourceId: DModelSourceId }) {
const shallFetchSucceed = oaiKey ? keyValid : !needsUserKey;
// fetch models
- const { isFetching, refetch, isError, error } = apiQuery.llmOpenAI.listModels.useQuery({ access }, {
- enabled: !sourceHasLLMs && shallFetchSucceed,
- onSuccess: models => source && useModelsStore.getState().setLLMs(
- models.models.map(model => modelDescriptionToDLLM(model, source)),
- props.sourceId,
- ),
- staleTime: Infinity,
- });
+ const { isFetching, refetch, isError, error } =
+ useLlmUpdateModels(ModelVendorOpenAI, access, !sourceHasLLMs && shallFetchSucceed, source);
return <>
@@ -110,30 +103,3 @@ export function OpenAISourceSetup(props: { sourceId: DModelSourceId }) {
</>;
}
- export function modelDescriptionToDLLM<TSourceSetup>(model: ModelDescriptionSchema, source: DModelSource<TSourceSetup>): DLLM<TSourceSetup, LLMOptionsOpenAI> {
- const maxOutputTokens = model.maxCompletionTokens || Math.round((model.contextWindow || 4096) / 2);
- const llmResponseTokens = Math.round(maxOutputTokens / (model.maxCompletionTokens ? 2 : 4));
- return {
- id: `${source.id}-${model.id}`,
- label: model.label,
- created: model.created || 0,
- updated: model.updated || 0,
- description: model.description,
- tags: [], // ['stream', 'chat'],
- contextTokens: model.contextWindow,
- maxOutputTokens: maxOutputTokens,
- hidden: !!model.hidden,
- sId: source.id,
- _source: source,
- options: {
- llmRef: model.id,
- llmTemperature: 0.5,
- llmResponseTokens: llmResponseTokens,
- },
- };
- }
+40 -40
@@ -1,11 +1,12 @@
import { backendCaps } from '~/modules/backend/state-backend';
import { OpenAIIcon } from '~/common/components/icons/OpenAIIcon';
- import { apiAsync } from '~/common/util/trpc.client';
+ import { apiAsync, apiQuery } from '~/common/util/trpc.client';
import type { IModelVendor } from '../IModelVendor';
- import type { OpenAIAccessSchema } from '../../transports/server/openai/openai.router';
- import type { VChatFunctionIn, VChatMessageIn, VChatMessageOrFunctionCallOut, VChatMessageOut } from '../../transports/chatGenerate';
+ import type { OpenAIAccessSchema } from '../../server/openai/openai.router';
+ import type { VChatMessageOrFunctionCallOut } from '../../llm.client';
+ import { unifiedStreamingClient } from '../unifiedStreamingClient';
import { OpenAILLMOptions } from './OpenAILLMOptions';
import { OpenAISourceSetup } from './OpenAISourceSetup';
@@ -28,7 +29,7 @@ export interface LLMOptionsOpenAI {
llmResponseTokens: number;
}
- export const ModelVendorOpenAI: IModelVendor<SourceSetupOpenAI, LLMOptionsOpenAI, OpenAIAccessSchema> = {
+ export const ModelVendorOpenAI: IModelVendor<SourceSetupOpenAI, OpenAIAccessSchema, LLMOptionsOpenAI> = {
id: 'openai',
name: 'OpenAI',
rank: 10,
@@ -42,7 +43,7 @@ export const ModelVendorOpenAI: IModelVendor<SourceSetupOpenAI, LLMOptionsOpenAI
LLMOptionsComponent: OpenAILLMOptions,
// functions
- getAccess: (partialSetup): OpenAIAccessSchema => ({
+ getTransportAccess: (partialSetup): OpenAIAccessSchema => ({
dialect: 'openai',
oaiKey: '',
oaiOrg: '',
@@ -51,41 +52,40 @@ export const ModelVendorOpenAI: IModelVendor<SourceSetupOpenAI, LLMOptionsOpenAI
moderationCheck: false,
...partialSetup,
}),
- callChatGenerate(llm, messages: VChatMessageIn[], maxTokens?: number): Promise<VChatMessageOut> {
- const access = this.getAccess(llm._source.setup);
- return openAICallChatGenerate(access, llm.options, messages, null, null, maxTokens);
+ // List Models
+ rpcUpdateModelsQuery: (access, enabled, onSuccess) => {
+ return apiQuery.llmOpenAI.listModels.useQuery({ access }, {
+ enabled: enabled,
+ onSuccess: onSuccess,
+ refetchOnWindowFocus: false,
+ staleTime: Infinity,
+ });
},
- callChatGenerateWF(llm, messages: VChatMessageIn[], functions: VChatFunctionIn[] | null, forceFunctionName: string | null, maxTokens?: number): Promise<VChatMessageOrFunctionCallOut> {
- const access = this.getAccess(llm._source.setup);
- return openAICallChatGenerate(access, llm.options, messages, functions, forceFunctionName, maxTokens);
+ // Chat Generate (non-streaming) with Functions
+ rpcChatGenerateOrThrow: async (access, llmOptions, messages, functions, forceFunctionName, maxTokens) => {
+ const { llmRef, llmTemperature = 0.5, llmResponseTokens } = llmOptions;
+ try {
+ return await apiAsync.llmOpenAI.chatGenerateWithFunctions.mutate({
+ access,
+ model: {
+ id: llmRef!,
+ temperature: llmTemperature,
+ maxTokens: maxTokens || llmResponseTokens || 1024,
+ },
+ functions: functions ?? undefined,
+ forceFunctionName: forceFunctionName ?? undefined,
+ history: messages,
+ }) as VChatMessageOrFunctionCallOut;
+ } catch (error: any) {
+ const errorMessage = error?.message || error?.toString() || 'OpenAI Chat Generate Error';
+ console.error(`openai.rpcChatGenerateOrThrow: ${errorMessage}`);
+ throw new Error(errorMessage);
+ }
},
+ // Chat Generate (streaming) with Functions
+ streamingChatGenerateOrThrow: unifiedStreamingClient,
};
- /**
- * This function either returns the LLM message, or function calls, or throws a descriptive error string
- */
- export async function openAICallChatGenerate<TOut = VChatMessageOut | VChatMessageOrFunctionCallOut>(
- access: OpenAIAccessSchema, llmOptions: Partial<LLMOptionsOpenAI>, messages: VChatMessageIn[],
- functions: VChatFunctionIn[] | null, forceFunctionName: string | null,
- maxTokens?: number,
- ): Promise<TOut> {
- const { llmRef, llmTemperature = 0.5, llmResponseTokens } = llmOptions;
- try {
- return await apiAsync.llmOpenAI.chatGenerateWithFunctions.mutate({
- access,
- model: {
- id: llmRef!,
- temperature: llmTemperature,
- maxTokens: maxTokens || llmResponseTokens || 1024,
- },
- functions: functions ?? undefined,
- forceFunctionName: forceFunctionName ?? undefined,
- history: messages,
- }) as TOut;
- } catch (error: any) {
- const errorMessage = error?.message || error?.toString() || 'OpenAI Chat Generate Error';
- console.error(`openAICallChatGenerate: ${errorMessage}`);
- throw new Error(errorMessage);
- }
- }
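For orientation, a hedged sketch of calling the new non-streaming entry point with a function available; get_current_weather is an invented name, and the VChatFunctionIn shape is assumed to mirror the OpenAI function-calling schema:
// Hypothetical call-site sketch, not part of this diff.
const result = await ModelVendorOpenAI.rpcChatGenerateOrThrow(
  access, // OpenAIAccessSchema from getTransportAccess()
  { llmRef: 'gpt-4', llmTemperature: 0.5, llmResponseTokens: 1024 },
  [{ role: 'user', content: 'What is the weather in Paris?' }],
  [{ // assumed VChatFunctionIn shape
    name: 'get_current_weather',
    description: 'Returns the current weather for a city',
    parameters: { type: 'object', properties: { city: { type: 'string' } } },
  }],
  null,      // do not force a particular function
  undefined, // no maxTokens override; llmResponseTokens applies
);
// result is a VChatMessageOrFunctionCallOut: assistant text or a function call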
+40 -21
@@ -1,15 +1,16 @@
import * as React from 'react';
- import { Typography } from '@mui/joy';
+ import { Button, Typography } from '@mui/joy';
import { FormInputKey } from '~/common/components/forms/FormInputKey';
import { InlineError } from '~/common/components/InlineError';
import { Link } from '~/common/components/Link';
import { SetupFormRefetchButton } from '~/common/components/forms/SetupFormRefetchButton';
- import { apiQuery } from '~/common/util/trpc.client';
import { getCallbackUrl } from '~/common/app.routes';
- import { DModelSourceId, useModelsStore, useSourceSetup } from '../../store-llms';
- import { modelDescriptionToDLLM } from '../openai/OpenAISourceSetup';
+ import { DModelSourceId } from '../../store-llms';
+ import { useLlmUpdateModels } from '../useLlmUpdateModels';
+ import { useSourceSetup } from '../useSourceSetup';
import { isValidOpenRouterKey, ModelVendorOpenRouter } from './openrouter.vendor';
@@ -18,7 +19,7 @@ export function OpenRouterSourceSetup(props: { sourceId: DModelSourceId }) {
// external state
const { source, sourceHasLLMs, access, updateSetup } =
- useSourceSetup(props.sourceId, ModelVendorOpenRouter.getAccess);
+ useSourceSetup(props.sourceId, ModelVendorOpenRouter);
// derived state
const { oaiKey } = access;
@@ -29,31 +30,33 @@ export function OpenRouterSourceSetup(props: { sourceId: DModelSourceId }) {
const shallFetchSucceed = oaiKey ? keyValid : !needsUserKey;
// fetch models
- const { isFetching, refetch, isError, error } = apiQuery.llmOpenAI.listModels.useQuery({ access }, {
- enabled: !sourceHasLLMs && shallFetchSucceed,
- onSuccess: models => source && useModelsStore.getState().setLLMs(
- models.models.map(model => modelDescriptionToDLLM(model, source)),
- props.sourceId,
- ),
- staleTime: Infinity,
- });
+ const { isFetching, refetch, isError, error } =
+ useLlmUpdateModels(ModelVendorOpenRouter, access, !sourceHasLLMs && shallFetchSucceed, source);
+ const handleOpenRouterLogin = () => {
+ // replace the current page with the OAuth page
+ const callbackUrl = getCallbackUrl('openrouter');
+ const oauthUrl = 'https://openrouter.ai/auth?callback_url=' + encodeURIComponent(callbackUrl);
+ window.open(oauthUrl, '_self');
+ // ...bye / see you soon at the callback location...
+ };
return <>
{/*<Box sx={{ display: 'flex', gap: 1, alignItems: 'center' }}>*/}
{/*<OpenRouterIcon />*/}
<Typography level='body-sm'>
- <Link href='https://openrouter.ai/keys' target='_blank'>OpenRouter</Link> is an independent, premium service
+ <Link href='https://openrouter.ai/keys' target='_blank'>OpenRouter</Link> is an independent service
granting access to <Link href='https://openrouter.ai/docs#models' target='_blank'>exclusive models</Link> such
- as GPT-4 32k, Claude, and more, typically unavailable to the public. <Link
- href='https://github.com/enricoros/big-agi/blob/main/docs/config-openrouter.md'>Configuration &amp; documentation</Link>.
+ as GPT-4 32k, Claude, and more. <Link
+ href='https://github.com/enricoros/big-agi/blob/main/docs/config-openrouter.md' target='_blank'>
+ Configuration &amp; documentation</Link>.
</Typography>
{/*</Box>*/}
<FormInputKey
id='openrouter-key' label='OpenRouter API Key'
rightLabel={<>{needsUserKey
- ? !oaiKey && <Link level='body-sm' href='https://openrouter.ai/keys' target='_blank'>create key</Link>
+ ? !oaiKey && <Link level='body-sm' href='https://openrouter.ai/keys' target='_blank'>your keys</Link>
: '✔️ already set in server'
} {oaiKey && keyValid && <Link level='body-sm' href='https://openrouter.ai/activity' target='_blank'>check usage</Link>}
</>}
@@ -62,7 +65,23 @@ export function OpenRouterSourceSetup(props: { sourceId: DModelSourceId }) {
placeholder='sk-or-...'
/>
- <SetupFormRefetchButton refetch={refetch} disabled={!shallFetchSucceed || isFetching} error={isError} />
+ <Typography level='body-sm'>
+ 🎁 A selection of <Link href='https://openrouter.ai/docs#models' target='_blank'>OpenRouter models</Link> are
+ made available without charge. You can get an API key by using the Login button below.
+ </Typography>
+ <SetupFormRefetchButton
+ refetch={refetch} disabled={!shallFetchSucceed || isFetching} error={isError}
+ leftButton={
+ <Button
+ color='neutral' variant={(needsUserKey && !keyValid) ? 'solid' : 'outlined'}
+ onClick={handleOpenRouterLogin}
+ endDecorator={(needsUserKey && !keyValid) ? '🎁' : undefined}
+ >
+ OpenRouter Login
+ </Button>
+ }
+ />
{isError && <InlineError error={error} />}
+10 -11
@@ -3,10 +3,9 @@ import { backendCaps } from '~/modules/backend/state-backend';
import { OpenRouterIcon } from '~/common/components/icons/OpenRouterIcon';
import type { IModelVendor } from '../IModelVendor';
- import type { OpenAIAccessSchema } from '../../transports/server/openai/openai.router';
- import type { VChatFunctionIn, VChatMessageIn, VChatMessageOrFunctionCallOut, VChatMessageOut } from '../../transports/chatGenerate';
+ import type { OpenAIAccessSchema } from '../../server/openai/openai.router';
- import { LLMOptionsOpenAI, openAICallChatGenerate } from '../openai/openai.vendor';
+ import { LLMOptionsOpenAI, ModelVendorOpenAI } from '../openai/openai.vendor';
import { OpenAILLMOptions } from '../openai/OpenAILLMOptions';
import { OpenRouterSourceSetup } from './OpenRouterSourceSetup';
@@ -32,12 +31,13 @@ export interface SourceSetupOpenRouter {
* [x] decide whether to do UI work to improve the appearance - prioritized models
* [x] works!
*/
- export const ModelVendorOpenRouter: IModelVendor<SourceSetupOpenRouter, LLMOptionsOpenAI, OpenAIAccessSchema> = {
+ export const ModelVendorOpenRouter: IModelVendor<SourceSetupOpenRouter, OpenAIAccessSchema, LLMOptionsOpenAI> = {
id: 'openrouter',
name: 'OpenRouter',
rank: 12,
location: 'cloud',
instanceLimit: 1,
+ hasFreeModels: true,
hasBackendCap: () => backendCaps().hasLlmOpenRouter,
// components
@@ -50,7 +50,7 @@ export const ModelVendorOpenRouter: IModelVendor<SourceSetupOpenRouter, LLMOptio
oaiHost: 'https://openrouter.ai/api',
oaiKey: '',
}),
- getAccess: (partialSetup): OpenAIAccessSchema => ({
+ getTransportAccess: (partialSetup): OpenAIAccessSchema => ({
dialect: 'openrouter',
oaiKey: partialSetup?.oaiKey || '',
oaiOrg: '',
@@ -58,10 +58,9 @@ export const ModelVendorOpenRouter: IModelVendor<SourceSetupOpenRouter, LLMOptio
heliKey: '',
moderationCheck: false,
}),
- callChatGenerate(llm, messages: VChatMessageIn[], maxTokens?: number): Promise<VChatMessageOut> {
- return openAICallChatGenerate(this.getAccess(llm._source.setup), llm.options, messages, null, null, maxTokens);
- },
- callChatGenerateWF(llm, messages: VChatMessageIn[], functions: VChatFunctionIn[] | null, forceFunctionName: string | null, maxTokens?: number): Promise<VChatMessageOrFunctionCallOut> {
- return openAICallChatGenerate(this.getAccess(llm._source.setup), llm.options, messages, functions, forceFunctionName, maxTokens);
- },
+ // OpenAI transport ('openrouter' dialect in 'access')
+ rpcUpdateModelsQuery: ModelVendorOpenAI.rpcUpdateModelsQuery,
+ rpcChatGenerateOrThrow: ModelVendorOpenAI.rpcChatGenerateOrThrow,
+ streamingChatGenerateOrThrow: ModelVendorOpenAI.streamingChatGenerateOrThrow,
};
@@ -1,11 +1,10 @@
import { apiAsync } from '~/common/util/trpc.client';
- import type { DLLM, DLLMId } from '../store-llms';
- import { findVendorForLlmOrThrow } from '../vendors/vendor.registry';
+ import type { ChatStreamingFirstOutputPacketSchema, ChatStreamingInputSchema } from '../server/llm.server.streaming';
+ import type { DLLMId } from '../store-llms';
+ import type { VChatFunctionIn, VChatMessageIn } from '../llm.client';
- import type { ChatStreamFirstPacketSchema, ChatStreamInputSchema } from './server/openai/openai.streaming';
- import type { OpenAIWire } from './server/openai/openai.wiretypes';
- import type { VChatMessageIn } from './chatGenerate';
+ import type { OpenAIWire } from '../server/openai/openai.wiretypes';
/**
@@ -15,27 +14,14 @@ import type { VChatMessageIn } from './chatGenerate';
* Vendor-specific implementation is on our server backend (API) code. This function tries to be
* as generic as possible.
*
- * @param llmId LLM to use
- * @param messages the history of messages to send to the API endpoint
- * @param abortSignal used to initiate a client-side abort of the fetch request to the API endpoint
- * @param onUpdate callback when a piece of a message (text, model name, typing..) is received
+ * NOTE: onUpdate is callback when a piece of a message (text, model name, typing..) is received
*/
- export async function streamChat(
+ export async function unifiedStreamingClient<TSourceSetup = unknown, TLLMOptions = unknown>(
+ access: ChatStreamingInputSchema['access'],
llmId: DLLMId,
+ llmOptions: TLLMOptions,
messages: VChatMessageIn[],
- abortSignal: AbortSignal,
- onUpdate: (update: Partial<{ text: string, typing: boolean, originLLM: string }>, done: boolean) => void,
- ): Promise<void> {
- const { llm, vendor } = findVendorForLlmOrThrow(llmId);
- const access = vendor.getAccess(llm._source.setup) as ChatStreamInputSchema['access'];
- return await vendorStreamChat(access, llm, messages, abortSignal, onUpdate);
- }
- async function vendorStreamChat<TSourceSetup = unknown, TLLMOptions = unknown>(
- access: ChatStreamInputSchema['access'],
- llm: DLLM<TSourceSetup, TLLMOptions>,
- messages: VChatMessageIn[],
+ functions: VChatFunctionIn[] | null, forceFunctionName: string | null,
abortSignal: AbortSignal,
onUpdate: (update: Partial<{ text: string, typing: boolean, originLLM: string }>, done: boolean) => void,
) {
@@ -79,12 +65,12 @@ async function vendorStreamChat<TSourceSetup = unknown, TLLMOptions = unknown>(
}
// model params (llm)
- const { llmRef, llmTemperature, llmResponseTokens } = (llm.options as any) || {};
+ const { llmRef, llmTemperature, llmResponseTokens } = (llmOptions as any) || {};
if (!llmRef || llmTemperature === undefined || llmResponseTokens === undefined)
- throw new Error(`Error in configuration for model ${llm.id}: ${JSON.stringify(llm.options)}`);
+ throw new Error(`Error in configuration for model ${llmId}: ${JSON.stringify(llmOptions)}`);
// prepare the input, similarly to the tRPC openAI.chatGenerate
- const input: ChatStreamInputSchema = {
+ const input: ChatStreamingInputSchema = {
access,
model: {
id: llmRef,
@@ -131,7 +117,7 @@ async function vendorStreamChat<TSourceSetup = unknown, TLLMOptions = unknown>(
incrementalText = incrementalText.substring(endOfJson + 1);
parsedFirstPacket = true;
try {
- const parsed: ChatStreamFirstPacketSchema = JSON.parse(json);
+ const parsed: ChatStreamingFirstOutputPacketSchema = JSON.parse(json);
onUpdate({ originLLM: parsed.model }, false);
} catch (e) {
// error parsing JSON, ignore
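Since every vendor now either points streamingChatGenerateOrThrow at this client or reuses another vendor's, a chat caller varies only the access object. A hedged sketch of a call site — the two UI helpers are invented:
// Hypothetical caller sketch, not part of this diff.
const abortController = new AbortController();
await ModelVendorOpenAI.streamingChatGenerateOrThrow(
  access, llmId, llm.options,
  [{ role: 'user', content: 'Hello' }],
  null, null, // functions / forceFunctionName unused over this path here
  abortController.signal,
  ({ text, typing, originLLM }, done) => {
    if (originLLM) console.log('model:', originLLM); // from the muxed first packet
    if (text) appendPartialText(text);               // invented UI helper
    if (done) finalizeMessage();                     // invented UI helper
  },
);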
+47
@@ -0,0 +1,47 @@
import type { IModelVendor } from './IModelVendor';
import type { ModelDescriptionSchema } from '../server/llm.server.types';
import { DLLM, DModelSource, useModelsStore } from '../store-llms';
/**
* Hook that fetches the list of models from the vendor and updates the store,
* while returning the fetch state.
*/
export function useLlmUpdateModels<TSourceSetup, TAccess, TLLMOptions>(vendor: IModelVendor<TSourceSetup, TAccess, TLLMOptions>, access: TAccess, enabled: boolean, source: DModelSource<TSourceSetup>) {
return vendor.rpcUpdateModelsQuery(access, enabled, data => source && updateModelsFn(data, source));
}
function updateModelsFn<TSourceSetup>(data: { models: ModelDescriptionSchema[] }, source: DModelSource<TSourceSetup>) {
useModelsStore.getState().setLLMs(
data.models.map(model => modelDescriptionToDLLMOpenAIOptions(model, source)),
source.id,
);
}
function modelDescriptionToDLLMOpenAIOptions<TSourceSetup, TLLMOptions>(model: ModelDescriptionSchema, source: DModelSource<TSourceSetup>): DLLM<TSourceSetup, TLLMOptions> {
const maxOutputTokens = model.maxCompletionTokens || Math.round((model.contextWindow || 4096) / 2);
const llmResponseTokens = Math.round(maxOutputTokens / (model.maxCompletionTokens ? 2 : 4));
return {
id: `${source.id}-${model.id}`,
label: model.label,
created: model.created || 0,
updated: model.updated || 0,
description: model.description,
tags: [], // ['stream', 'chat'],
contextTokens: model.contextWindow,
maxOutputTokens: maxOutputTokens,
hidden: !!model.hidden,
sId: source.id,
_source: source,
options: {
llmRef: model.id,
// @ts-ignore FIXME: large assumption that this is LLMOptionsOpenAI object
llmTemperature: 0.5,
llmResponseTokens: llmResponseTokens,
},
};
}
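The two token heuristics above are easiest to read with numbers; both examples below follow directly from the code:
// Worked examples of the defaulting above:
// a model without maxCompletionTokens, contextWindow = 8192:
//   maxOutputTokens   = Math.round(8192 / 2) = 4096
//   llmResponseTokens = Math.round(4096 / 4) = 1024
// a model with maxCompletionTokens = 2048:
//   maxOutputTokens   = 2048
//   llmResponseTokens = Math.round(2048 / 2) = 1024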
+35
@@ -0,0 +1,35 @@
import { shallow } from 'zustand/shallow';
import type { IModelVendor } from './IModelVendor';
import { DModelSource, DModelSourceId, useModelsStore } from '../store-llms';
/**
* Source-specific read/write - great time saver
*/
export function useSourceSetup<TSourceSetup, TAccess, TLLMOptions>(sourceId: DModelSourceId, vendor: IModelVendor<TSourceSetup, TAccess, TLLMOptions>) {
// invalidates only when the setup changes
const { updateSourceSetup, ...rest } = useModelsStore(state => {
// find the source (or null)
const source: DModelSource<TSourceSetup> | null = state.sources.find(source => source.id === sourceId) as DModelSource<TSourceSetup> ?? null;
// (safe) source-derived properties
const sourceSetupValid = (source?.setup && vendor?.validateSetup) ? vendor.validateSetup(source.setup as TSourceSetup) : false;
const sourceLLMs = source ? state.llms.filter(llm => llm._source === source) : [];
const access = vendor.getTransportAccess(source?.setup);
return {
source,
access,
sourceHasLLMs: !!sourceLLMs.length,
sourceSetupValid,
updateSourceSetup: state.updateSourceSetup,
};
}, shallow);
// convenience function for this source
const updateSetup = (partialSetup: Partial<TSourceSetup>) => updateSourceSetup<TSourceSetup>(sourceId, partialSetup);
return { ...rest, updateSetup };
}
@@ -1,24 +1,39 @@
import { ModelVendorAnthropic } from './anthropic/anthropic.vendor';
import { ModelVendorAzure } from './azure/azure.vendor';
+ import { ModelVendorGemini } from './gemini/gemini.vendor';
import { ModelVendorLocalAI } from './localai/localai.vendor';
+ import { ModelVendorMistral } from './mistral/mistral.vendor';
import { ModelVendorOllama } from './ollama/ollama.vendor';
import { ModelVendorOoobabooga } from './oobabooga/oobabooga.vendor';
import { ModelVendorOpenAI } from './openai/openai.vendor';
import { ModelVendorOpenRouter } from './openrouter/openrouter.vendor';
+ import type { IModelVendor } from './IModelVendor';
import { DLLMId, DModelSource, DModelSourceId, findLLMOrThrow } from '../store-llms';
- import { IModelVendor, ModelVendorId } from './IModelVendor';
- /** Vendor Instances Registry **/
+ export type ModelVendorId =
+ | 'anthropic'
+ | 'azure'
+ | 'googleai'
+ | 'localai'
+ | 'mistral'
+ | 'ollama'
+ | 'oobabooga'
+ | 'openai'
+ | 'openrouter';
+ /** Global: Vendor Instances Registry **/
const MODEL_VENDOR_REGISTRY: Record<ModelVendorId, IModelVendor> = {
anthropic: ModelVendorAnthropic,
azure: ModelVendorAzure,
+ googleai: ModelVendorGemini,
localai: ModelVendorLocalAI,
+ mistral: ModelVendorMistral,
ollama: ModelVendorOllama,
oobabooga: ModelVendorOoobabooga,
openai: ModelVendorOpenAI,
openrouter: ModelVendorOpenRouter,
- };
+ } as Record<string, IModelVendor>;
const MODEL_VENDOR_DEFAULT: ModelVendorId = 'openai';
@@ -29,13 +44,15 @@ export function findAllVendors(): IModelVendor[] {
return modelVendors;
}
- export function findVendorById(vendorId?: ModelVendorId): IModelVendor | null {
- return vendorId ? (MODEL_VENDOR_REGISTRY[vendorId] ?? null) : null;
+ export function findVendorById<TSourceSetup = unknown, TAccess = unknown, TLLMOptions = unknown>(
+ vendorId?: ModelVendorId,
+ ): IModelVendor<TSourceSetup, TAccess, TLLMOptions> | null {
+ return vendorId ? (MODEL_VENDOR_REGISTRY[vendorId] as IModelVendor<TSourceSetup, TAccess, TLLMOptions>) ?? null : null;
}
- export function findVendorForLlmOrThrow(llmId: DLLMId) {
- const llm = findLLMOrThrow(llmId);
- const vendor = findVendorById(llm?._source.vId);
+ export function findVendorForLlmOrThrow<TSourceSetup = unknown, TAccess = unknown, TLLMOptions = unknown>(llmId: DLLMId) {
+ const llm = findLLMOrThrow<TSourceSetup, TLLMOptions>(llmId);
+ const vendor = findVendorById<TSourceSetup, TAccess, TLLMOptions>(llm?._source.vId);
if (!vendor) throw new Error(`callChat: Vendor not found for LLM ${llmId}`);
return { llm, vendor };
}
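A hedged sketch of what the new generics buy at a call site; the three type arguments are illustrative:
// Hypothetical call-site sketch: pinning the type parameters at lookup time
// removes the casts callers previously needed.
const { llm, vendor } = findVendorForLlmOrThrow<SourceSetupOpenAI, OpenAIAccessSchema, LLMOptionsOpenAI>(llmId);
const access = vendor.getTransportAccess(llm._source.setup); // typed as OpenAIAccessSchema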
+5 -3
@@ -3,9 +3,10 @@ import { createTRPCRouter } from './trpc.server';
import { backendRouter } from '~/modules/backend/backend.router';
import { elevenlabsRouter } from '~/modules/elevenlabs/elevenlabs.router';
import { googleSearchRouter } from '~/modules/google/search.router';
- import { llmAnthropicRouter } from '~/modules/llms/transports/server/anthropic/anthropic.router';
- import { llmOllamaRouter } from '~/modules/llms/transports/server/ollama/ollama.router';
- import { llmOpenAIRouter } from '~/modules/llms/transports/server/openai/openai.router';
+ import { llmAnthropicRouter } from '~/modules/llms/server/anthropic/anthropic.router';
+ import { llmGeminiRouter } from '~/modules/llms/server/gemini/gemini.router';
+ import { llmOllamaRouter } from '~/modules/llms/server/ollama/ollama.router';
+ import { llmOpenAIRouter } from '~/modules/llms/server/openai/openai.router';
import { prodiaRouter } from '~/modules/prodia/prodia.router';
import { ytPersonaRouter } from '../../apps/personas/ytpersona.router';
@@ -17,6 +18,7 @@ export const appRouterEdge = createTRPCRouter({
elevenlabs: elevenlabsRouter,
googleSearch: googleSearchRouter,
llmAnthropic: llmAnthropicRouter,
+ llmGemini: llmGeminiRouter,
llmOllama: llmOllamaRouter,
llmOpenAI: llmOpenAIRouter,
prodia: prodiaRouter,
+11 -2
@@ -5,8 +5,8 @@ export const env = createEnv({
server: {
// Backend Postgres, for optional storage via Prisma
- POSTGRES_PRISMA_URL: z.string().url().optional(),
- POSTGRES_URL_NON_POOLING: z.string().url().optional(),
+ POSTGRES_PRISMA_URL: z.string().optional(),
+ POSTGRES_URL_NON_POOLING: z.string().optional(),
// LLM: OpenAI
OPENAI_API_KEY: z.string().optional(),
@@ -21,6 +21,12 @@ export const env = createEnv({
ANTHROPIC_API_KEY: z.string().optional(),
ANTHROPIC_API_HOST: z.string().url().optional(),
+ // LLM: Google AI's Gemini
+ GEMINI_API_KEY: z.string().optional(),
+ // LLM: Mistral
+ MISTRAL_API_KEY: z.string().optional(),
// LLM: Ollama
OLLAMA_API_HOST: z.string().url().optional(),
@@ -59,6 +65,9 @@ export const env = createEnv({
throw new Error('Invalid environment variable');
},
+ // matches user expectations - see https://github.com/enricoros/big-AGI/issues/279
+ emptyStringAsUndefined: true,
// with Next.JS >= 13.4.4 we'd only need to destructure client variables
experimental__runtimeEnv: {},
});