Compare commits

...

14 Commits

| Author | SHA1 | Message | Date |
|---|---|---|---|
| Enrico Ros | 6053636f66 | OpenRouter: OAuth login support | 2023-12-11 22:35:40 -08:00 |
| Enrico Ros | f2e2aee672 | 1.7.2: Stable Patch Version | 2023-12-11 21:22:31 -08:00 |
| Enrico Ros | 11cbb2bbf0 | OpenRouter: update models | 2023-12-11 21:21:22 -08:00 |
| Enrico Ros | 30bd19d6ce | HTML Table to Markdown Table: improve reliability and ignore hidden data | 2023-12-11 20:46:34 -08:00 |
| Enrico Ros | d0b5c02062 | Improve how Stream errors are shown | 2023-12-11 18:22:15 -08:00 |
| Enrico Ros | 771192e406 | Ollama: support ollama errors via API | 2023-12-11 18:19:38 -08:00 |
| Enrico Ros | 13f502bd76 | 1.7.1: Release (Ollama chat). #270 | 2023-12-10 22:17:35 -08:00 |
| Enrico Ros | 11055b12ca | Ollama: use the new Chat endpoint. Closes #270 | 2023-12-10 22:12:51 -08:00 |
| Enrico Ros | d0ea96eec0 | Ollama: Admin: optional sort by Pulls, and UI link to the Model page | 2023-12-10 22:03:55 -08:00 |
| Enrico Ros | 02eafc03f1 | Ollama: update models, and sort by Featured | 2023-12-10 22:01:50 -08:00 |
| Enrico Ros | 33d07a0313 | Ollama: update documentation | 2023-12-10 21:30:30 -08:00 |
| Enrico Ros | 763b852148 | Ollama: administration: external link | 2023-12-10 20:24:20 -08:00 |
| Enrico Ros | d5b0617fd7 | Comment for now | 2023-12-10 06:14:49 -08:00 |
| Enrico Ros | e3ce83674c | Update Ollama | 2023-12-10 06:09:54 -08:00 |
21 changed files with 539 additions and 167 deletions
+3 -1
View File
@@ -21,7 +21,7 @@ shows the current developments and future ideas.
- Got a suggestion? [_Add your roadmap ideas_](https://github.com/enricoros/big-agi/issues/new?&template=roadmap-request.md)
- Want to contribute? [_Pick up a task!_](https://github.com/users/enricoros/projects/4/views/4) - _easy_ to _pro_
### What's New in 1.7.0 · Dec 10, 2023 · Attachment Theory 🌟
### What's New in 1.7.2 · Dec 12, 2023 · Attachment Theory 🌟
- **Attachments System Overhaul**: Drag, paste, link, snap, text, images, PDFs and more. [#251](https://github.com/enricoros/big-agi/issues/251)
- **Desktop Webcam Capture**: Image capture now available as Labs feature. [#253](https://github.com/enricoros/big-agi/issues/253)
@@ -31,6 +31,8 @@ shows the current developments and future ideas.
- Optimized Voice Input and Performance
- Latest Ollama and Oobabooga models
- For developers: **Password Protection**: HTTP Basic Auth. [Learn How](https://github.com/enricoros/big-agi/blob/main/docs/deploy-authentication.md)
- [1.7.1]: Improved Ollama chats. [#270](https://github.com/enricoros/big-agi/issues/270)
- [1.7.2]: Updated OpenRouter models (incl. Mixtral 8x7B)
### What's New in 1.6.0 - Nov 28, 2023
+3 -1
View File
@@ -10,7 +10,7 @@ by release.
- work in progress: [big-AGI open roadmap](https://github.com/users/enricoros/projects/4/views/2), [help here](https://github.com/users/enricoros/projects/4/views/4)
- milestone: [1.8.0](https://github.com/enricoros/big-agi/milestone/8)
### What's New in 1.7.0 · Dec 10, 2023 · Attachment Theory 🌟
### What's New in 1.7.2 · Dec 11, 2023 · Attachment Theory 🌟
- **Attachments System Overhaul**: Drag, paste, link, snap, text, images, PDFs and more. [#251](https://github.com/enricoros/big-agi/issues/251)
- **Desktop Webcam Capture**: Image capture now available as Labs feature. [#253](https://github.com/enricoros/big-agi/issues/253)
@@ -20,6 +20,8 @@ by release.
- Optimized Voice Input and Performance
- Latest Ollama and Oobabooga models
- For developers: **Password Protection**: HTTP Basic Auth. [Learn How](https://github.com/enricoros/big-agi/blob/main/docs/deploy-authentication.md)
- [1.7.1]: Improved Ollama chats. [#270](https://github.com/enricoros/big-agi/issues/270)
- [1.7.2]: Updated OpenRouter models (incl. Mixtral 8x7B)
### What's New in 1.6.0 - Nov 28, 2023 · Surf's Up
+10 -5
View File
@@ -5,15 +5,20 @@ This guide helps you connect [Ollama](https://ollama.ai) [models](https://ollama
experience. The integration brings the popular big-AGI features to Ollama, including: voice chats,
editing tools, model switching, personas, and more.
_Last updated Dec 11, 2023_
![config-local-ollama-0-example.png](pixels/config-ollama-0-example.png)
## Quick Integration Guide
1. **Ensure Ollama API Server is Running**: Before starting, make sure your Ollama API server is up and running.
2. **Add Ollama as a Model Source**: In `big-AGI`, navigate to the **Models** section, select **Add a model source**, and choose **Ollama**.
3. **Enter Ollama Host URL**: Provide the Ollama Host URL where the API server is accessible (e.g., `http://localhost:11434`).
4. **Refresh Model List**: Once connected, refresh the list of available models to include the Ollama models.
5. **Start Using AI Personas**: Select an Ollama model and begin interacting with AI personas tailored to your needs.
1. **Ensure Ollama API Server is Running**: Follow the official instructions to get Ollama up and running on your machine
2. **Add Ollama as a Model Source**: In `big-AGI`, navigate to the **Models** section, select **Add a model source**, and choose **Ollama**
3. **Enter Ollama Host URL**: Provide the Ollama Host URL where the API server is accessible (e.g., `http://localhost:11434`)
4. **Refresh Model List**: Once connected, refresh the list of available models to include the Ollama models
> Optional: use the Ollama Admin interface to see which models are available and 'Pull' them to your local machine. Note
that this operation will likely time out, due to the Edge Functions timeout on the big-AGI server while pulling, and
you'll have to press the 'Pull' button again until a green message appears.
5. **Chat with Ollama models**: select an Ollama model and begin chatting with AI personas
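For step 1 above, a quick way to confirm the local server is reachable is to hit the same tags endpoint this changeset maps as `OLLAMA_PATH_TAGS`; a minimal TypeScript sketch, assuming the default install on port 11434:

```ts
// Sanity check: list the models served by a local Ollama instance.
// Assumes the default host http://localhost:11434 (adjust if your host differs).
async function listLocalOllamaModels(host = 'http://localhost:11434'): Promise<string[]> {
  const res = await fetch(`${host}/api/tags`);
  if (!res.ok)
    throw new Error(`Ollama server not reachable: ${res.status} ${res.statusText}`);
  const data = await res.json() as { models: { name: string }[] };
  return data.models.map(m => m.name);
}

// usage: listLocalOllamaModels().then(names => console.log('Ollama models:', names));
```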
### Ollama: installation and Setup
+2 -2
View File
@@ -1,12 +1,12 @@
{
"name": "big-agi",
"version": "1.7.0",
"version": "1.7.2",
"lockfileVersion": 3,
"requires": true,
"packages": {
"": {
"name": "big-agi",
"version": "1.7.0",
"version": "1.7.2",
"hasInstallScript": true,
"dependencies": {
"@dqbd/tiktoken": "^1.0.7",
+1 -1
View File
@@ -1,6 +1,6 @@
{
"name": "big-agi",
"version": "1.7.0",
"version": "1.7.2",
"private": true,
"scripts": {
"dev": "next dev",
+98
View File
@@ -0,0 +1,98 @@
import * as React from 'react';
import { useRouter } from 'next/router';
import { Box, Typography } from '@mui/joy';
import { useModelsStore } from '~/modules/llms/store-llms';
import { AppLayout } from '~/common/layout/AppLayout';
import { InlineError } from '~/common/components/InlineError';
import { apiQuery } from '~/common/util/trpc.client';
import { navigateToIndex } from '~/common/app.routes';
import { openLayoutModelsSetup } from '~/common/layout/store-applayout';
function CallbackOpenRouterPage(props: { openRouterCode: string | undefined }) {
// external state
const { data, isError, error, isLoading } = apiQuery.backend.exchangeOpenRouterKey.useQuery({ code: props.openRouterCode || '' }, {
enabled: !!props.openRouterCode,
refetchOnWindowFocus: false,
staleTime: Infinity,
});
// derived state
const isErrorInput = !props.openRouterCode;
const openRouterKey = data?.key ?? undefined;
const isSuccess = !!openRouterKey;
// Success: save the key and redirect to the chat app
React.useEffect(() => {
if (!isSuccess)
return;
// 1. Save the key as the client key
useModelsStore.getState().setOpenRoutersKey(openRouterKey);
// 2. Navigate to the chat app
navigateToIndex(true).then(() => openLayoutModelsSetup());
}, [isSuccess, openRouterKey]);
return (
<Box sx={{
flexGrow: 1,
backgroundColor: 'background.level1',
overflowY: 'auto',
display: 'flex', justifyContent: 'center',
p: { xs: 3, md: 6 },
}}>
<Box sx={{
// my: 'auto',
display: 'flex', flexDirection: 'column', alignItems: 'center',
gap: 4,
}}>
<Typography level='title-lg'>
Welcome Back
</Typography>
{isLoading && <Typography level='body-sm'>Loading...</Typography>}
{isErrorInput && <InlineError error='There was an issue retrieving the code from OpenRouter.' />}
{isError && <InlineError error={error} />}
{data && (
<Typography level='body-md'>
Success! You can now close this window.
</Typography>
)}
</Box>
</Box>
);
}
/**
* This page will be invoked by OpenRouter as a Callback
*
* Docs: https://openrouter.ai/docs#oauth
* Example URL: https://localhost:3000/link/callback_openrouter?code=SomeCode
*/
export default function Page() {
// get the 'code=...' from the URL
const { query } = useRouter();
const { code: openRouterCode } = query;
return (
<AppLayout suspendAutoModelsSetup>
<CallbackOpenRouterPage openRouterCode={openRouterCode as (string | undefined)} />
</AppLayout>
);
}
@@ -254,7 +254,7 @@ export async function attachmentPerformConversion(attachment: Readonly<Attachmen
case 'rich-text-table':
let mdTable: string;
try {
mdTable = htmlTableToMarkdown(input.altData!);
mdTable = htmlTableToMarkdown(input.altData!, false);
} catch (error) {
// fallback to text/plain
mdTable = inputDataToString(input.data);
+5 -2
View File
@@ -67,9 +67,10 @@ export const NewsItems: NewsItem[] = [
],
},*/
{
versionCode: '1.7.0',
versionCode: '1.7.2',
versionName: 'Attachment Theory',
versionDate: new Date('2023-12-10T12:00:00Z'), // new Date().toISOString()
versionDate: new Date('2023-12-11T06:00:00Z'), // new Date().toISOString()
// versionDate: new Date('2023-12-10T12:00:00Z'), // 1.7.0
items: [
{ text: <>Redesigned <B href={RIssues + '/251'}>attachments system</B>: drag, paste, link, snap, images, text, pdfs</> },
{ text: <>Desktop <B href={RIssues + '/253'}>webcam access</B> for direct image capture (Labs option)</> },
@@ -79,6 +80,8 @@ export const NewsItems: NewsItem[] = [
{ text: <>{platformAwareKeystrokes('Ctrl+Shift+O')}: quick access to model options</> },
{ text: <>Optimized voice input and performance</> },
{ text: <>Latest Ollama and Oobabooga models</> },
{ text: <>1.7.1: Improved <B href={RIssues + '/270'}>Ollama chats</B></> },
{ text: <>1.7.2: Updated OpenRouter models</> },
],
},
{
+13
View File
@@ -13,9 +13,22 @@ export const ROUTE_INDEX = '/';
export const ROUTE_APP_CHAT = '/';
export const ROUTE_APP_LINK_CHAT = '/link/chat/:linkId';
export const ROUTE_APP_NEWS = '/news';
const ROUTE_CALLBACK_OPENROUTER = '/link/callback_openrouter';
export const getIndexLink = () => ROUTE_INDEX;
export const getCallbackUrl = (source: 'openrouter') => {
const callbackUrl = new URL(window.location.href);
switch (source) {
case 'openrouter':
callbackUrl.pathname = ROUTE_CALLBACK_OPENROUTER;
break;
default:
throw new Error(`Unknown source: ${source}`);
}
return callbackUrl.toString();
};
export const getChatLinkRelativePath = (chatLinkId: string) => ROUTE_APP_LINK_CHAT.replace(':linkId', chatLinkId);
const navigateFn = (path: string) => (replace?: boolean): Promise<boolean> =>
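A short sketch of how the new helper is meant to be used by the OpenRouter login flow added later in this changeset (the deployment host below is a hypothetical placeholder):

```ts
// Hypothetical example: with big-AGI served at https://my-big-agi.example.com/some/page,
// getCallbackUrl('openrouter') rewrites only the pathname of the current location:
//   'https://my-big-agi.example.com/link/callback_openrouter'
const callbackUrl = getCallbackUrl('openrouter');

// OpenRouterSourceSetup then points the browser at the OpenRouter OAuth page,
// which redirects back to the callback route with a ?code=... parameter:
const oauthUrl = 'https://openrouter.ai/auth?callback_url=' + encodeURIComponent(callbackUrl);
window.open(oauthUrl, '_self');
```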
+1
View File
@@ -46,6 +46,7 @@ export const appTheme = extendTheme({
text: {
icon: 'var(--joy-palette-neutral-700)', // <IconButton color='neutral' /> icon color
secondary: 'var(--joy-palette-neutral-800)', // increase contrast a bit
// tertiary: 'var(--joy-palette-neutral-700)', // increase contrast a bit
},
// popup [white] > surface [50] > level1 [100] > level2 [200] > level3 [300] > body [white -> 400]
background: {
+40 -5
View File
@@ -2,11 +2,13 @@
* @fileoverview Utility functions for Markdown.
*/
import { isBrowser } from '~/common/util/pwaUtils';
/**
* Quick and dirty conversion of HTML tables to Markdown tables.
* Big plus: doesn't require any dependencies.
*/
export function htmlTableToMarkdown(html: string): string {
export function htmlTableToMarkdown(html: string, includeInvisible: boolean): string {
const parser = new DOMParser();
const doc = parser.parseFromString(html, 'text/html');
const table = doc.querySelector('table');
@@ -16,20 +18,53 @@ export function htmlTableToMarkdown(html: string): string {
const headerCells = table.querySelectorAll('thead th');
if (headerCells.length > 0) {
const headerRow = '| ' + Array.from(headerCells)
.map(cell => cell.textContent?.trim() || '')
.join(' | ') + '| ';
.map(cell => getTextWithSpaces(cell, includeInvisible).trim())
.join(' | ') + ' |';
markdownRows.push(headerRow);
markdownRows.push('|:' + Array(headerCells.length).fill('-').join('|:') + '|');
markdownRows.push('|:' + Array(headerCells.length).fill('---').join('|:') + '|');
}
const bodyRows = table.querySelectorAll('tbody tr');
for (const row of Array.from(bodyRows)) {
const rowCells = row.querySelectorAll('td');
const markdownRow = '| ' + Array.from(rowCells)
.map(cell => cell.textContent?.trim() || '')
.map(cell => getTextWithSpaces(cell, includeInvisible).trim())
.join(' | ') + ' |';
markdownRows.push(markdownRow);
}
return markdownRows.join('\n');
}
// Helper function to get text with spaces, ignoring hidden elements
function getTextWithSpaces(node: Node, includeInvisible: boolean): string {
let text = '';
node.childNodes.forEach(child => {
if (child.nodeType === Node.TEXT_NODE)
text += child.textContent;
else if (child.nodeType === Node.ELEMENT_NODE)
if (includeInvisible || isVisible(child as Element))
text += ' ' + getTextWithSpaces(child, includeInvisible) + ' ';
});
return text;
}
// Helper function to determine if an element is visible
function isVisible(element: Element): boolean {
if (!isBrowser) return true;
// if the cell is hidden, don't include it
const style = window.getComputedStyle(element);
if (style.display === 'none' || style.visibility === 'hidden')
return false;
// Check for common classes used to hide content or indicate tooltip/popover content.
// You may need to add more classes here based on your actual HTML/CSS.
const ignoredClasses = ['hidden', 'group-hover', 'tooltip', 'pointer-events-none', 'opacity-0'];
for (const ignoredClass of ignoredClasses)
if (element.classList.contains(ignoredClass))
return false;
// Otherwise, the element is considered visible
return true;
}
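A small usage sketch of the updated signature; this only runs in a browser (it relies on `DOMParser` and `getComputedStyle`), and the sample HTML is illustrative:

```ts
// Illustrative input: a table with a cell whose tooltip is kept hidden via a CSS class.
const html = `
  <table>
    <thead><tr><th>Model</th><th>Pulls</th></tr></thead>
    <tbody><tr><td>mistral<span class="hidden">tooltip text</span></td><td>70300</td></tr></tbody>
  </table>`;

// includeInvisible = false drops the hidden <span>, so the tooltip text
// does not leak into the Markdown output:
const markdown = htmlTableToMarkdown(html, false);
// | Model | Pulls |
// |:---|:---|
// | mistral | 70300 |
```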
+16
View File
@@ -1,5 +1,8 @@
import { z } from 'zod';
import { createTRPCRouter, publicProcedure } from '~/server/api/trpc.server';
import { env } from '~/server/env.mjs';
import { fetchJsonOrTRPCError } from '~/server/api/trpc.serverutils';
import { analyticsListCapabilities } from './backend.analytics';
@@ -30,4 +33,17 @@ export const backendRouter = createTRPCRouter({
};
}),
// The following are used for various OAuth integrations
/* Exchange the OpenRouter 'code' (from PKCE) for an OpenRouter API Key */
exchangeOpenRouterKey: publicProcedure
.input(z.object({ code: z.string() }))
.query(async ({ ctx, input }) => {
// Documented here: https://openrouter.ai/docs#oauth
return await fetchJsonOrTRPCError<{ key: string }, { code: string }>('https://openrouter.ai/api/v1/auth/keys', 'POST', {}, {
code: input.code,
}, 'Backend.exchangeOpenRouterKey');
}),
});
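Under the hood this is a single POST to OpenRouter's key-exchange endpoint, as documented in the linked OAuth docs; a minimal standalone sketch of the same call:

```ts
// Minimal sketch of the exchange performed by fetchJsonOrTRPCError above:
// POST the one-time 'code' received at the callback URL, get back a regular API key.
async function exchangeOpenRouterCode(code: string): Promise<string> {
  const res = await fetch('https://openrouter.ai/api/v1/auth/keys', {
    method: 'POST',
    headers: { 'Content-Type': 'application/json' },
    body: JSON.stringify({ code }),
  });
  if (!res.ok)
    throw new Error(`OpenRouter key exchange failed: ${res.status}`);
  const { key } = await res.json() as { key: string };
  return key; // an 'sk-or-...' style key, stored client-side via setOpenRoutersKey
}
```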
+18 -5
View File
@@ -2,7 +2,8 @@ import { create } from 'zustand';
import { shallow } from 'zustand/shallow';
import { persist } from 'zustand/middleware';
import { ModelVendorId } from './vendors/IModelVendor';
import type { ModelVendorId } from './vendors/IModelVendor';
import type { SourceSetupOpenRouter } from './vendors/openrouter/openrouter.vendor';
/**
@@ -76,6 +77,9 @@ interface ModelsActions {
setChatLLMId: (id: DLLMId | null) => void;
setFastLLMId: (id: DLLMId | null) => void;
setFuncLLMId: (id: DLLMId | null) => void;
// special
setOpenRoutersKey: (key: string) => void;
}
type LlmsStore = ModelsData & ModelsActions;
@@ -162,13 +166,22 @@ export const useModelsStore = create<LlmsStore>()(
set(state => ({
sources: state.sources.map((source: DModelSource): DModelSource =>
source.id === id
? {
...source,
setup: { ...source.setup, ...partialSetup },
} : source,
? { ...source, setup: { ...source.setup, ...partialSetup } }
: source,
),
})),
setOpenRoutersKey: (key: string) =>
set(state => {
const openRouterSource = state.sources.find(source => source.vId === 'openrouter');
if (!openRouterSource) return state;
return {
sources: state.sources.map(source => source.id === openRouterSource.id
? { ...source, setup: { ...source.setup, oaiKey: key satisfies SourceSetupOpenRouter['oaiKey'] } }
: source),
};
}),
}),
{
name: 'app-models',
@@ -3,54 +3,57 @@
* descriptions for the models.
* (nor does it reliably provide context window sizes) - TODO: open a bug upstream
*
* from: https://ollama.ai/library?sort=popular
* from: https://ollama.ai/library?sort=featured
*/
export const OLLAMA_BASE_MODELS: { [key: string]: { description: string, pulls: number, added?: string } } = {
'mistral': { description: 'The Mistral 7B model released by Mistral AI', pulls: 56100 },
'llama2': { description: 'The most popular model for general use.', pulls: 117400 },
'codellama': { description: 'A large language model that can use text prompts to generate and discuss code.', pulls: 61500 },
'llama2-uncensored': { description: 'Uncensored Llama 2 model by George Sung and Jarrad Hope.', pulls: 26800 },
'orca-mini': { description: 'A general-purpose model ranging from 3 billion parameters to 70 billion, suitable for entry-level hardware.', pulls: 23000 },
'vicuna': { description: 'General use chat model based on Llama and Llama 2 with 2K to 16K context sizes.', pulls: 20600 },
'wizard-vicuna-uncensored': { description: 'Wizard Vicuna Uncensored is a 7B, 13B, and 30B parameter model based on Llama 2 uncensored by Eric Hartford.', pulls: 12100 },
'phind-codellama': { description: 'Code generation model based on CodeLlama.', pulls: 9760 },
'wizardcoder': { description: 'Llama based code generation model focused on Python.', pulls: 9002 },
'mistral-openorca': { description: 'Mistral OpenOrca is a 7 billion parameter model, fine-tuned on top of the Mistral 7B model using the OpenOrca dataset.', pulls: 8671 },
'nous-hermes': { description: 'General use models based on Llama and Llama 2 from Nous Research.', pulls: 8478 },
'zephyr': { description: 'Zephyr beta is a fine-tuned 7B version of mistral that was trained on on a mix of publicly available, synthetic datasets.', pulls: 8142 },
'wizard-math': { description: 'Model focused on math and logic problems', pulls: 7426 },
'llama2-chinese': { description: 'Llama 2 based model fine tuned to improve Chinese dialogue ability.', pulls: 7035 },
'stable-beluga': { description: 'Llama 2 based model fine tuned on an Orca-style dataset. Originally called Free Willy.', pulls: 6140 },
'falcon': { description: 'A large language model built by the Technology Innovation Institute (TII) for use in summarization, text generation, and chat bots.', pulls: 5865 },
'codeup': { description: 'Great code generation model based on Llama2.', pulls: 5534 },
'everythinglm': { description: 'Uncensored Llama2 based model with 16k context size.', pulls: 4696 },
'medllama2': { description: 'Fine-tuned Llama 2 model to answer medical questions based on an open source medical dataset.', pulls: 4275 },
'wizardlm-uncensored': { description: 'Uncensored version of Wizard LM model.', pulls: 4227 },
'deepseek-coder': { description: 'DeepSeek Coder is trained from scratch on both 87% code and 13% natural language in English and Chinese. Each of the models are pre-trained on 2 trillion tokens.', pulls: 3663, added: '20231129' },
'wizard-vicuna': { description: 'Wizard Vicuna is a 13B parameter model based on Llama 2 trained by MelodysDreamj.', pulls: 3343 },
'orca2': { description: 'Orca 2 is built by Microsoft research, and are a fine-tuned version of Meta\'s Llama 2 models. The model is designed to excel particularly in reasoning.', pulls: 3134, added: '20231129' },
'open-orca-platypus2': { description: 'Merge of the Open Orca OpenChat model and the Garage-bAInd Platypus 2 model. Designed for chat and code generation.', pulls: 3050 },
'starcoder': { description: 'StarCoder is a code generation model trained on 80+ programming languages.', pulls: 2981 },
'dolphin2.2-mistral': { description: 'An instruct-tuned model based on Mistral. Version 2.2 is fine-tuned for improved conversation and empathy.', pulls: 2636 },
'yarn-mistral': { description: 'An extension of Mistral to support a context of up to 128k tokens.', pulls: 2328 },
'openchat': { description: 'A family of open-source models trained on a wide variety of data, surpassing ChatGPT on various benchmarks.', pulls: 2281, added: '20231129' },
'openhermes2.5-mistral': { description: 'OpenHermes 2.5 Mistral 7B is a Mistral 7B fine-tune, a continuation of OpenHermes 2 model, which trained on additional code datasets.', pulls: 2101 },
'yi': { description: 'A high-performing, bilingual base model.', pulls: 1806 },
'samantha-mistral': { description: 'A companion assistant trained in philosophy, psychology, and personal relationships. Based on Mistral.', pulls: 1803 },
'yarn-llama2': { description: 'An extension of Llama 2 that supports a context of up to 128k tokens.', pulls: 1605 },
'sqlcoder': { description: 'SQLCoder is a code completion model fined-tuned on StarCoder for SQL generation tasks.', pulls: 1584 },
'openhermes2-mistral': { description: 'OpenHermes 2 Mistral is a 7B model fine-tuned on Mistral with 900,000 entries of primarily GPT-4 generated data from open datasets.', pulls: 1560 },
'neural-chat': { description: 'A fine-tuned model based on Mistral with good coverage of domain and language.', pulls: 1338, added: '20231129' },
'wizardlm': { description: 'General use 70 billion parameter model based on Llama 2.', pulls: 1253 },
'dolphin2.1-mistral': { description: 'An instruct-tuned model based on Mistral and trained on a dataset filtered to remove alignment and bias.', pulls: 1163 },
'mistrallite': { description: 'MistralLite is a fine-tuned model based on Mistral with enhanced capabilities of processing long contexts.', pulls: 1099 },
'codebooga': { description: 'A high-performing code instruct model created by merging two existing code models.', pulls: 1042 },
'goliath': { description: 'A language model created by combining two fine-tuned Llama 2 70B models into one.', pulls: 728, added: '20231129' },
'xwinlm': { description: 'Conversational model based on Llama 2 that performs competitively on various benchmarks.', pulls: 593 },
'nexusraven': { description: 'Nexus Raven is a 13B instruction tuned model for function calling tasks.', pulls: 585 },
'alfred': { description: 'A robust conversational model designed to be used for both chat and instruct use cases.', pulls: 573, added: '20231129' },
'starling-lm': { description: 'Starling is a large language model trained by reinforcement learning from AI feedback focused on improving chatbot helpfulness.', pulls: 446, added: '20231129' },
'meditron': { description: 'Open-source medical large language model adapted from Llama 2 to the medical domain.', pulls: 100, added: '20231129' },
'deepseek-llm': { description: 'An advanced language model crafted with 2 trillion bilingual tokens.', pulls: 11, added: '20231129' },
'starling-lm': { description: 'Starling is a large language model trained by reinforcement learning from AI feedback focused on improving chatbot helpfulness.', pulls: 2353, added: '20231129' },
'neural-chat': { description: 'A fine-tuned model based on Mistral with good coverage of domain and language.', pulls: 3089, added: '20231129' },
'mistral': { description: 'The Mistral 7B model released by Mistral AI', pulls: 70300 },
'yi': { description: 'A high-performing, bilingual base model.', pulls: 2673 },
'llama2': { description: 'The most popular model for general use.', pulls: 141000 },
'codellama': { description: 'A large language model that can use text prompts to generate and discuss code.', pulls: 71400 },
'llama2-uncensored': { description: 'Uncensored Llama 2 model by George Sung and Jarrad Hope.', pulls: 30900 },
'orca-mini': { description: 'A general-purpose model ranging from 3 billion parameters to 70 billion, suitable for entry-level hardware.', pulls: 26000 },
'vicuna': { description: 'General use chat model based on Llama and Llama 2 with 2K to 16K context sizes.', pulls: 21800 },
'wizard-vicuna-uncensored': { description: 'Wizard Vicuna Uncensored is a 7B, 13B, and 30B parameter model based on Llama 2 uncensored by Eric Hartford.', pulls: 13700 },
'phind-codellama': { description: 'Code generation model based on CodeLlama.', pulls: 10600 },
'zephyr': { description: 'Zephyr beta is a fine-tuned 7B version of mistral that was trained on on a mix of publicly available, synthetic datasets.', pulls: 10200 },
'wizardcoder': { description: 'Llama based code generation model focused on Python.', pulls: 9895 },
'mistral-openorca': { description: 'Mistral OpenOrca is a 7 billion parameter model, fine-tuned on top of the Mistral 7B model using the OpenOrca dataset.', pulls: 9256 },
'nous-hermes': { description: 'General use models based on Llama and Llama 2 from Nous Research.', pulls: 8827 },
'wizard-math': { description: 'Model focused on math and logic problems', pulls: 7849 },
'llama2-chinese': { description: 'Llama 2 based model fine tuned to improve Chinese dialogue ability.', pulls: 7375 },
'deepseek-coder': { description: 'DeepSeek Coder is trained from scratch on both 87% code and 13% natural language in English and Chinese. Each of the models are pre-trained on 2 trillion tokens.', pulls: 7335, added: '20231129' },
'falcon': { description: 'A large language model built by the Technology Innovation Institute (TII) for use in summarization, text generation, and chat bots.', pulls: 6726 },
'stable-beluga': { description: 'Llama 2 based model fine tuned on an Orca-style dataset. Originally called Free Willy.', pulls: 6272 },
'codeup': { description: 'Great code generation model based on Llama2.', pulls: 5978 },
'orca2': { description: 'Orca 2 is built by Microsoft research, and are a fine-tuned version of Meta\'s Llama 2 models. The model is designed to excel particularly in reasoning.', pulls: 5854, added: '20231129' },
'everythinglm': { description: 'Uncensored Llama2 based model with 16k context size.', pulls: 5040 },
'medllama2': { description: 'Fine-tuned Llama 2 model to answer medical questions based on an open source medical dataset.', pulls: 4648 },
'wizardlm-uncensored': { description: 'Uncensored version of Wizard LM model.', pulls: 4536 },
'dolphin2.2-mistral': { description: 'An instruct-tuned model based on Mistral. Version 2.2 is fine-tuned for improved conversation and empathy.', pulls: 3638 },
'starcoder': { description: 'StarCoder is a code generation model trained on 80+ programming languages.', pulls: 3638 },
'wizard-vicuna': { description: 'Wizard Vicuna is a 13B parameter model based on Llama 2 trained by MelodysDreamj.', pulls: 3485 },
'openchat': { description: 'A family of open-source models trained on a wide variety of data, surpassing ChatGPT on various benchmarks.', pulls: 3438, added: '20231129' },
'open-orca-platypus2': { description: 'Merge of the Open Orca OpenChat model and the Garage-bAInd Platypus 2 model. Designed for chat and code generation.', pulls: 3145 },
'openhermes2.5-mistral': { description: 'OpenHermes 2.5 Mistral 7B is a Mistral 7B fine-tune, a continuation of OpenHermes 2 model, which trained on additional code datasets.', pulls: 3023 },
'yarn-mistral': { description: 'An extension of Mistral to support a context of up to 128k tokens.', pulls: 2775 },
'samantha-mistral': { description: 'A companion assistant trained in philosophy, psychology, and personal relationships. Based on Mistral.', pulls: 2192 },
'sqlcoder': { description: 'SQLCoder is a code completion model fined-tuned on StarCoder for SQL generation tasks', pulls: 1973 },
'yarn-llama2': { description: 'An extension of Llama 2 that supports a context of up to 128k tokens.', pulls: 1915 },
'openhermes2-mistral': { description: 'OpenHermes 2 Mistral is a 7B model fine-tuned on Mistral with 900,000 entries of primarily GPT-4 generated data from open datasets.', pulls: 1690 },
'meditron': { description: 'Open-source medical large language model adapted from Llama 2 to the medical domain.', pulls: 1667, added: '20231129' },
'wizardlm': { description: 'General use 70 billion parameter model based on Llama 2.', pulls: 1379 },
'mistrallite': { description: 'MistralLite is a fine-tuned model based on Mistral with enhanced capabilities of processing long contexts.', pulls: 1345 },
'deepseek-llm': { description: 'An advanced language model crafted with 2 trillion bilingual tokens.', pulls: 1318, added: '20231129' },
'dolphin2.1-mistral': { description: 'An instruct-tuned model based on Mistral and trained on a dataset filtered to remove alignment and bias.', pulls: 1302 },
'codebooga': { description: 'A high-performing code instruct model created by merging two existing code models.', pulls: 1254 },
'goliath': { description: 'A language model created by combining two fine-tuned Llama 2 70B models into one.', pulls: 946, added: '20231129' },
'stablelm-zephyr': { description: 'A lightweight chat model allowing accurate, and responsive output without requiring high-end hardware.', pulls: 945, added: '20231210' },
'nexusraven': { description: 'Nexus Raven is a 13B instruction tuned model for function calling tasks.', pulls: 860 },
'magicoder': { description: '🎩 Magicoder is a family of 7B parameter models trained on 75K synthetic instruction data using OSS-Instruct, a novel approach to enlightening LLMs with open-source code snippets.', pulls: 816, added: '20231210' },
'alfred': { description: 'A robust conversational model designed to be used for both chat and instruct use cases.', pulls: 804, added: '20231129' },
'xwinlm': { description: 'Conversational model based on Llama 2 that performs competitively on various benchmarks.', pulls: 706 },
};
export const OLLAMA_LAST_UPDATE: string = '20231129';
// export const OLLAMA_LAST_UPDATE: string = '20231210';
export const OLLAMA_PREV_UPDATE: string = '20231129';
@@ -1,4 +1,5 @@
import { z } from 'zod';
import { TRPCError } from '@trpc/server';
import { createTRPCRouter, publicProcedure } from '~/server/api/trpc.server';
import { env } from '~/server/env.mjs';
@@ -11,12 +12,15 @@ import { capitalizeFirstLetter } from '~/common/util/textUtils';
import { fixupHost, openAIChatGenerateOutputSchema, OpenAIHistorySchema, openAIHistorySchema, OpenAIModelSchema, openAIModelSchema } from '../openai/openai.router';
import { listModelsOutputSchema, ModelDescriptionSchema } from '../server.schemas';
import { OLLAMA_BASE_MODELS, OLLAMA_LAST_UPDATE } from './ollama.models';
import { wireOllamaGenerationSchema } from './ollama.wiretypes';
import { OLLAMA_BASE_MODELS, OLLAMA_PREV_UPDATE } from './ollama.models';
import { WireOllamaChatCompletionInput, wireOllamaChunkedOutputSchema } from './ollama.wiretypes';
// Default hosts
const DEFAULT_OLLAMA_HOST = 'http://127.0.0.1:11434';
export const OLLAMA_PATH_CHAT = '/api/chat';
const OLLAMA_PATH_TAGS = '/api/tags';
const OLLAMA_PATH_SHOW = '/api/show';
// Mappers
@@ -34,7 +38,23 @@ export function ollamaAccess(access: OllamaAccessSchema, apiPath: string): { hea
}
export function ollamaChatCompletionPayload(model: OpenAIModelSchema, history: OpenAIHistorySchema, stream: boolean) {
export const ollamaChatCompletionPayload = (model: OpenAIModelSchema, history: OpenAIHistorySchema, stream: boolean): WireOllamaChatCompletionInput => ({
model: model.id,
messages: history,
options: {
...(model.temperature && { temperature: model.temperature }),
},
// n: ...
// functions: ...
// function_call: ...
stream,
});
/* Unused: switched to the Chat endpoint (above). The implementation is left here for reference.
https://github.com/jmorganca/ollama/blob/main/docs/api.md#generate-a-completion
export function ollamaCompletionPayload(model: OpenAIModelSchema, history: OpenAIHistorySchema, stream: boolean) {
// if the first message is the system prompt, extract it
let systemPrompt: string | undefined = undefined;
@@ -62,7 +82,7 @@ export function ollamaChatCompletionPayload(model: OpenAIModelSchema, history: O
...(systemPrompt && { system: systemPrompt }),
stream,
};
}
}*/
async function ollamaGET<TOut extends object>(access: OllamaAccessSchema, apiPath: string /*, signal?: AbortSignal*/): Promise<TOut> {
const { headers, url } = ollamaAccess(access, apiPath);
@@ -104,6 +124,7 @@ const listPullableOutputSchema = z.object({
label: z.string(),
tag: z.string(),
description: z.string(),
pulls: z.number(),
isNew: z.boolean(),
})),
});
@@ -122,7 +143,8 @@ export const llmOllamaRouter = createTRPCRouter({
label: capitalizeFirstLetter(model_id),
tag: 'latest',
description: model.description,
isNew: !!model.added && model.added >= OLLAMA_LAST_UPDATE,
pulls: model.pulls,
isNew: !!model.added && model.added >= OLLAMA_PREV_UPDATE,
})),
};
}),
@@ -160,6 +182,7 @@ export const llmOllamaRouter = createTRPCRouter({
throw new Error('Ollama delete issue: ' + deleteOutput);
}),
/* Ollama: List the Models available */
listModels: publicProcedure
.input(accessOnlySchema)
@@ -167,7 +190,7 @@ export const llmOllamaRouter = createTRPCRouter({
.query(async ({ input }) => {
// get the models
const wireModels = await ollamaGET(input.access, '/api/tags');
const wireModels = await ollamaGET(input.access, OLLAMA_PATH_TAGS);
const wireOllamaListModelsSchema = z.object({
models: z.array(z.object({
name: z.string(),
@@ -180,7 +203,7 @@ export const llmOllamaRouter = createTRPCRouter({
// retrieve info for each of the models (/api/show, post call, in parallel)
const detailedModels = await Promise.all(models.map(async model => {
const wireModelInfo = await ollamaPOST(input.access, { 'name': model.name }, '/api/show');
const wireModelInfo = await ollamaPOST(input.access, { 'name': model.name }, OLLAMA_PATH_SHOW);
const wireOllamaModelInfoSchema = z.object({
license: z.string().optional(),
modelfile: z.string(),
@@ -221,12 +244,24 @@ export const llmOllamaRouter = createTRPCRouter({
.output(openAIChatGenerateOutputSchema)
.mutation(async ({ input: { access, history, model } }) => {
const wireGeneration = await ollamaPOST(access, ollamaChatCompletionPayload(model, history, false), '/api/generate');
const generation = wireOllamaGenerationSchema.parse(wireGeneration);
const wireGeneration = await ollamaPOST(access, ollamaChatCompletionPayload(model, history, false), OLLAMA_PATH_CHAT);
const generation = wireOllamaChunkedOutputSchema.parse(wireGeneration);
if ('error' in generation)
throw new TRPCError({
code: 'INTERNAL_SERVER_ERROR',
message: `Ollama chat-generation issue: ${generation.error}`,
});
if (!generation.message?.content)
throw new TRPCError({
code: 'INTERNAL_SERVER_ERROR',
message: `Ollama chat-generation API issue: ${JSON.stringify(wireGeneration)}`,
});
return {
role: 'assistant',
content: generation.response,
content: generation.message.content,
finish_reason: generation.done ? 'stop' : null,
};
}),
@@ -1,16 +1,76 @@
import { z } from 'zod';
export const wireOllamaGenerationSchema = z.object({
model: z.string(),
// created_at: z.string(), // commented because unused
response: z.string(),
done: z.boolean(),
// only on the last message
// context: z.array(z.number()),
// total_duration: z.number(),
// load_duration: z.number(),
// eval_duration: z.number(),
// prompt_eval_count: z.number(),
// eval_count: z.number(),
/**
* Chat Completion API - Request
* https://github.com/jmorganca/ollama/blob/main/docs/api.md#generate-a-chat-completion
*/
const wireOllamaChatCompletionInputSchema = z.object({
// required
model: z.string(),
messages: z.array(z.object({
role: z.enum(['assistant', 'system', 'user']),
content: z.string(),
})),
// optional
format: z.enum(['json']).optional(),
options: z.object({
// https://github.com/jmorganca/ollama/blob/main/docs/modelfile.md
// Maximum number of tokens to predict when generating text.
num_predict: z.number().int().optional(),
// Sets the random number seed to use for generation
seed: z.number().int().optional(),
// The temperature of the model
temperature: z.number().positive().optional(),
// Reduces the probability of generating nonsense (Default: 40)
top_k: z.number().positive().optional(),
// Works together with top-k. A higher value (e.g., 0.95) will lead to more diverse text. (Default 0.9)
top_p: z.number().positive().optional(),
}).optional(),
template: z.string().optional(), // overrides what is defined in the Modelfile
stream: z.boolean().optional(), // default: true
// Future Improvements?
// n: z.number().int().optional(), // number of completions to generate
// functions: ...
// function_call: ...
});
export type WireOllamaChatCompletionInput = z.infer<typeof wireOllamaChatCompletionInputSchema>;
/**
* Chat Completion or Generation APIs - Streaming Response
*/
export const wireOllamaChunkedOutputSchema = z.union([
// Chat Completion Chunk
z.object({
model: z.string(),
// created_at: z.string(), // commented because unused
// [Chat Completion] (exclusive with 'response')
message: z.object({
role: z.enum(['assistant' /*, 'system', 'user' Disabled on purpose, to validate the response */]),
content: z.string(),
}).optional(), // optional on the last message
// [Generation] (non-chat, exclusive with 'message')
//response: z.string().optional(),
done: z.boolean(),
// only on the last message
// context: z.array(z.number()), // non-chat endpoint
// total_duration: z.number(),
// prompt_eval_count: z.number(),
// prompt_eval_duration: z.number(),
// eval_count: z.number(),
// eval_duration: z.number(),
}),
// Possible Error
z.object({
error: z.string(),
}),
]);
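For reference, a hedged example of the shapes these schemas describe, with illustrative values (the request mirrors what `ollamaChatCompletionPayload` builds; the chunks mirror the streamed `/api/chat` output):

```ts
// Request body accepted by wireOllamaChatCompletionInputSchema:
const exampleRequest: WireOllamaChatCompletionInput = {
  model: 'llama2',
  messages: [
    { role: 'system', content: 'You are a helpful assistant.' },
    { role: 'user', content: 'Hello!' },
  ],
  options: { temperature: 0.7 },
  stream: true,
};

// Streamed chunks validated by wireOllamaChunkedOutputSchema: intermediate chunks carry
// a partial assistant message, the final chunk has done: true, and errors arrive as { error }.
const exampleChunks = [
  { model: 'llama2', message: { role: 'assistant', content: 'Hi' }, done: false },
  { model: 'llama2', message: { role: 'assistant', content: ' there!' }, done: false },
  { model: 'llama2', done: true },
];
```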
@@ -1,5 +1,6 @@
import type { ModelDescriptionSchema } from '../server.schemas';
import { LLM_IF_OAI_Chat, LLM_IF_OAI_Complete, LLM_IF_OAI_Fn, LLM_IF_OAI_Vision } from '../../../store-llms';
import { SERVER_DEBUG_WIRE } from '~/server/wire';
// [Azure] / [OpenAI]
@@ -236,8 +237,8 @@ export function oobaboogaModelToModelDescription(modelId: string, created: numbe
/**
* Created to reflect the doc page: https://openrouter.ai/docs
*
* Update prompt:
* "Please update the typescript object below (do not change the definition, just the object), based on the updated upstream documentation:"
* Update prompt (last updated 2023-12-12)
* "Please update the following typescript object (do not change the definition, just values, and do not miss any rows), based on the information provided thereafter:"
*
* fields:
* - cw: context window size (max tokens, total)
@@ -247,19 +248,24 @@ export function oobaboogaModelToModelDescription(modelId: string, created: numbe
*/
const orModelMap: { [id: string]: { name: string; cw: number; cp?: number; cc?: number; old?: boolean; unfilt?: boolean; } } = {
// 'openrouter/auto': { name: 'Auto (best for prompt)', cw: 128000, cp: undefined, cc: undefined, unfilt: undefined },
'mistralai/mistral-7b-instruct': { name: 'Mistral 7B Instruct (beta)', cw: 8192, cp: 0, cc: 0, unfilt: true },
'huggingfaceh4/zephyr-7b-beta': { name: 'Hugging Face: Zephyr 7B (beta)', cw: 4096, cp: 0, cc: 0, unfilt: true },
'openchat/openchat-7b': { name: 'OpenChat 7B (beta)', cw: 8192, cp: 0, cc: 0, unfilt: true },
'undi95/toppy-m-7b': { name: 'Toppy M 7B (beta)', cw: 32768, cp: 0, cc: 0, unfilt: true },
'gryphe/mythomist-7b': { name: 'MythoMist 7B (beta)', cw: 32768, cp: 0, cc: 0, unfilt: true },
'nousresearch/nous-hermes-llama2-13b': { name: 'Nous: Hermes 13B (beta)', cw: 4096, cp: 0.000155, cc: 0.000155, unfilt: true },
'meta-llama/codellama-34b-instruct': { name: 'Meta: CodeLlama 34B Instruct (beta)', cw: 8192, cp: 0.00045, cc: 0.00045, unfilt: true },
'phind/phind-codellama-34b': { name: 'Phind: CodeLlama 34B v2 (beta)', cw: 4096, cp: 0.00045, cc: 0.00045, unfilt: true },
'intel/neural-chat-7b': { name: 'Neural Chat 7B v3.1 (beta)', cw: 32768, cp: 0.005, cc: 0.005, unfilt: true },
'haotian-liu/llava-13b': { name: 'Llava 13B (beta)', cw: 2048, cp: 0.005, cc: 0.005, unfilt: true },
'meta-llama/llama-2-13b-chat': { name: 'Meta: Llama v2 13B Chat (beta)', cw: 4096, cp: 0.000234533, cc: 0.000234533, unfilt: true },
'alpindale/goliath-120b': { name: 'Goliath 120B (beta)', cw: 6144, cp: 0.00703125, cc: 0.00703125, unfilt: true },
'lizpreciatior/lzlv-70b-fp16-hf': { name: 'lzlv 70B (beta)', cw: 4096, cp: 0.000562, cc: 0.000762, unfilt: true },
'nousresearch/nous-capybara-7b': { name: 'Nous: Capybara 7B', cw: 4096, cp: 0, cc: 0, unfilt: true },
'mistralai/mistral-7b-instruct': { name: 'Mistral 7B Instruct', cw: 8192, cp: 0, cc: 0, unfilt: true },
'huggingfaceh4/zephyr-7b-beta': { name: 'Hugging Face: Zephyr 7B', cw: 4096, cp: 0, cc: 0, unfilt: true },
'openchat/openchat-7b': { name: 'OpenChat 3.5', cw: 8192, cp: 0, cc: 0, unfilt: true },
'gryphe/mythomist-7b': { name: 'MythoMist 7B', cw: 32768, cp: 0, cc: 0, unfilt: true },
'openrouter/cinematika-7b': { name: 'Cinematika 7B (alpha)', cw: 32000, cp: 0, cc: 0, unfilt: true },
'mistralai/mixtral-8x7b-instruct': { name: 'Mistral: Mixtral 8x7B Instruct (beta)', cw: 32000, cp: 0, cc: 0, unfilt: true },
'rwkv/rwkv-5-world-3b': { name: 'RWKV v5 World 3B (beta)', cw: 10000, cp: 0, cc: 0, unfilt: true },
'recursal/rwkv-5-3b-ai-town': { name: 'RWKV v5 3B AI Town (beta)', cw: 10000, cp: 0, cc: 0, unfilt: true },
'jebcarter/psyfighter-13b': { name: 'Psyfighter 13B', cw: 4096, cp: 0.001, cc: 0.001, unfilt: true },
'koboldai/psyfighter-13b-2': { name: 'Psyfighter v2 13B', cw: 4096, cp: 0.001, cc: 0.001, unfilt: true },
'nousresearch/nous-hermes-llama2-13b': { name: 'Nous: Hermes 13B', cw: 4096, cp: 0.000075, cc: 0.000075, unfilt: true },
'meta-llama/codellama-34b-instruct': { name: 'Meta: CodeLlama 34B Instruct', cw: 8192, cp: 0.0002, cc: 0.0002, unfilt: true },
'phind/phind-codellama-34b': { name: 'Phind: CodeLlama 34B v2', cw: 4096, cp: 0.0002, cc: 0.0002, unfilt: true },
'intel/neural-chat-7b': { name: 'Neural Chat 7B v3.1', cw: 4096, cp: 0.0025, cc: 0.0025, unfilt: true },
'haotian-liu/llava-13b': { name: 'Llava 13B', cw: 2048, cp: 0.0025, cc: 0.0025, unfilt: true },
'nousresearch/nous-hermes-2-vision-7b': { name: 'Nous: Hermes 2 Vision 7B (alpha)', cw: 4096, cp: 0.0025, cc: 0.0025, unfilt: true },
'meta-llama/llama-2-13b-chat': { name: 'Meta: Llama v2 13B Chat', cw: 4096, cp: 0.000156755, cc: 0.000156755, unfilt: true },
'openai/gpt-3.5-turbo': { name: 'OpenAI: GPT-3.5 Turbo', cw: 4095, cp: 0.001, cc: 0.002, unfilt: false },
'openai/gpt-3.5-turbo-1106': { name: 'OpenAI: GPT-3.5 Turbo 16k (preview)', cw: 16385, cp: 0.001, cc: 0.002, unfilt: false },
'openai/gpt-3.5-turbo-16k': { name: 'OpenAI: GPT-3.5 Turbo 16k', cw: 16385, cp: 0.003, cc: 0.004, unfilt: false },
@@ -272,24 +278,38 @@ const orModelMap: { [id: string]: { name: string; cw: number; cp?: number; cc?:
'google/palm-2-codechat-bison': { name: 'Google: PaLM 2 Code Chat', cw: 7168, cp: 0.0005, cc: 0.0005, unfilt: true },
'google/palm-2-chat-bison-32k': { name: 'Google: PaLM 2 Chat 32k', cw: 32000, cp: 0.0005, cc: 0.0005, unfilt: true },
'google/palm-2-codechat-bison-32k': { name: 'Google: PaLM 2 Code Chat 32k', cw: 32000, cp: 0.0005, cc: 0.0005, unfilt: true },
'meta-llama/llama-2-70b-chat': { name: 'Meta: Llama v2 70B Chat (beta)', cw: 4096, cp: 0.0007, cc: 0.00095, unfilt: true },
'nousresearch/nous-hermes-llama2-70b': { name: 'Nous: Hermes 70B (beta)', cw: 4096, cp: 0.0009, cc: 0.0009, unfilt: true },
'nousresearch/nous-capybara-34b': { name: 'Nous: Capybara 34B (beta)', cw: 32000, cp: 0.02, cc: 0.02, unfilt: true },
'jondurbin/airoboros-l2-70b': { name: 'Airoboros 70B (beta)', cw: 4096, cp: 0.0007, cc: 0.00095, unfilt: true },
'migtissera/synthia-70b': { name: 'Synthia 70B (beta)', cw: 8192, cp: 0.009375, cc: 0.009375, unfilt: true },
'open-orca/mistral-7b-openorca': { name: 'Mistral OpenOrca 7B (beta)', cw: 8192, cp: 0.0002, cc: 0.0002, unfilt: true },
'teknium/openhermes-2-mistral-7b': { name: 'OpenHermes 2 Mistral 7B (beta)', cw: 4096, cp: 0.0002, cc: 0.0002, unfilt: true },
'teknium/openhermes-2.5-mistral-7b': { name: 'OpenHermes 2.5 Mistral 7B (beta)', cw: 4096, cp: 0.0002, cc: 0.0002, unfilt: true },
'pygmalionai/mythalion-13b': { name: 'Pygmalion: Mythalion 13B (beta)', cw: 8192, cp: 0.001125, cc: 0.001125, unfilt: true },
'undi95/remm-slerp-l2-13b': { name: 'ReMM SLERP 13B (beta)', cw: 6144, cp: 0.001125, cc: 0.001125, unfilt: true },
'xwin-lm/xwin-lm-70b': { name: 'Xwin 70B (beta)', cw: 8192, cp: 0.009375, cc: 0.009375, unfilt: true },
'gryphe/mythomax-l2-13b-8k': { name: 'MythoMax 13B 8k (beta)', cw: 8192, cp: 0.001125, cc: 0.001125, unfilt: true },
'neversleep/noromaid-20b': { name: 'Noromaid 20B (beta)', cw: 8192, cp: 0.00225, cc: 0.00225, unfilt: true },
'perplexity/pplx-70b-online': { name: 'Perplexity: PPLX 70B Online', cw: 4096, cp: 0, cc: 0.0028, unfilt: true },
'perplexity/pplx-7b-online': { name: 'Perplexity: PPLX 7B Online', cw: 4096, cp: 0, cc: 0.00028, unfilt: true },
'perplexity/pplx-7b-chat': { name: 'Perplexity: PPLX 7B Chat', cw: 8192, cp: 0.00007, cc: 0.00028, unfilt: true },
'perplexity/pplx-70b-chat': { name: 'Perplexity: PPLX 70B Chat', cw: 4096, cp: 0.0007, cc: 0.0028, unfilt: true },
'meta-llama/llama-2-70b-chat': { name: 'Meta: Llama v2 70B Chat', cw: 4096, cp: 0.0007, cc: 0.00095, unfilt: true },
'nousresearch/nous-hermes-llama2-70b': { name: 'Nous: Hermes 70B', cw: 4096, cp: 0.0009, cc: 0.0009, unfilt: true },
'nousresearch/nous-capybara-34b': { name: 'Nous: Capybara 34B', cw: 32000, cp: 0.0007, cc: 0.0028, unfilt: true },
'jondurbin/airoboros-l2-70b': { name: 'Airoboros 70B', cw: 4096, cp: 0.0007, cc: 0.00095, unfilt: true },
'migtissera/synthia-70b': { name: 'Synthia 70B', cw: 8192, cp: 0.00375, cc: 0.00375, unfilt: true },
'open-orca/mistral-7b-openorca': { name: 'Mistral OpenOrca 7B', cw: 8192, cp: 0.0002, cc: 0.0002, unfilt: true },
'teknium/openhermes-2-mistral-7b': { name: 'OpenHermes 2 Mistral 7B', cw: 4096, cp: 0.0002, cc: 0.0002, unfilt: true },
'teknium/openhermes-2.5-mistral-7b': { name: 'OpenHermes 2.5 Mistral 7B', cw: 4096, cp: 0.0002, cc: 0.0002, unfilt: true },
'pygmalionai/mythalion-13b': { name: 'Pygmalion: Mythalion 13B', cw: 8192, cp: 0.001125, cc: 0.001125, unfilt: true },
'undi95/remm-slerp-l2-13b': { name: 'ReMM SLERP 13B', cw: 6144, cp: 0.001125, cc: 0.001125, unfilt: true },
'xwin-lm/xwin-lm-70b': { name: 'Xwin 70B', cw: 8192, cp: 0.00375, cc: 0.00375, unfilt: true },
'gryphe/mythomax-l2-13b-8k': { name: 'MythoMax 13B 8k', cw: 8192, cp: 0.001125, cc: 0.001125, unfilt: true },
'undi95/toppy-m-7b': { name: 'Toppy M 7B', cw: 32768, cp: 0.000375, cc: 0.000375, unfilt: true },
'alpindale/goliath-120b': { name: 'Goliath 120B', cw: 6144, cp: 0.009375, cc: 0.009375, unfilt: true },
'lizpreciatior/lzlv-70b-fp16-hf': { name: 'lzlv 70B', cw: 4096, cp: 0.0007, cc: 0.00095, unfilt: true },
'neversleep/noromaid-20b': { name: 'Noromaid 20B', cw: 8192, cp: 0.00225, cc: 0.00225, unfilt: true },
'01-ai/yi-34b-chat': { name: 'Yi 34B Chat', cw: 4096, cp: 0.0008, cc: 0.0008, unfilt: true },
'01-ai/yi-34b': { name: 'Yi 34B (base)', cw: 4096, cp: 0.0008, cc: 0.0008, unfilt: true },
'01-ai/yi-6b': { name: 'Yi 6B (base)', cw: 4096, cp: 0.00014, cc: 0.00014, unfilt: true },
'togethercomputer/stripedhyena-nous-7b': { name: 'StripedHyena Nous 7B', cw: 32000, cp: 0.0002, cc: 0.0002, unfilt: true },
'togethercomputer/stripedhyena-hessian-7b': { name: 'StripedHyena Hessian 7B (base)', cw: 32000, cp: 0.0002, cc: 0.0002, unfilt: true },
'mistralai/mixtral-8x7b': { name: 'Mistral: Mixtral 8x7B (base) (beta)', cw: 32000, cp: 0.0006, cc: 0.0006, unfilt: true },
'anthropic/claude-2': { name: 'Anthropic: Claude v2.1', cw: 200000, cp: 0.008, cc: 0.024, unfilt: false },
'anthropic/claude-2.0': { name: 'Anthropic: Claude v2.0', cw: 100000, cp: 0.008, cc: 0.024, unfilt: false },
'anthropic/claude-instant-v1': { name: 'Anthropic: Claude Instant v1', cw: 100000, cp: 0.00163, cc: 0.00551, unfilt: false },
'mancer/weaver': { name: 'Mancer: Weaver (alpha)', cw: 8000, cp: 0.0045, cc: 0.0045, unfilt: true },
'mancer/weaver': { name: 'Mancer: Weaver (alpha)', cw: 8000, cp: 0.003375, cc: 0.003375, unfilt: true },
'gryphe/mythomax-l2-13b': { name: 'MythoMax 13B', cw: 4096, cp: 0.0006, cc: 0.0006, unfilt: true },
// Old models (maintained for reference)
'openai/gpt-3.5-turbo-0301': { name: 'OpenAI: GPT-3.5 Turbo (older v0301)', cw: 4095, cp: 0.001, cc: 0.002, old: true },
'openai/gpt-4-0314': { name: 'OpenAI: GPT-4 (older v0314)', cw: 8191, cp: 0.03, cc: 0.06, old: true },
'openai/gpt-4-32k-0314': { name: 'OpenAI: GPT-4 32k (older v0314)', cw: 32767, cp: 0.06, cc: 0.12, old: true },
@@ -301,7 +321,12 @@ const orModelMap: { [id: string]: { name: string; cw: number; cp?: number; cc?:
'anthropic/claude-instant-1.0': { name: 'Anthropic: Claude Instant (older v1)', cw: 9000, cp: 0.00163, cc: 0.00551, old: true },
};
const orModelFamilyOrder = ['mistralai/', 'huggingfaceh4/', 'undi95/', 'openchat/', 'anthropic/', 'google/', 'openai/', 'meta-llama/', 'phind/', 'openrouter/'];
const orModelFamilyOrder = [
// great models
'mistralai/mixtral-8x7b-instruct', 'mistralai/mistral-7b-instruct', 'nousresearch/nous-capybara-7b',
// great orgs
'huggingfaceh4/', 'openchat/', 'anthropic/', 'google/', 'openai/', 'meta-llama/', 'phind/',
];
export function openRouterModelFamilySortFn(a: { id: string }, b: { id: string }): number {
const aPrefixIndex = orModelFamilyOrder.findIndex(prefix => a.id.startsWith(prefix));
@@ -321,10 +346,10 @@ export function openRouterModelToModelDescription(modelId: string, created: numb
const orModel = orModelMap[modelId] ?? null;
let label = orModel?.name || modelId.replace('/', ' · ');
if (orModel?.cp === 0 && orModel?.cc === 0)
label += ' - 🎁 Free';
label += ' · 🎁 Free';
// if (!orModel)
// console.log('openRouterModelToModelDescription: unknown model id:', modelId);
if (SERVER_DEBUG_WIRE && !orModel)
console.log(' - openRouterModelToModelDescription: non-mapped model id:', modelId);
// context: use the known size if available, otherwise fall back to the (undocumented) provided length, or again to 4096
const contextWindow = orModel?.cw || context_length || 4096;
@@ -6,10 +6,10 @@ import { createEmptyReadableStream, debugGenerateCurlCommand, safeErrorString, S
import type { AnthropicWire } from '../anthropic/anthropic.wiretypes';
import type { OpenAIWire } from './openai.wiretypes';
import { OLLAMA_PATH_CHAT, ollamaAccess, ollamaAccessSchema, ollamaChatCompletionPayload } from '../ollama/ollama.router';
import { anthropicAccess, anthropicAccessSchema, anthropicChatCompletionPayload } from '../anthropic/anthropic.router';
import { ollamaAccess, ollamaAccessSchema, ollamaChatCompletionPayload } from '../ollama/ollama.router';
import { openAIAccess, openAIAccessSchema, openAIChatCompletionPayload, openAIHistorySchema, openAIModelSchema } from './openai.router';
import { wireOllamaGenerationSchema } from '../ollama/ollama.wiretypes';
import { wireOllamaChunkedOutputSchema } from '../ollama/ollama.wiretypes';
/**
@@ -59,10 +59,10 @@ export async function openaiStreamingRelayHandler(req: NextRequest): Promise<Res
break;
case 'ollama':
headersUrl = ollamaAccess(access, '/api/generate');
headersUrl = ollamaAccess(access, OLLAMA_PATH_CHAT);
body = ollamaChatCompletionPayload(model, history, true);
eventStreamFormat = 'json-nl';
vendorStreamParser = createOllamaStreamParser();
vendorStreamParser = createOllamaChatCompletionStreamParser();
break;
case 'azure':
@@ -135,30 +135,39 @@ function createAnthropicStreamParser(): AIStreamParser {
};
}
function createOllamaStreamParser(): AIStreamParser {
function createOllamaChatCompletionStreamParser(): AIStreamParser {
let hasBegun = false;
return (data: string) => {
let wireGeneration: any;
// parse the JSON chunk
let wireJsonChunk: any;
try {
wireGeneration = JSON.parse(data);
wireJsonChunk = JSON.parse(data);
} catch (error: any) {
// log the malformed data to the console, and rethrow to transmit as 'error'
console.log(`/api/llms/stream: Ollama parsing issue: ${error?.message || error}`, data);
throw error;
}
const generation = wireOllamaGenerationSchema.parse(wireGeneration);
let text = generation.response;
// validate chunk
const chunk = wireOllamaChunkedOutputSchema.parse(wireJsonChunk);
// pass through errors from Ollama
if ('error' in chunk)
throw new Error(chunk.error);
// process output
let text = chunk.message?.content || /*chunk.response ||*/ '';
// hack: prepend the model name to the first packet
if (!hasBegun) {
if (!hasBegun && chunk.model) {
hasBegun = true;
const firstPacket: ChatStreamFirstPacketSchema = { model: generation.model };
const firstPacket: ChatStreamFirstPacketSchema = { model: chunk.model };
text = JSON.stringify(firstPacket) + text;
}
return { text, close: generation.done };
return { text, close: chunk.done };
};
}
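A hypothetical walk-through of the parser on two `json-nl` lines, showing the first-packet model-name prefix that the chat client strips back out:

```ts
const parse = createOllamaChatCompletionStreamParser();

parse('{"model":"llama2","message":{"role":"assistant","content":"Hi"},"done":false}');
// -> { text: '{"model":"llama2"}Hi', close: false }   (first packet carries the model name)

parse('{"model":"llama2","done":true}');
// -> { text: '', close: true }
```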
@@ -248,8 +257,9 @@ function createEventStreamTransformer(vendorTextParser: AIStreamParser, inputFor
if (close)
controller.terminate();
} catch (error: any) {
// console.log(`/api/llms/stream: parse issue: ${error?.message || error}`);
controller.enqueue(textEncoder.encode(`[Stream Issue] ${dialectLabel}: ${safeErrorString(error) || 'Unknown stream parsing error'}`));
if (SERVER_DEBUG_WIRE)
console.log(' - E: parse issue:', event.data, error?.message || error);
controller.enqueue(textEncoder.encode(` **[Stream Issue] ${dialectLabel}: ${safeErrorString(error) || 'Unknown stream parsing error'}**`));
controller.terminate();
}
};
@@ -1,24 +1,29 @@
import * as React from 'react';
import { Box, Button, Chip, FormControl, Input, Option, Select, Stack, Typography } from '@mui/joy';
import { Box, Button, Chip, FormControl, IconButton, Input, Option, Select, Stack, Typography } from '@mui/joy';
import LaunchIcon from '@mui/icons-material/Launch';
import FormatListNumberedRtlIcon from '@mui/icons-material/FormatListNumberedRtl';
import { FormLabelStart } from '~/common/components/forms/FormLabelStart';
import { GoodModal } from '~/common/components/GoodModal';
import { GoodTooltip } from '~/common/components/GoodTooltip';
import { InlineError } from '~/common/components/InlineError';
import { Link } from '~/common/components/Link';
import { apiQuery } from '~/common/util/trpc.client';
import { settingsGap } from '~/common/app.theme';
import type { OllamaAccessSchema } from '../../transports/server/ollama/ollama.router';
import { InlineError } from '~/common/components/InlineError';
export function OllamaAdmin(props: { access: OllamaAccessSchema, onClose: () => void }) {
export function OllamaAdministration(props: { access: OllamaAccessSchema, onClose: () => void }) {
// state
const [sortByPulls, setSortByPulls] = React.useState<boolean>(false);
const [modelName, setModelName] = React.useState<string | null>('llama2');
const [modelTag, setModelTag] = React.useState<string>('');
// external state
const { data: pullable } = apiQuery.llmOllama.adminListPullable.useQuery({ access: props.access }, {
const { data: pullableData } = apiQuery.llmOllama.adminListPullable.useQuery({ access: props.access }, {
staleTime: 1000 * 60,
refetchOnWindowFocus: false,
});
@@ -26,7 +31,11 @@ export function OllamaAdmin(props: { access: OllamaAccessSchema, onClose: () =>
const { isLoading: isDeleting, status: deleteStatus, error: deleteError, mutate: deleteMutate, reset: deleteReset } = apiQuery.llmOllama.adminDelete.useMutation();
// derived state
const pullModelDescription = pullable?.pullable.find(p => p.id === modelName)?.description ?? null;
let pullable = pullableData?.pullable || [];
if (sortByPulls)
pullable = pullable.toSorted((a, b) => b.pulls - a.pulls);
const pullModelDescription = pullable.find(p => p.id === modelName)?.description ?? null;
const handleModelPull = () => {
deleteReset();
@@ -38,6 +47,7 @@ export function OllamaAdmin(props: { access: OllamaAccessSchema, onClose: () =>
modelName && deleteMutate({ access: props.access, name: modelName + (modelTag ? ':' + modelTag : '') });
};
return (
<GoodModal title='Ollama Administration' dividers open onClose={props.onClose}>
@@ -47,25 +57,48 @@ export function OllamaAdmin(props: { access: OllamaAccessSchema, onClose: () =>
However, we provide a way to pull models from the Ollama host, for convenience.
</Typography>
<Box sx={{ display: 'flex', gap: 1 }}>
<FormControl sx={{ flexGrow: 1 }}>
<Box sx={{ display: 'flex', flexFlow: 'row wrap', gap: 1 }}>
<FormControl sx={{ flexGrow: 1, flexBasis: 0.55 }}>
<FormLabelStart title='Name' />
<Select value={modelName || ''} onChange={(_event: any, value: string | null) => setModelName(value)}>
{pullable?.pullable.map(p =>
<Option key={p.id} value={p.id}>
{p.isNew === true && <Chip size='sm' variant='outlined'>New</Chip>} {p.label}
</Option>,
)}
</Select>
<Box sx={{ display: 'flex', gap: 1 }}>
<Select
value={modelName || ''}
onChange={(_event: any, value: string | null) => setModelName(value)}
sx={{ flexGrow: 1 }}
>
{pullable.map(p =>
<Option key={p.id} value={p.id}>
{p.isNew === true && <Chip size='sm' variant='outlined'>NEW</Chip>} {p.label}{sortByPulls && ` (${p.pulls.toLocaleString()})`}
</Option>,
)}
</Select>
<GoodTooltip title='Sort by Downloads'>
<IconButton
variant={sortByPulls ? 'solid' : 'outlined'}
onClick={() => setSortByPulls(!sortByPulls)}
>
<FormatListNumberedRtlIcon />
</IconButton>
</GoodTooltip>
</Box>
</FormControl>
<FormControl sx={{ flexGrow: 1 }}>
<FormControl sx={{ flexGrow: 1, flexBasis: 0.45 }}>
<FormLabelStart title='Tag' />
<Input
variant='outlined' placeholder='latest'
value={modelTag || ''} onChange={event => setModelTag(event.target.value)}
sx={{ minWidth: 100 }}
slotProps={{ input: { size: 10 } }} // halve the min width
/>
<Box sx={{ display: 'flex', gap: 1 }}>
<Input
variant='outlined' placeholder='latest'
value={modelTag || ''} onChange={event => setModelTag(event.target.value)}
sx={{ minWidth: 80, flexGrow: 1 }}
slotProps={{ input: { size: 10 } }} // halve the min width
/>
{!!modelName && (
<IconButton
component={Link} href={`https://ollama.ai/library/${modelName}`} target='_blank'
>
<LaunchIcon />
</IconButton>
)}
</Box>
</FormControl>
</Box>
@@ -85,7 +118,7 @@ export function OllamaAdmin(props: { access: OllamaAccessSchema, onClose: () =>
{pullModelDescription}
</Typography>
<Box sx={{ display: 'flex', gap: 1 }}>
<Box sx={{ display: 'flex', flexWrap: 1, gap: 1 }}>
<Button
variant='outlined'
color={deleteStatus === 'error' ? 'danger' : deleteStatus === 'success' ? 'success' : 'primary'}
+2 -2
View File
@@ -11,7 +11,7 @@ import { asValidURL } from '~/common/util/urlUtils';
import { DModelSourceId, useModelsStore, useSourceSetup } from '../../store-llms';
import { ModelVendorOllama } from './ollama.vendor';
import { OllamaAdmin } from './OllamaAdmin';
import { OllamaAdministration } from './OllamaAdministration';
import { modelDescriptionToDLLM } from '../openai/OpenAISourceSetup';
@@ -63,7 +63,7 @@ export function OllamaSourceSetup(props: { sourceId: DModelSourceId }) {
{isError && <InlineError error={error} />}
{adminOpen && <OllamaAdmin access={access} onClose={() => setAdminOpen(false)} />}
{adminOpen && <OllamaAdministration access={access} onClose={() => setAdminOpen(false)} />}
</>;
}
@@ -1,12 +1,13 @@
import * as React from 'react';
import { Typography } from '@mui/joy';
import { Button, Typography } from '@mui/joy';
import { FormInputKey } from '~/common/components/forms/FormInputKey';
import { InlineError } from '~/common/components/InlineError';
import { Link } from '~/common/components/Link';
import { SetupFormRefetchButton } from '~/common/components/forms/SetupFormRefetchButton';
import { apiQuery } from '~/common/util/trpc.client';
import { getCallbackUrl } from '~/common/app.routes';
import { DModelSourceId, useModelsStore, useSourceSetup } from '../../store-llms';
import { modelDescriptionToDLLM } from '../openai/OpenAISourceSetup';
@@ -38,6 +39,16 @@ export function OpenRouterSourceSetup(props: { sourceId: DModelSourceId }) {
staleTime: Infinity,
});
const handleOpenRouterLogin = () => {
// replace the current page with the OAuth page
const callbackUrl = getCallbackUrl('openrouter');
const oauthUrl = 'https://openrouter.ai/auth?callback_url=' + encodeURIComponent(callbackUrl);
window.open(oauthUrl, '_self');
// ...bye / see you soon at the callback location...
};
return <>
{/*<Box sx={{ display: 'flex', gap: 1, alignItems: 'center' }}>*/}
@@ -53,7 +64,7 @@ export function OpenRouterSourceSetup(props: { sourceId: DModelSourceId }) {
<FormInputKey
id='openrouter-key' label='OpenRouter API Key'
rightLabel={<>{needsUserKey
? !oaiKey && <Link level='body-sm' href='https://openrouter.ai/keys' target='_blank'>create key</Link>
? !oaiKey && <Link level='body-sm' href='https://openrouter.ai/keys' target='_blank'>your keys</Link>
: '✔️ already set in server'
} {oaiKey && keyValid && <Link level='body-sm' href='https://openrouter.ai/activity' target='_blank'>check usage</Link>}
</>}
@@ -62,7 +73,14 @@ export function OpenRouterSourceSetup(props: { sourceId: DModelSourceId }) {
placeholder='sk-or-...'
/>
<SetupFormRefetchButton refetch={refetch} disabled={!shallFetchSucceed || isFetching} error={isError} />
<SetupFormRefetchButton
refetch={refetch} disabled={!shallFetchSucceed || isFetching} error={isError}
leftButton={
<Button color='neutral' variant={(needsUserKey && !keyValid) ? 'solid' : 'outlined'} onClick={handleOpenRouterLogin}>
OpenRouter Login
</Button>
}
/>
{isError && <InlineError error={error} />}