Compare commits

...

14 Commits

| Author | SHA1 | Message | Date |
|---|---|---|---|
| Enrico Ros | 6053636f66 | OpenRouter: OAuth login support | 2023-12-11 22:35:40 -08:00 |
| Enrico Ros | f2e2aee672 | 1.7.2: Stable Patch Version | 2023-12-11 21:22:31 -08:00 |
| Enrico Ros | 11cbb2bbf0 | OpenRouter: update models | 2023-12-11 21:21:22 -08:00 |
| Enrico Ros | 30bd19d6ce | HTML Table to Markdown Table: improve reliability and ignore hidden data | 2023-12-11 20:46:34 -08:00 |
| Enrico Ros | d0b5c02062 | Improve how Stream errors are shown | 2023-12-11 18:22:15 -08:00 |
| Enrico Ros | 771192e406 | Ollama: support ollama errors via API | 2023-12-11 18:19:38 -08:00 |
| Enrico Ros | 13f502bd76 | 1.7.1: Release (Ollama chat). #270 | 2023-12-10 22:17:35 -08:00 |
| Enrico Ros | 11055b12ca | Ollama: use the new Chat endpoint. Closes #270 | 2023-12-10 22:12:51 -08:00 |
| Enrico Ros | d0ea96eec0 | Ollama: Admin: optional sort by Pulls, and UI link to the Model page | 2023-12-10 22:03:55 -08:00 |
| Enrico Ros | 02eafc03f1 | Ollama: update models, and sort by Featured | 2023-12-10 22:01:50 -08:00 |
| Enrico Ros | 33d07a0313 | Ollama: update documentation | 2023-12-10 21:30:30 -08:00 |
| Enrico Ros | 763b852148 | Ollama: administration: external link | 2023-12-10 20:24:20 -08:00 |
| Enrico Ros | d5b0617fd7 | Comment for now | 2023-12-10 06:14:49 -08:00 |
| Enrico Ros | e3ce83674c | Update Ollama | 2023-12-10 06:09:54 -08:00 |
21 changed files with 539 additions and 167 deletions
+3 -1
View File
@@ -21,7 +21,7 @@ shows the current developments and future ideas.
- Got a suggestion? [_Add your roadmap ideas_](https://github.com/enricoros/big-agi/issues/new?&template=roadmap-request.md)
- Want to contribute? [_Pick up a task!_](https://github.com/users/enricoros/projects/4/views/4) - _easy_ to _pro_
### What's New in 1.7.0 · Dec 10, 2023 · Attachment Theory 🌟
### What's New in 1.7.2 · Dec 12, 2023 · Attachment Theory 🌟
- **Attachments System Overhaul**: Drag, paste, link, snap, text, images, PDFs and more. [#251](https://github.com/enricoros/big-agi/issues/251)
- **Desktop Webcam Capture**: Image capture now available as Labs feature. [#253](https://github.com/enricoros/big-agi/issues/253)
@@ -31,6 +31,8 @@ shows the current developments and future ideas.
- Optimized Voice Input and Performance
- Latest Ollama and Oobabooga models
- For developers: **Password Protection**: HTTP Basic Auth. [Learn How](https://github.com/enricoros/big-agi/blob/main/docs/deploy-authentication.md)
- [1.7.1]: Improved Ollama chats. [#270](https://github.com/enricoros/big-agi/issues/270)
- [1.7.2]: Updated OpenRouter models (incl. Mixtral 8x7B)
### What's New in 1.6.0 - Nov 28, 2023
+3 -1
View File
@@ -10,7 +10,7 @@ by release.
- work in progress: [big-AGI open roadmap](https://github.com/users/enricoros/projects/4/views/2), [help here](https://github.com/users/enricoros/projects/4/views/4)
- milestone: [1.8.0](https://github.com/enricoros/big-agi/milestone/8)
### What's New in 1.7.0 · Dec 10, 2023 · Attachment Theory 🌟
### What's New in 1.7.2 · Dec 11, 2023 · Attachment Theory 🌟
- **Attachments System Overhaul**: Drag, paste, link, snap, text, images, PDFs and more. [#251](https://github.com/enricoros/big-agi/issues/251)
- **Desktop Webcam Capture**: Image capture now available as Labs feature. [#253](https://github.com/enricoros/big-agi/issues/253)
@@ -20,6 +20,8 @@ by release.
- Optimized Voice Input and Performance
- Latest Ollama and Oobabooga models
- For developers: **Password Protection**: HTTP Basic Auth. [Learn How](https://github.com/enricoros/big-agi/blob/main/docs/deploy-authentication.md)
- [1.7.1]: Improved Ollama chats. [#270](https://github.com/enricoros/big-agi/issues/270)
- [1.7.2]: Updated OpenRouter models (incl. Mixtral 8x7B)
### What's New in 1.6.0 - Nov 28, 2023 · Surf's Up
+10 -5
View File
@@ -5,15 +5,20 @@ This guide helps you connect [Ollama](https://ollama.ai) [models](https://ollama
experience. The integration brings the popular big-AGI features to Ollama, including: voice chats,
editing tools, model switching, personas, and more.
_Last updated Dec 11, 2023_
![config-local-ollama-0-example.png](pixels/config-ollama-0-example.png)
## Quick Integration Guide
1. **Ensure Ollama API Server is Running**: Before starting, make sure your Ollama API server is up and running.
2. **Add Ollama as a Model Source**: In `big-AGI`, navigate to the **Models** section, select **Add a model source**, and choose **Ollama**.
3. **Enter Ollama Host URL**: Provide the Ollama Host URL where the API server is accessible (e.g., `http://localhost:11434`).
4. **Refresh Model List**: Once connected, refresh the list of available models to include the Ollama models.
5. **Start Using AI Personas**: Select an Ollama model and begin interacting with AI personas tailored to your needs.
1. **Ensure Ollama API Server is Running**: Follow the official instructions to get Ollama up and running on your machine
2. **Add Ollama as a Model Source**: In `big-AGI`, navigate to the **Models** section, select **Add a model source**, and choose **Ollama**
3. **Enter Ollama Host URL**: Provide the Ollama Host URL where the API server is accessible (e.g., `http://localhost:11434`)
4. **Refresh Model List**: Once connected, refresh the list of available models to include the Ollama models
> Optional: use the Ollama Admin interface to see which models are available and 'Pull' them to your local machine. Note
that this operation will likely time out, due to the Edge Functions timeout on the big-AGI server while pulling, and
you'll have to press the 'Pull' button again until a green message appears.
5. **Chat with Ollama models**: select an Ollama model and begin chatting with AI personas
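For step 1 above, a quick way to confirm the local server is reachable is to hit the same tags endpoint this changeset maps as `OLLAMA_PATH_TAGS`; a minimal TypeScript sketch, assuming the default install on port 11434:

```ts
// Sanity check: list the models served by a local Ollama instance.
// Assumes the default host http://localhost:11434 (adjust if your host differs).
async function listLocalOllamaModels(host = 'http://localhost:11434'): Promise<string[]> {
  const res = await fetch(`${host}/api/tags`);
  if (!res.ok)
    throw new Error(`Ollama server not reachable: ${res.status} ${res.statusText}`);
  const data = await res.json() as { models: { name: string }[] };
  return data.models.map(m => m.name);
}

// usage: listLocalOllamaModels().then(names => console.log('Ollama models:', names));
```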
### Ollama: installation and Setup
+2 -2
View File
@@ -1,12 +1,12 @@
{
"name": "big-agi",
"version": "1.7.0",
"version": "1.7.2",
"lockfileVersion": 3,
"requires": true,
"packages": {
"": {
"name": "big-agi",
"version": "1.7.0",
"version": "1.7.2",
"hasInstallScript": true,
"dependencies": {
"@dqbd/tiktoken": "^1.0.7",
+1 -1
View File
@@ -1,6 +1,6 @@
{
"name": "big-agi",
"version": "1.7.0",
"version": "1.7.2",
"private": true,
"scripts": {
"dev": "next dev",
+98
View File
@@ -0,0 +1,98 @@
import * as React from 'react';
import { useRouter } from 'next/router';
import { Box, Typography } from '@mui/joy';
import { useModelsStore } from '~/modules/llms/store-llms';
import { AppLayout } from '~/common/layout/AppLayout';
import { InlineError } from '~/common/components/InlineError';
import { apiQuery } from '~/common/util/trpc.client';
import { navigateToIndex } from '~/common/app.routes';
import { openLayoutModelsSetup } from '~/common/layout/store-applayout';
function CallbackOpenRouterPage(props: { openRouterCode: string | undefined }) {
// external state
const { data, isError, error, isLoading } = apiQuery.backend.exchangeOpenRouterKey.useQuery({ code: props.openRouterCode || '' }, {
enabled: !!props.openRouterCode,
refetchOnWindowFocus: false,
staleTime: Infinity,
});
// derived state
const isErrorInput = !props.openRouterCode;
const openRouterKey = data?.key ?? undefined;
const isSuccess = !!openRouterKey;
// Success: save the key and redirect to the chat app
React.useEffect(() => {
if (!isSuccess)
return;
// 1. Save the key as the client key
useModelsStore.getState().setOpenRoutersKey(openRouterKey);
// 2. Navigate to the chat app
navigateToIndex(true).then(() => openLayoutModelsSetup());
}, [isSuccess, openRouterKey]);
return (
<Box sx={{
flexGrow: 1,
backgroundColor: 'background.level1',
overflowY: 'auto',
display: 'flex', justifyContent: 'center',
p: { xs: 3, md: 6 },
}}>
<Box sx={{
// my: 'auto',
display: 'flex', flexDirection: 'column', alignItems: 'center',
gap: 4,
}}>
<Typography level='title-lg'>
Welcome Back
</Typography>
{isLoading && <Typography level='body-sm'>Loading...</Typography>}
{isErrorInput && <InlineError error='There was an issue retrieving the code from OpenRouter.' />}
{isError && <InlineError error={error} />}
{data && (
<Typography level='body-md'>
Success! You can now close this window.
</Typography>
)}
</Box>
</Box>
);
}
/**
* This page will be invoked by OpenRouter as a Callback
*
* Docs: https://openrouter.ai/docs#oauth
* Example URL: https://localhost:3000/link/callback_openrouter?code=SomeCode
*/
export default function Page() {
// get the 'code=...' from the URL
const { query } = useRouter();
const { code: openRouterCode } = query;
return (
<AppLayout suspendAutoModelsSetup>
<CallbackOpenRouterPage openRouterCode={openRouterCode as (string | undefined)} />
</AppLayout>
);
}
@@ -254,7 +254,7 @@ export async function attachmentPerformConversion(attachment: Readonly<Attachmen
case 'rich-text-table':
let mdTable: string;
try {
mdTable = htmlTableToMarkdown(input.altData!);
mdTable = htmlTableToMarkdown(input.altData!, false);
} catch (error) {
// fallback to text/plain
mdTable = inputDataToString(input.data);
+5 -2
View File
@@ -67,9 +67,10 @@ export const NewsItems: NewsItem[] = [
],
},*/
{
versionCode: '1.7.0',
versionCode: '1.7.2',
versionName: 'Attachment Theory',
versionDate: new Date('2023-12-10T12:00:00Z'), // new Date().toISOString()
versionDate: new Date('2023-12-11T06:00:00Z'), // new Date().toISOString()
// versionDate: new Date('2023-12-10T12:00:00Z'), // 1.7.0
items: [
{ text: <>Redesigned <B href={RIssues + '/251'}>attachments system</B>: drag, paste, link, snap, images, text, pdfs</> },
{ text: <>Desktop <B href={RIssues + '/253'}>webcam access</B> for direct image capture (Labs option)</> },
@@ -79,6 +80,8 @@ export const NewsItems: NewsItem[] = [
{ text: <>{platformAwareKeystrokes('Ctrl+Shift+O')}: quick access to model options</> },
{ text: <>Optimized voice input and performance</> },
{ text: <>Latest Ollama and Oobabooga models</> },
{ text: <>1.7.1: Improved <B href={RIssues + '/270'}>Ollama chats</B></> },
{ text: <>1.7.2: Updated OpenRouter models</> },
],
},
{
+13
View File
@@ -13,9 +13,22 @@ export const ROUTE_INDEX = '/';
export const ROUTE_APP_CHAT = '/';
export const ROUTE_APP_LINK_CHAT = '/link/chat/:linkId';
export const ROUTE_APP_NEWS = '/news';
const ROUTE_CALLBACK_OPENROUTER = '/link/callback_openrouter';
export const getIndexLink = () => ROUTE_INDEX;
export const getCallbackUrl = (source: 'openrouter') => {
const callbackUrl = new URL(window.location.href);
switch (source) {
case 'openrouter':
callbackUrl.pathname = ROUTE_CALLBACK_OPENROUTER;
break;
default:
throw new Error(`Unknown source: ${source}`);
}
return callbackUrl.toString();
};
export const getChatLinkRelativePath = (chatLinkId: string) => ROUTE_APP_LINK_CHAT.replace(':linkId', chatLinkId);
const navigateFn = (path: string) => (replace?: boolean): Promise<boolean> =>
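A short sketch of how the new helper is meant to be used by the OpenRouter login flow added later in this changeset (the deployment host below is a hypothetical placeholder):

```ts
// Hypothetical example: with big-AGI served at https://my-big-agi.example.com/some/page,
// getCallbackUrl('openrouter') rewrites only the pathname of the current location:
//   'https://my-big-agi.example.com/link/callback_openrouter'
const callbackUrl = getCallbackUrl('openrouter');

// OpenRouterSourceSetup then points the browser at the OpenRouter OAuth page,
// which redirects back to the callback route with a ?code=... parameter:
const oauthUrl = 'https://openrouter.ai/auth?callback_url=' + encodeURIComponent(callbackUrl);
window.open(oauthUrl, '_self');
```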
+1
View File
@@ -46,6 +46,7 @@ export const appTheme = extendTheme({
text: {
icon: 'var(--joy-palette-neutral-700)', // <IconButton color='neutral' /> icon color
secondary: 'var(--joy-palette-neutral-800)', // increase contrast a bit
// tertiary: 'var(--joy-palette-neutral-700)', // increase contrast a bit
},
// popup [white] > surface [50] > level1 [100] > level2 [200] > level3 [300] > body [white -> 400]
background: {
+40 -5
View File
@@ -2,11 +2,13 @@
* @fileoverview Utility functions for Markdown.
*/
import { isBrowser } from '~/common/util/pwaUtils';
/**
* Quick and dirty conversion of HTML tables to Markdown tables.
* Big plus: doesn't require any dependencies.
*/
export function htmlTableToMarkdown(html: string): string {
export function htmlTableToMarkdown(html: string, includeInvisible: boolean): string {
const parser = new DOMParser();
const doc = parser.parseFromString(html, 'text/html');
const table = doc.querySelector('table');
@@ -16,20 +18,53 @@ export function htmlTableToMarkdown(html: string): string {
const headerCells = table.querySelectorAll('thead th');
if (headerCells.length > 0) {
const headerRow = '| ' + Array.from(headerCells)
.map(cell => cell.textContent?.trim() || '')
.join(' | ') + '| ';
.map(cell => getTextWithSpaces(cell, includeInvisible).trim())
.join(' | ') + ' |';
markdownRows.push(headerRow);
markdownRows.push('|:' + Array(headerCells.length).fill('-').join('|:') + '|');
markdownRows.push('|:' + Array(headerCells.length).fill('---').join('|:') + '|');
}
const bodyRows = table.querySelectorAll('tbody tr');
for (const row of Array.from(bodyRows)) {
const rowCells = row.querySelectorAll('td');
const markdownRow = '| ' + Array.from(rowCells)
.map(cell => cell.textContent?.trim() || '')
.map(cell => getTextWithSpaces(cell, includeInvisible).trim())
.join(' | ') + ' |';
markdownRows.push(markdownRow);
}
return markdownRows.join('\n');
}
// Helper function to get text with spaces, ignoring hidden elements
function getTextWithSpaces(node: Node, includeInvisible: boolean): string {
let text = '';
node.childNodes.forEach(child => {
if (child.nodeType === Node.TEXT_NODE)
text += child.textContent;
else if (child.nodeType === Node.ELEMENT_NODE)
if (includeInvisible || isVisible(child as Element))
text += ' ' + getTextWithSpaces(child, includeInvisible) + ' ';
});
return text;
}
// Helper function to determine if an element is visible
function isVisible(element: Element): boolean {
if (!isBrowser) return true;
// if the cell is hidden, don't include it
const style = window.getComputedStyle(element);
if (style.display === 'none' || style.visibility === 'hidden')
return false;
// Check for common classes used to hide content or indicate tooltip/popover content.
// You may need to add more classes here based on your actual HTML/CSS.
const ignoredClasses = ['hidden', 'group-hover', 'tooltip', 'pointer-events-none', 'opacity-0'];
for (const ignoredClass of ignoredClasses)
if (element.classList.contains(ignoredClass))
return false;
// Otherwise, the element is considered visible
return true;
}
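A small usage sketch of the updated signature; this only runs in a browser (it relies on `DOMParser` and `getComputedStyle`), and the sample HTML is illustrative:

```ts
// Illustrative input: a table with a cell whose tooltip is kept hidden via a CSS class.
const html = `
  <table>
    <thead><tr><th>Model</th><th>Pulls</th></tr></thead>
    <tbody><tr><td>mistral<span class="hidden">tooltip text</span></td><td>70300</td></tr></tbody>
  </table>`;

// includeInvisible = false drops the hidden <span>, so the tooltip text
// does not leak into the Markdown output:
const markdown = htmlTableToMarkdown(html, false);
// | Model | Pulls |
// |:---|:---|
// | mistral | 70300 |
```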
+16
View File
@@ -1,5 +1,8 @@
import { z } from 'zod';
import { createTRPCRouter, publicProcedure } from '~/server/api/trpc.server';
import { env } from '~/server/env.mjs';
import { fetchJsonOrTRPCError } from '~/server/api/trpc.serverutils';
import { analyticsListCapabilities } from './backend.analytics';
@@ -30,4 +33,17 @@ export const backendRouter = createTRPCRouter({
};
}),
// The following are used for various OAuth integrations
/* Exchange the OpenRouter 'code' (from PKCE) for an OpenRouter API Key */
exchangeOpenRouterKey: publicProcedure
.input(z.object({ code: z.string() }))
.query(async ({ ctx, input }) => {
// Documented here: https://openrouter.ai/docs#oauth
return await fetchJsonOrTRPCError<{ key: string }, { code: string }>('https://openrouter.ai/api/v1/auth/keys', 'POST', {}, {
code: input.code,
}, 'Backend.exchangeOpenRouterKey');
}),
});
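Under the hood this is a single POST to OpenRouter's key-exchange endpoint, as documented in the linked OAuth docs; a minimal standalone sketch of the same call:

```ts
// Minimal sketch of the exchange performed by fetchJsonOrTRPCError above:
// POST the one-time 'code' received at the callback URL, get back a regular API key.
async function exchangeOpenRouterCode(code: string): Promise<string> {
  const res = await fetch('https://openrouter.ai/api/v1/auth/keys', {
    method: 'POST',
    headers: { 'Content-Type': 'application/json' },
    body: JSON.stringify({ code }),
  });
  if (!res.ok)
    throw new Error(`OpenRouter key exchange failed: ${res.status}`);
  const { key } = await res.json() as { key: string };
  return key; // an 'sk-or-...' style key, stored client-side via setOpenRoutersKey
}
```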
+18 -5
View File
@@ -2,7 +2,8 @@ import { create } from 'zustand';
import { shallow } from 'zustand/shallow';
import { persist } from 'zustand/middleware';
import { ModelVendorId } from './vendors/IModelVendor';
import type { ModelVendorId } from './vendors/IModelVendor';
import type { SourceSetupOpenRouter } from './vendors/openrouter/openrouter.vendor';
/**
@@ -76,6 +77,9 @@ interface ModelsActions {
setChatLLMId: (id: DLLMId | null) => void;
setFastLLMId: (id: DLLMId | null) => void;
setFuncLLMId: (id: DLLMId | null) => void;
// special
setOpenRoutersKey: (key: string) => void;
}
type LlmsStore = ModelsData & ModelsActions;
@@ -162,13 +166,22 @@ export const useModelsStore = create<LlmsStore>()(
set(state => ({
sources: state.sources.map((source: DModelSource): DModelSource =>
source.id === id
? {
...source,
setup: { ...source.setup, ...partialSetup },
} : source,
? { ...source, setup: { ...source.setup, ...partialSetup } }
: source,
),
})),
setOpenRoutersKey: (key: string) =>
set(state => {
const openRouterSource = state.sources.find(source => source.vId === 'openrouter');
if (!openRouterSource) return state;
return {
sources: state.sources.map(source => source.id === openRouterSource.id
? { ...source, setup: { ...source.setup, oaiKey: key satisfies SourceSetupOpenRouter['oaiKey'] } }
: source),
};
}),
}),
{
name: 'app-models',
@@ -3,54 +3,57 @@
* descriptions for the models.
* (nor does it reliably provide context window sizes) - TODO: open a bug upstream
*
* from: https://ollama.ai/library?sort=popular
* from: https://ollama.ai/library?sort=featured
*/
export const OLLAMA_BASE_MODELS: { [key: string]: { description: string, pulls: number, added?: string } } = {
'mistral': { description: 'The Mistral 7B model released by Mistral AI', pulls: 56100 },
'llama2': { description: 'The most popular model for general use.', pulls: 117400 },
'codellama': { description: 'A large language model that can use text prompts to generate and discuss code.', pulls: 61500 },
'llama2-uncensored': { description: 'Uncensored Llama 2 model by George Sung and Jarrad Hope.', pulls: 26800 },
'orca-mini': { description: 'A general-purpose model ranging from 3 billion parameters to 70 billion, suitable for entry-level hardware.', pulls: 23000 },
'vicuna': { description: 'General use chat model based on Llama and Llama 2 with 2K to 16K context sizes.', pulls: 20600 },
'wizard-vicuna-uncensored': { description: 'Wizard Vicuna Uncensored is a 7B, 13B, and 30B parameter model based on Llama 2 uncensored by Eric Hartford.', pulls: 12100 },
'phind-codellama': { description: 'Code generation model based on CodeLlama.', pulls: 9760 },
'wizardcoder': { description: 'Llama based code generation model focused on Python.', pulls: 9002 },
'mistral-openorca': { description: 'Mistral OpenOrca is a 7 billion parameter model, fine-tuned on top of the Mistral 7B model using the OpenOrca dataset.', pulls: 8671 },
'nous-hermes': { description: 'General use models based on Llama and Llama 2 from Nous Research.', pulls: 8478 },
'zephyr': { description: 'Zephyr beta is a fine-tuned 7B version of mistral that was trained on on a mix of publicly available, synthetic datasets.', pulls: 8142 },
'wizard-math': { description: 'Model focused on math and logic problems', pulls: 7426 },
'llama2-chinese': { description: 'Llama 2 based model fine tuned to improve Chinese dialogue ability.', pulls: 7035 },
'stable-beluga': { description: 'Llama 2 based model fine tuned on an Orca-style dataset. Originally called Free Willy.', pulls: 6140 },
'falcon': { description: 'A large language model built by the Technology Innovation Institute (TII) for use in summarization, text generation, and chat bots.', pulls: 5865 },
'codeup': { description: 'Great code generation model based on Llama2.', pulls: 5534 },
'everythinglm': { description: 'Uncensored Llama2 based model with 16k context size.', pulls: 4696 },
'medllama2': { description: 'Fine-tuned Llama 2 model to answer medical questions based on an open source medical dataset.', pulls: 4275 },
'wizardlm-uncensored': { description: 'Uncensored version of Wizard LM model.', pulls: 4227 },
'deepseek-coder': { description: 'DeepSeek Coder is trained from scratch on both 87% code and 13% natural language in English and Chinese. Each of the models are pre-trained on 2 trillion tokens.', pulls: 3663, added: '20231129' },
'wizard-vicuna': { description: 'Wizard Vicuna is a 13B parameter model based on Llama 2 trained by MelodysDreamj.', pulls: 3343 },
'orca2': { description: 'Orca 2 is built by Microsoft research, and are a fine-tuned version of Meta\'s Llama 2 models. The model is designed to excel particularly in reasoning.', pulls: 3134, added: '20231129' },
'open-orca-platypus2': { description: 'Merge of the Open Orca OpenChat model and the Garage-bAInd Platypus 2 model. Designed for chat and code generation.', pulls: 3050 },
'starcoder': { description: 'StarCoder is a code generation model trained on 80+ programming languages.', pulls: 2981 },
'dolphin2.2-mistral': { description: 'An instruct-tuned model based on Mistral. Version 2.2 is fine-tuned for improved conversation and empathy.', pulls: 2636 },
'yarn-mistral': { description: 'An extension of Mistral to support a context of up to 128k tokens.', pulls: 2328 },
'openchat': { description: 'A family of open-source models trained on a wide variety of data, surpassing ChatGPT on various benchmarks.', pulls: 2281, added: '20231129' },
'openhermes2.5-mistral': { description: 'OpenHermes 2.5 Mistral 7B is a Mistral 7B fine-tune, a continuation of OpenHermes 2 model, which trained on additional code datasets.', pulls: 2101 },
'yi': { description: 'A high-performing, bilingual base model.', pulls: 1806 },
'samantha-mistral': { description: 'A companion assistant trained in philosophy, psychology, and personal relationships. Based on Mistral.', pulls: 1803 },
'yarn-llama2': { description: 'An extension of Llama 2 that supports a context of up to 128k tokens.', pulls: 1605 },
'sqlcoder': { description: 'SQLCoder is a code completion model fined-tuned on StarCoder for SQL generation tasks.', pulls: 1584 },
'openhermes2-mistral': { description: 'OpenHermes 2 Mistral is a 7B model fine-tuned on Mistral with 900,000 entries of primarily GPT-4 generated data from open datasets.', pulls: 1560 },
'neural-chat': { description: 'A fine-tuned model based on Mistral with good coverage of domain and language.', pulls: 1338, added: '20231129' },
'wizardlm': { description: 'General use 70 billion parameter model based on Llama 2.', pulls: 1253 },
'dolphin2.1-mistral': { description: 'An instruct-tuned model based on Mistral and trained on a dataset filtered to remove alignment and bias.', pulls: 1163 },
'mistrallite': { description: 'MistralLite is a fine-tuned model based on Mistral with enhanced capabilities of processing long contexts.', pulls: 1099 },
'codebooga': { description: 'A high-performing code instruct model created by merging two existing code models.', pulls: 1042 },
'goliath': { description: 'A language model created by combining two fine-tuned Llama 2 70B models into one.', pulls: 728, added: '20231129' },
'xwinlm': { description: 'Conversational model based on Llama 2 that performs competitively on various benchmarks.', pulls: 593 },
'nexusraven': { description: 'Nexus Raven is a 13B instruction tuned model for function calling tasks.', pulls: 585 },
'alfred': { description: 'A robust conversational model designed to be used for both chat and instruct use cases.', pulls: 573, added: '20231129' },
'starling-lm': { description: 'Starling is a large language model trained by reinforcement learning from AI feedback focused on improving chatbot helpfulness.', pulls: 446, added: '20231129' },
'meditron': { description: 'Open-source medical large language model adapted from Llama 2 to the medical domain.', pulls: 100, added: '20231129' },
'deepseek-llm': { description: 'An advanced language model crafted with 2 trillion bilingual tokens.', pulls: 11, added: '20231129' },
'starling-lm': { description: 'Starling is a large language model trained by reinforcement learning from AI feedback focused on improving chatbot helpfulness.', pulls: 2353, added: '20231129' },
'neural-chat': { description: 'A fine-tuned model based on Mistral with good coverage of domain and language.', pulls: 3089, added: '20231129' },
'mistral': { description: 'The Mistral 7B model released by Mistral AI', pulls: 70300 },
'yi': { description: 'A high-performing, bilingual base model.', pulls: 2673 },
'llama2': { description: 'The most popular model for general use.', pulls: 141000 },
'codellama': { description: 'A large language model that can use text prompts to generate and discuss code.', pulls: 71400 },
'llama2-uncensored': { description: 'Uncensored Llama 2 model by George Sung and Jarrad Hope.', pulls: 30900 },
'orca-mini': { description: 'A general-purpose model ranging from 3 billion parameters to 70 billion, suitable for entry-level hardware.', pulls: 26000 },
'vicuna': { description: 'General use chat model based on Llama and Llama 2 with 2K to 16K context sizes.', pulls: 21800 },
'wizard-vicuna-uncensored': { description: 'Wizard Vicuna Uncensored is a 7B, 13B, and 30B parameter model based on Llama 2 uncensored by Eric Hartford.', pulls: 13700 },
'phind-codellama': { description: 'Code generation model based on CodeLlama.', pulls: 10600 },
'zephyr': { description: 'Zephyr beta is a fine-tuned 7B version of mistral that was trained on on a mix of publicly available, synthetic datasets.', pulls: 10200 },
'wizardcoder': { description: 'Llama based code generation model focused on Python.', pulls: 9895 },
'mistral-openorca': { description: 'Mistral OpenOrca is a 7 billion parameter model, fine-tuned on top of the Mistral 7B model using the OpenOrca dataset.', pulls: 9256 },
'nous-hermes': { description: 'General use models based on Llama and Llama 2 from Nous Research.', pulls: 8827 },
'wizard-math': { description: 'Model focused on math and logic problems', pulls: 7849 },
'llama2-chinese': { description: 'Llama 2 based model fine tuned to improve Chinese dialogue ability.', pulls: 7375 },
'deepseek-coder': { description: 'DeepSeek Coder is trained from scratch on both 87% code and 13% natural language in English and Chinese. Each of the models are pre-trained on 2 trillion tokens.', pulls: 7335, added: '20231129' },
'falcon': { description: 'A large language model built by the Technology Innovation Institute (TII) for use in summarization, text generation, and chat bots.', pulls: 6726 },
'stable-beluga': { description: 'Llama 2 based model fine tuned on an Orca-style dataset. Originally called Free Willy.', pulls: 6272 },
'codeup': { description: 'Great code generation model based on Llama2.', pulls: 5978 },
'orca2': { description: 'Orca 2 is built by Microsoft research, and are a fine-tuned version of Meta\'s Llama 2 models. The model is designed to excel particularly in reasoning.', pulls: 5854, added: '20231129' },
'everythinglm': { description: 'Uncensored Llama2 based model with 16k context size.', pulls: 5040 },
'medllama2': { description: 'Fine-tuned Llama 2 model to answer medical questions based on an open source medical dataset.', pulls: 4648 },
'wizardlm-uncensored': { description: 'Uncensored version of Wizard LM model.', pulls: 4536 },
'dolphin2.2-mistral': { description: 'An instruct-tuned model based on Mistral. Version 2.2 is fine-tuned for improved conversation and empathy.', pulls: 3638 },
'starcoder': { description: 'StarCoder is a code generation model trained on 80+ programming languages.', pulls: 3638 },
'wizard-vicuna': { description: 'Wizard Vicuna is a 13B parameter model based on Llama 2 trained by MelodysDreamj.', pulls: 3485 },
'openchat': { description: 'A family of open-source models trained on a wide variety of data, surpassing ChatGPT on various benchmarks.', pulls: 3438, added: '20231129' },
'open-orca-platypus2': { description: 'Merge of the Open Orca OpenChat model and the Garage-bAInd Platypus 2 model. Designed for chat and code generation.', pulls: 3145 },
'openhermes2.5-mistral': { description: 'OpenHermes 2.5 Mistral 7B is a Mistral 7B fine-tune, a continuation of OpenHermes 2 model, which trained on additional code datasets.', pulls: 3023 },
'yarn-mistral': { description: 'An extension of Mistral to support a context of up to 128k tokens.', pulls: 2775 },
'samantha-mistral': { description: 'A companion assistant trained in philosophy, psychology, and personal relationships. Based on Mistral.', pulls: 2192 },
'sqlcoder': { description: 'SQLCoder is a code completion model fined-tuned on StarCoder for SQL generation tasks', pulls: 1973 },
'yarn-llama2': { description: 'An extension of Llama 2 that supports a context of up to 128k tokens.', pulls: 1915 },
'openhermes2-mistral': { description: 'OpenHermes 2 Mistral is a 7B model fine-tuned on Mistral with 900,000 entries of primarily GPT-4 generated data from open datasets.', pulls: 1690 },
'meditron': { description: 'Open-source medical large language model adapted from Llama 2 to the medical domain.', pulls: 1667, added: '20231129' },
'wizardlm': { description: 'General use 70 billion parameter model based on Llama 2.', pulls: 1379 },
'mistrallite': { description: 'MistralLite is a fine-tuned model based on Mistral with enhanced capabilities of processing long contexts.', pulls: 1345 },
'deepseek-llm': { description: 'An advanced language model crafted with 2 trillion bilingual tokens.', pulls: 1318, added: '20231129' },
'dolphin2.1-mistral': { description: 'An instruct-tuned model based on Mistral and trained on a dataset filtered to remove alignment and bias.', pulls: 1302 },
'codebooga': { description: 'A high-performing code instruct model created by merging two existing code models.', pulls: 1254 },
'goliath': { description: 'A language model created by combining two fine-tuned Llama 2 70B models into one.', pulls: 946, added: '20231129' },
'stablelm-zephyr': { description: 'A lightweight chat model allowing accurate, and responsive output without requiring high-end hardware.', pulls: 945, added: '20231210' },
'nexusraven': { description: 'Nexus Raven is a 13B instruction tuned model for function calling tasks.', pulls: 860 },
'magicoder': { description: '🎩 Magicoder is a family of 7B parameter models trained on 75K synthetic instruction data using OSS-Instruct, a novel approach to enlightening LLMs with open-source code snippets.', pulls: 816, added: '20231210' },
'alfred': { description: 'A robust conversational model designed to be used for both chat and instruct use cases.', pulls: 804, added: '20231129' },
'xwinlm': { description: 'Conversational model based on Llama 2 that performs competitively on various benchmarks.', pulls: 706 },
};
export const OLLAMA_LAST_UPDATE: string = '20231129';
// export const OLLAMA_LAST_UPDATE: string = '20231210';
export const OLLAMA_PREV_UPDATE: string = '20231129';
@@ -1,4 +1,5 @@
import { z } from 'zod';
import { TRPCError } from '@trpc/server';
import { createTRPCRouter, publicProcedure } from '~/server/api/trpc.server';
import { env } from '~/server/env.mjs';
@@ -11,12 +12,15 @@ import { capitalizeFirstLetter } from '~/common/util/textUtils';
import { fixupHost, openAIChatGenerateOutputSchema, OpenAIHistorySchema, openAIHistorySchema, OpenAIModelSchema, openAIModelSchema } from '../openai/openai.router';
import { listModelsOutputSchema, ModelDescriptionSchema } from '../server.schemas';
import { OLLAMA_BASE_MODELS, OLLAMA_LAST_UPDATE } from './ollama.models';
import { wireOllamaGenerationSchema } from './ollama.wiretypes';
import { OLLAMA_BASE_MODELS, OLLAMA_PREV_UPDATE } from './ollama.models';
import { WireOllamaChatCompletionInput, wireOllamaChunkedOutputSchema } from './ollama.wiretypes';
// Default hosts
const DEFAULT_OLLAMA_HOST = 'http://127.0.0.1:11434';
export const OLLAMA_PATH_CHAT = '/api/chat';
const OLLAMA_PATH_TAGS = '/api/tags';
const OLLAMA_PATH_SHOW = '/api/show';
// Mappers
@@ -34,7 +38,23 @@ export function ollamaAccess(access: OllamaAccessSchema, apiPath: string): { hea
}
export function ollamaChatCompletionPayload(model: OpenAIModelSchema, history: OpenAIHistorySchema, stream: boolean) {
export const ollamaChatCompletionPayload = (model: OpenAIModelSchema, history: OpenAIHistorySchema, stream: boolean): WireOllamaChatCompletionInput => ({
model: model.id,
messages: history,
options: {
...(model.temperature && { temperature: model.temperature }),
},
// n: ...
// functions: ...
// function_call: ...
stream,
});
/* Unused: switched to the Chat endpoint (above). The implementation is left here for reference.
https://github.com/jmorganca/ollama/blob/main/docs/api.md#generate-a-completion
export function ollamaCompletionPayload(model: OpenAIModelSchema, history: OpenAIHistorySchema, stream: boolean) {
// if the first message is the system prompt, extract it
let systemPrompt: string | undefined = undefined;
@@ -62,7 +82,7 @@ export function ollamaChatCompletionPayload(model: OpenAIModelSchema, history: O
...(systemPrompt && { system: systemPrompt }),
stream,
};
}
}*/
async function ollamaGET<TOut extends object>(access: OllamaAccessSchema, apiPath: string /*, signal?: AbortSignal*/): Promise<TOut> {
const { headers, url } = ollamaAccess(access, apiPath);
@@ -104,6 +124,7 @@ const listPullableOutputSchema = z.object({
label: z.string(),
tag: z.string(),
description: z.string(),
pulls: z.number(),
isNew: z.boolean(),
})),
});
@@ -122,7 +143,8 @@ export const llmOllamaRouter = createTRPCRouter({
label: capitalizeFirstLetter(model_id),
tag: 'latest',
description: model.description,
isNew: !!model.added && model.added >= OLLAMA_LAST_UPDATE,
pulls: model.pulls,
isNew: !!model.added && model.added >= OLLAMA_PREV_UPDATE,
})),
};
}),
@@ -160,6 +182,7 @@ export const llmOllamaRouter = createTRPCRouter({
throw new Error('Ollama delete issue: ' + deleteOutput);
}),
/* Ollama: List the Models available */
listModels: publicProcedure
.input(accessOnlySchema)
@@ -167,7 +190,7 @@ export const llmOllamaRouter = createTRPCRouter({
.query(async ({ input }) => {
// get the models
const wireModels = await ollamaGET(input.access, '/api/tags');
const wireModels = await ollamaGET(input.access, OLLAMA_PATH_TAGS);
const wireOllamaListModelsSchema = z.object({
models: z.array(z.object({
name: z.string(),
@@ -180,7 +203,7 @@ export const llmOllamaRouter = createTRPCRouter({
// retrieve info for each of the models (/api/show, post call, in parallel)
const detailedModels = await Promise.all(models.map(async model => {
const wireModelInfo = await ollamaPOST(input.access, { 'name': model.name }, '/api/show');
const wireModelInfo = await ollamaPOST(input.access, { 'name': model.name }, OLLAMA_PATH_SHOW);
const wireOllamaModelInfoSchema = z.object({
license: z.string().optional(),
modelfile: z.string(),
@@ -221,12 +244,24 @@ export const llmOllamaRouter = createTRPCRouter({
.output(openAIChatGenerateOutputSchema)
.mutation(async ({ input: { access, history, model } }) => {
const wireGeneration = await ollamaPOST(access, ollamaChatCompletionPayload(model, history, false), '/api/generate');
const generation = wireOllamaGenerationSchema.parse(wireGeneration);
const wireGeneration = await ollamaPOST(access, ollamaChatCompletionPayload(model, history, false), OLLAMA_PATH_CHAT);
const generation = wireOllamaChunkedOutputSchema.parse(wireGeneration);
if ('error' in generation)
throw new TRPCError({
code: 'INTERNAL_SERVER_ERROR',
message: `Ollama chat-generation issue: ${generation.error}`,
});
if (!generation.message?.content)
throw new TRPCError({
code: 'INTERNAL_SERVER_ERROR',
message: `Ollama chat-generation API issue: ${JSON.stringify(wireGeneration)}`,
});
return {
role: 'assistant',
content: generation.response,
content: generation.message.content,
finish_reason: generation.done ? 'stop' : null,
};
}),
@@ -1,16 +1,76 @@
import { z } from 'zod';
export const wireOllamaGenerationSchema = z.object({
model: z.string(),
// created_at: z.string(), // commented because unused
response: z.string(),
done: z.boolean(),
// only on the last message
// context: z.array(z.number()),
// total_duration: z.number(),
// load_duration: z.number(),
// eval_duration: z.number(),
// prompt_eval_count: z.number(),
// eval_count: z.number(),
/**
* Chat Completion API - Request
* https://github.com/jmorganca/ollama/blob/main/docs/api.md#generate-a-chat-completion
*/
const wireOllamaChatCompletionInputSchema = z.object({
// required
model: z.string(),
messages: z.array(z.object({
role: z.enum(['assistant', 'system', 'user']),
content: z.string(),
})),
// optional
format: z.enum(['json']).optional(),
options: z.object({
// https://github.com/jmorganca/ollama/blob/main/docs/modelfile.md
// Maximum number of tokens to predict when generating text.
num_predict: z.number().int().optional(),
// Sets the random number seed to use for generation
seed: z.number().int().optional(),
// The temperature of the model
temperature: z.number().positive().optional(),
// Reduces the probability of generating nonsense (Default: 40)
top_k: z.number().positive().optional(),
// Works together with top-k. A higher value (e.g., 0.95) will lead to more diverse text. (Default 0.9)
top_p: z.number().positive().optional(),
}).optional(),
template: z.string().optional(), // overrides what is defined in the Modelfile
stream: z.boolean().optional(), // default: true
// Future Improvements?
// n: z.number().int().optional(), // number of completions to generate
// functions: ...
// function_call: ...
});
export type WireOllamaChatCompletionInput = z.infer<typeof wireOllamaChatCompletionInputSchema>;
/**
* Chat Completion or Generation APIs - Streaming Response
*/
export const wireOllamaChunkedOutputSchema = z.union([
// Chat Completion Chunk
z.object({
model: z.string(),
// created_at: z.string(), // commented because unused
// [Chat Completion] (exclusive with 'response')
message: z.object({
role: z.enum(['assistant' /*, 'system', 'user' Disabled on purpose, to validate the response */]),
content: z.string(),
}).optional(), // optional on the last message
// [Generation] (non-chat, exclusive with 'message')
//response: z.string().optional(),
done: z.boolean(),
// only on the last message
// context: z.array(z.number()), // non-chat endpoint
// total_duration: z.number(),
// prompt_eval_count: z.number(),
// prompt_eval_duration: z.number(),
// eval_count: z.number(),
// eval_duration: z.number(),
}),
// Possible Error
z.object({
error: z.string(),
}),
]);
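For reference, a hedged example of the shapes these schemas describe, with illustrative values (the request mirrors what `ollamaChatCompletionPayload` builds; the chunks mirror the streamed `/api/chat` output):

```ts
// Request body accepted by wireOllamaChatCompletionInputSchema:
const exampleRequest: WireOllamaChatCompletionInput = {
  model: 'llama2',
  messages: [
    { role: 'system', content: 'You are a helpful assistant.' },
    { role: 'user', content: 'Hello!' },
  ],
  options: { temperature: 0.7 },
  stream: true,
};

// Streamed chunks validated by wireOllamaChunkedOutputSchema: intermediate chunks carry
// a partial assistant message, the final chunk has done: true, and errors arrive as { error }.
const exampleChunks = [
  { model: 'llama2', message: { role: 'assistant', content: 'Hi' }, done: false },
  { model: 'llama2', message: { role: 'assistant', content: ' there!' }, done: false },
  { model: 'llama2', done: true },
];
```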
@@ -1,5 +1,6 @@
import type { ModelDescriptionSchema } from '../server.schemas';
import { LLM_IF_OAI_Chat, LLM_IF_OAI_Complete, LLM_IF_OAI_Fn, LLM_IF_OAI_Vision } from '../../../store-llms';
import { SERVER_DEBUG_WIRE } from '~/server/wire';
// [Azure] / [OpenAI]
@@ -236,8 +237,8 @@ export function oobaboogaModelToModelDescription(modelId: string, created: numbe
/**
* Created to reflect the doc page: https://openrouter.ai/docs
*
* Update prompt:
* "Please update the typescript object below (do not change the definition, just the object), based on the updated upstream documentation:"
* Update prompt (last updated 2023-12-12)
* "Please update the following typescript object (do not change the definition, just values, and do not miss any rows), based on the information provided thereafter:"
*
* fields:
* - cw: context window size (max tokens, total)
@@ -247,19 +248,24 @@ export function oobaboogaModelToModelDescription(modelId: string, created: numbe
*/
const orModelMap: { [id: string]: { name: string; cw: number; cp?: number; cc?: number; old?: boolean; unfilt?: boolean; } } = {
// 'openrouter/auto': { name: 'Auto (best for prompt)', cw: 128000, cp: undefined, cc: undefined, unfilt: undefined },
'mistralai/mistral-7b-instruct': { name: 'Mistral 7B Instruct (beta)', cw: 8192, cp: 0, cc: 0, unfilt: true },
'huggingfaceh4/zephyr-7b-beta': { name: 'Hugging Face: Zephyr 7B (beta)', cw: 4096, cp: 0, cc: 0, unfilt: true },
'openchat/openchat-7b': { name: 'OpenChat 7B (beta)', cw: 8192, cp: 0, cc: 0, unfilt: true },
'undi95/toppy-m-7b': { name: 'Toppy M 7B (beta)', cw: 32768, cp: 0, cc: 0, unfilt: true },
'gryphe/mythomist-7b': { name: 'MythoMist 7B (beta)', cw: 32768, cp: 0, cc: 0, unfilt: true },
'nousresearch/nous-hermes-llama2-13b': { name: 'Nous: Hermes 13B (beta)', cw: 4096, cp: 0.000155, cc: 0.000155, unfilt: true },
'meta-llama/codellama-34b-instruct': { name: 'Meta: CodeLlama 34B Instruct (beta)', cw: 8192, cp: 0.00045, cc: 0.00045, unfilt: true },
'phind/phind-codellama-34b': { name: 'Phind: CodeLlama 34B v2 (beta)', cw: 4096, cp: 0.00045, cc: 0.00045, unfilt: true },
'intel/neural-chat-7b': { name: 'Neural Chat 7B v3.1 (beta)', cw: 32768, cp: 0.005, cc: 0.005, unfilt: true },
'haotian-liu/llava-13b': { name: 'Llava 13B (beta)', cw: 2048, cp: 0.005, cc: 0.005, unfilt: true },
'meta-llama/llama-2-13b-chat': { name: 'Meta: Llama v2 13B Chat (beta)', cw: 4096, cp: 0.000234533, cc: 0.000234533, unfilt: true },
'alpindale/goliath-120b': { name: 'Goliath 120B (beta)', cw: 6144, cp: 0.00703125, cc: 0.00703125, unfilt: true },
'lizpreciatior/lzlv-70b-fp16-hf': { name: 'lzlv 70B (beta)', cw: 4096, cp: 0.000562, cc: 0.000762, unfilt: true },
'nousresearch/nous-capybara-7b': { name: 'Nous: Capybara 7B', cw: 4096, cp: 0, cc: 0, unfilt: true },
'mistralai/mistral-7b-instruct': { name: 'Mistral 7B Instruct', cw: 8192, cp: 0, cc: 0, unfilt: true },
'huggingfaceh4/zephyr-7b-beta': { name: 'Hugging Face: Zephyr 7B', cw: 4096, cp: 0, cc: 0, unfilt: true },
'openchat/openchat-7b': { name: 'OpenChat 3.5', cw: 8192, cp: 0, cc: 0, unfilt: true },
'gryphe/mythomist-7b': { name: 'MythoMist 7B', cw: 32768, cp: 0, cc: 0, unfilt: true },
'openrouter/cinematika-7b': { name: 'Cinematika 7B (alpha)', cw: 32000, cp: 0, cc: 0, unfilt: true },
'mistralai/mixtral-8x7b-instruct': { name: 'Mistral: Mixtral 8x7B Instruct (beta)', cw: 32000, cp: 0, cc: 0, unfilt: true },
'rwkv/rwkv-5-world-3b': { name: 'RWKV v5 World 3B (beta)', cw: 10000, cp: 0, cc: 0, unfilt: true },
'recursal/rwkv-5-3b-ai-town': { name: 'RWKV v5 3B AI Town (beta)', cw: 10000, cp: 0, cc: 0, unfilt: true },
'jebcarter/psyfighter-13b': { name: 'Psyfighter 13B', cw: 4096, cp: 0.001, cc: 0.001, unfilt: true },
'koboldai/psyfighter-13b-2': { name: 'Psyfighter v2 13B', cw: 4096, cp: 0.001, cc: 0.001, unfilt: true },
'nousresearch/nous-hermes-llama2-13b': { name: 'Nous: Hermes 13B', cw: 4096, cp: 0.000075, cc: 0.000075, unfilt: true },
'meta-llama/codellama-34b-instruct': { name: 'Meta: CodeLlama 34B Instruct', cw: 8192, cp: 0.0002, cc: 0.0002, unfilt: true },
'phind/phind-codellama-34b': { name: 'Phind: CodeLlama 34B v2', cw: 4096, cp: 0.0002, cc: 0.0002, unfilt: true },
'intel/neural-chat-7b': { name: 'Neural Chat 7B v3.1', cw: 4096, cp: 0.0025, cc: 0.0025, unfilt: true },
'haotian-liu/llava-13b': { name: 'Llava 13B', cw: 2048, cp: 0.0025, cc: 0.0025, unfilt: true },
'nousresearch/nous-hermes-2-vision-7b': { name: 'Nous: Hermes 2 Vision 7B (alpha)', cw: 4096, cp: 0.0025, cc: 0.0025, unfilt: true },
'meta-llama/llama-2-13b-chat': { name: 'Meta: Llama v2 13B Chat', cw: 4096, cp: 0.000156755, cc: 0.000156755, unfilt: true },
'openai/gpt-3.5-turbo': { name: 'OpenAI: GPT-3.5 Turbo', cw: 4095, cp: 0.001, cc: 0.002, unfilt: false },
'openai/gpt-3.5-turbo-1106': { name: 'OpenAI: GPT-3.5 Turbo 16k (preview)', cw: 16385, cp: 0.001, cc: 0.002, unfilt: false },
'openai/gpt-3.5-turbo-16k': { name: 'OpenAI: GPT-3.5 Turbo 16k', cw: 16385, cp: 0.003, cc: 0.004, unfilt: false },
@@ -272,24 +278,38 @@ const orModelMap: { [id: string]: { name: string; cw: number; cp?: number; cc?:
'google/palm-2-codechat-bison': { name: 'Google: PaLM 2 Code Chat', cw: 7168, cp: 0.0005, cc: 0.0005, unfilt: true },
'google/palm-2-chat-bison-32k': { name: 'Google: PaLM 2 Chat 32k', cw: 32000, cp: 0.0005, cc: 0.0005, unfilt: true },
'google/palm-2-codechat-bison-32k': { name: 'Google: PaLM 2 Code Chat 32k', cw: 32000, cp: 0.0005, cc: 0.0005, unfilt: true },
'meta-llama/llama-2-70b-chat': { name: 'Meta: Llama v2 70B Chat (beta)', cw: 4096, cp: 0.0007, cc: 0.00095, unfilt: true },
'nousresearch/nous-hermes-llama2-70b': { name: 'Nous: Hermes 70B (beta)', cw: 4096, cp: 0.0009, cc: 0.0009, unfilt: true },
'nousresearch/nous-capybara-34b': { name: 'Nous: Capybara 34B (beta)', cw: 32000, cp: 0.02, cc: 0.02, unfilt: true },
'jondurbin/airoboros-l2-70b': { name: 'Airoboros 70B (beta)', cw: 4096, cp: 0.0007, cc: 0.00095, unfilt: true },
'migtissera/synthia-70b': { name: 'Synthia 70B (beta)', cw: 8192, cp: 0.009375, cc: 0.009375, unfilt: true },
'open-orca/mistral-7b-openorca': { name: 'Mistral OpenOrca 7B (beta)', cw: 8192, cp: 0.0002, cc: 0.0002, unfilt: true },
'teknium/openhermes-2-mistral-7b': { name: 'OpenHermes 2 Mistral 7B (beta)', cw: 4096, cp: 0.0002, cc: 0.0002, unfilt: true },
'teknium/openhermes-2.5-mistral-7b': { name: 'OpenHermes 2.5 Mistral 7B (beta)', cw: 4096, cp: 0.0002, cc: 0.0002, unfilt: true },
'pygmalionai/mythalion-13b': { name: 'Pygmalion: Mythalion 13B (beta)', cw: 8192, cp: 0.001125, cc: 0.001125, unfilt: true },
'undi95/remm-slerp-l2-13b': { name: 'ReMM SLERP 13B (beta)', cw: 6144, cp: 0.001125, cc: 0.001125, unfilt: true },
'xwin-lm/xwin-lm-70b': { name: 'Xwin 70B (beta)', cw: 8192, cp: 0.009375, cc: 0.009375, unfilt: true },
'gryphe/mythomax-l2-13b-8k': { name: 'MythoMax 13B 8k (beta)', cw: 8192, cp: 0.001125, cc: 0.001125, unfilt: true },
'neversleep/noromaid-20b': { name: 'Noromaid 20B (beta)', cw: 8192, cp: 0.00225, cc: 0.00225, unfilt: true },
'perplexity/pplx-70b-online': { name: 'Perplexity: PPLX 70B Online', cw: 4096, cp: 0, cc: 0.0028, unfilt: true },
'perplexity/pplx-7b-online': { name: 'Perplexity: PPLX 7B Online', cw: 4096, cp: 0, cc: 0.00028, unfilt: true },
'perplexity/pplx-7b-chat': { name: 'Perplexity: PPLX 7B Chat', cw: 8192, cp: 0.00007, cc: 0.00028, unfilt: true },
'perplexity/pplx-70b-chat': { name: 'Perplexity: PPLX 70B Chat', cw: 4096, cp: 0.0007, cc: 0.0028, unfilt: true },
'meta-llama/llama-2-70b-chat': { name: 'Meta: Llama v2 70B Chat', cw: 4096, cp: 0.0007, cc: 0.00095, unfilt: true },
'nousresearch/nous-hermes-llama2-70b': { name: 'Nous: Hermes 70B', cw: 4096, cp: 0.0009, cc: 0.0009, unfilt: true },
'nousresearch/nous-capybara-34b': { name: 'Nous: Capybara 34B', cw: 32000, cp: 0.0007, cc: 0.0028, unfilt: true },
'jondurbin/airoboros-l2-70b': { name: 'Airoboros 70B', cw: 4096, cp: 0.0007, cc: 0.00095, unfilt: true },
'migtissera/synthia-70b': { name: 'Synthia 70B', cw: 8192, cp: 0.00375, cc: 0.00375, unfilt: true },
'open-orca/mistral-7b-openorca': { name: 'Mistral OpenOrca 7B', cw: 8192, cp: 0.0002, cc: 0.0002, unfilt: true },
'teknium/openhermes-2-mistral-7b': { name: 'OpenHermes 2 Mistral 7B', cw: 4096, cp: 0.0002, cc: 0.0002, unfilt: true },
'teknium/openhermes-2.5-mistral-7b': { name: 'OpenHermes 2.5 Mistral 7B', cw: 4096, cp: 0.0002, cc: 0.0002, unfilt: true },
'pygmalionai/mythalion-13b': { name: 'Pygmalion: Mythalion 13B', cw: 8192, cp: 0.001125, cc: 0.001125, unfilt: true },
'undi95/remm-slerp-l2-13b': { name: 'ReMM SLERP 13B', cw: 6144, cp: 0.001125, cc: 0.001125, unfilt: true },
'xwin-lm/xwin-lm-70b': { name: 'Xwin 70B', cw: 8192, cp: 0.00375, cc: 0.00375, unfilt: true },
'gryphe/mythomax-l2-13b-8k': { name: 'MythoMax 13B 8k', cw: 8192, cp: 0.001125, cc: 0.001125, unfilt: true },
'undi95/toppy-m-7b': { name: 'Toppy M 7B', cw: 32768, cp: 0.000375, cc: 0.000375, unfilt: true },
'alpindale/goliath-120b': { name: 'Goliath 120B', cw: 6144, cp: 0.009375, cc: 0.009375, unfilt: true },
'lizpreciatior/lzlv-70b-fp16-hf': { name: 'lzlv 70B', cw: 4096, cp: 0.0007, cc: 0.00095, unfilt: true },
'neversleep/noromaid-20b': { name: 'Noromaid 20B', cw: 8192, cp: 0.00225, cc: 0.00225, unfilt: true },
'01-ai/yi-34b-chat': { name: 'Yi 34B Chat', cw: 4096, cp: 0.0008, cc: 0.0008, unfilt: true },
'01-ai/yi-34b': { name: 'Yi 34B (base)', cw: 4096, cp: 0.0008, cc: 0.0008, unfilt: true },
'01-ai/yi-6b': { name: 'Yi 6B (base)', cw: 4096, cp: 0.00014, cc: 0.00014, unfilt: true },
'togethercomputer/stripedhyena-nous-7b': { name: 'StripedHyena Nous 7B', cw: 32000, cp: 0.0002, cc: 0.0002, unfilt: true },
'togethercomputer/stripedhyena-hessian-7b': { name: 'StripedHyena Hessian 7B (base)', cw: 32000, cp: 0.0002, cc: 0.0002, unfilt: true },
'mistralai/mixtral-8x7b': { name: 'Mistral: Mixtral 8x7B (base) (beta)', cw: 32000, cp: 0.0006, cc: 0.0006, unfilt: true },
'anthropic/claude-2': { name: 'Anthropic: Claude v2.1', cw: 200000, cp: 0.008, cc: 0.024, unfilt: false },
'anthropic/claude-2.0': { name: 'Anthropic: Claude v2.0', cw: 100000, cp: 0.008, cc: 0.024, unfilt: false },
'anthropic/claude-instant-v1': { name: 'Anthropic: Claude Instant v1', cw: 100000, cp: 0.00163, cc: 0.00551, unfilt: false },
'mancer/weaver': { name: 'Mancer: Weaver (alpha)', cw: 8000, cp: 0.0045, cc: 0.0045, unfilt: true },
'mancer/weaver': { name: 'Mancer: Weaver (alpha)', cw: 8000, cp: 0.003375, cc: 0.003375, unfilt: true },
'gryphe/mythomax-l2-13b': { name: 'MythoMax 13B', cw: 4096, cp: 0.0006, cc: 0.0006, unfilt: true },
// Old models (maintained for reference)
'openai/gpt-3.5-turbo-0301': { name: 'OpenAI: GPT-3.5 Turbo (older v0301)', cw: 4095, cp: 0.001, cc: 0.002, old: true },
'openai/gpt-4-0314': { name: 'OpenAI: GPT-4 (older v0314)', cw: 8191, cp: 0.03, cc: 0.06, old: true },
'openai/gpt-4-32k-0314': { name: 'OpenAI: GPT-4 32k (older v0314)', cw: 32767, cp: 0.06, cc: 0.12, old: true },
@@ -301,7 +321,12 @@ const orModelMap: { [id: string]: { name: string; cw: number; cp?: number; cc?:
'anthropic/claude-instant-1.0': { name: 'Anthropic: Claude Instant (older v1)', cw: 9000, cp: 0.00163, cc: 0.00551, old: true },
};
const orModelFamilyOrder = ['mistralai/', 'huggingfaceh4/', 'undi95/', 'openchat/', 'anthropic/', 'google/', 'openai/', 'meta-llama/', 'phind/', 'openrouter/'];
const orModelFamilyOrder = [
// great models
'mistralai/mixtral-8x7b-instruct', 'mistralai/mistral-7b-instruct', 'nousresearch/nous-capybara-7b',
// great orgs
'huggingfaceh4/', 'openchat/', 'anthropic/', 'google/', 'openai/', 'meta-llama/', 'phind/',
];
export function openRouterModelFamilySortFn(a: { id: string }, b: { id: string }): number {
const aPrefixIndex = orModelFamilyOrder.findIndex(prefix => a.id.startsWith(prefix));
@@ -321,10 +346,10 @@ export function openRouterModelToModelDescription(modelId: string, created: numb
const orModel = orModelMap[modelId] ?? null;
let label = orModel?.name || modelId.replace('/', ' · ');
if (orModel?.cp === 0 && orModel?.cc === 0)
label += ' - 🎁 Free';
label += ' · 🎁 Free';
// if (!orModel)
// console.log('openRouterModelToModelDescription: unknown model id:', modelId);
if (SERVER_DEBUG_WIRE && !orModel)
console.log(' - openRouterModelToModelDescription: non-mapped model id:', modelId);
// context: use the known size if available, otherwise fall back to the (undocumented) provided length, or again to 4096
const contextWindow = orModel?.cw || context_length || 4096;
@@ -6,10 +6,10 @@ import { createEmptyReadableStream, debugGenerateCurlCommand, safeErrorString, S
import type { AnthropicWire } from '../anthropic/anthropic.wiretypes';
import type { OpenAIWire } from './openai.wiretypes';
import { OLLAMA_PATH_CHAT, ollamaAccess, ollamaAccessSchema, ollamaChatCompletionPayload } from '../ollama/ollama.router';
import { anthropicAccess, anthropicAccessSchema, anthropicChatCompletionPayload } from '../anthropic/anthropic.router';
import { ollamaAccess, ollamaAccessSchema, ollamaChatCompletionPayload } from '../ollama/ollama.router';
import { openAIAccess, openAIAccessSchema, openAIChatCompletionPayload, openAIHistorySchema, openAIModelSchema } from './openai.router';
import { wireOllamaGenerationSchema } from '../ollama/ollama.wiretypes';
import { wireOllamaChunkedOutputSchema } from '../ollama/ollama.wiretypes';
/**
@@ -59,10 +59,10 @@ export async function openaiStreamingRelayHandler(req: NextRequest): Promise<Res
break;
case 'ollama':
headersUrl = ollamaAccess(access, '/api/generate');
headersUrl = ollamaAccess(access, OLLAMA_PATH_CHAT);
body = ollamaChatCompletionPayload(model, history, true);
eventStreamFormat = 'json-nl';
vendorStreamParser = createOllamaStreamParser();
vendorStreamParser = createOllamaChatCompletionStreamParser();
break;
case 'azure':
@@ -135,30 +135,39 @@ function createAnthropicStreamParser(): AIStreamParser {
};
}
function createOllamaStreamParser(): AIStreamParser {
function createOllamaChatCompletionStreamParser(): AIStreamParser {
let hasBegun = false;
return (data: string) => {
let wireGeneration: any;
// parse the JSON chunk
let wireJsonChunk: any;
try {
wireGeneration = JSON.parse(data);
wireJsonChunk = JSON.parse(data);
} catch (error: any) {
// log the malformed data to the console, and rethrow to transmit as 'error'
console.log(`/api/llms/stream: Ollama parsing issue: ${error?.message || error}`, data);
throw error;
}
const generation = wireOllamaGenerationSchema.parse(wireGeneration);
let text = generation.response;
// validate chunk
const chunk = wireOllamaChunkedOutputSchema.parse(wireJsonChunk);
// pass through errors from Ollama
if ('error' in chunk)
throw new Error(chunk.error);
// process output
let text = chunk.message?.content || /*chunk.response ||*/ '';
// hack: prepend the model name to the first packet
if (!hasBegun) {
if (!hasBegun && chunk.model) {
hasBegun = true;
const firstPacket: ChatStreamFirstPacketSchema = { model: generation.model };
const firstPacket: ChatStreamFirstPacketSchema = { model: chunk.model };
text = JSON.stringify(firstPacket) + text;
}
return { text, close: generation.done };
return { text, close: chunk.done };
};
}
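A hypothetical walk-through of the parser on two `json-nl` lines, showing the first-packet model-name prefix that the chat client strips back out:

```ts
const parse = createOllamaChatCompletionStreamParser();

parse('{"model":"llama2","message":{"role":"assistant","content":"Hi"},"done":false}');
// -> { text: '{"model":"llama2"}Hi', close: false }   (first packet carries the model name)

parse('{"model":"llama2","done":true}');
// -> { text: '', close: true }
```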
@@ -248,8 +257,9 @@ function createEventStreamTransformer(vendorTextParser: AIStreamParser, inputFor
if (close)
controller.terminate();
} catch (error: any) {
// console.log(`/api/llms/stream: parse issue: ${error?.message || error}`);
controller.enqueue(textEncoder.encode(`[Stream Issue] ${dialectLabel}: ${safeErrorString(error) || 'Unknown stream parsing error'}`));
if (SERVER_DEBUG_WIRE)
console.log(' - E: parse issue:', event.data, error?.message || error);
controller.enqueue(textEncoder.encode(` **[Stream Issue] ${dialectLabel}: ${safeErrorString(error) || 'Unknown stream parsing error'}**`));
controller.terminate();
}
};
@@ -1,24 +1,29 @@
import * as React from 'react';
import { Box, Button, Chip, FormControl, Input, Option, Select, Stack, Typography } from '@mui/joy';
import { Box, Button, Chip, FormControl, IconButton, Input, Option, Select, Stack, Typography } from '@mui/joy';
import LaunchIcon from '@mui/icons-material/Launch';
import FormatListNumberedRtlIcon from '@mui/icons-material/FormatListNumberedRtl';
import { FormLabelStart } from '~/common/components/forms/FormLabelStart';
import { GoodModal } from '~/common/components/GoodModal';
import { GoodTooltip } from '~/common/components/GoodTooltip';
import { InlineError } from '~/common/components/InlineError';
import { Link } from '~/common/components/Link';
import { apiQuery } from '~/common/util/trpc.client';
import { settingsGap } from '~/common/app.theme';
import type { OllamaAccessSchema } from '../../transports/server/ollama/ollama.router';
import { InlineError } from '~/common/components/InlineError';
export function OllamaAdmin(props: { access: OllamaAccessSchema, onClose: () => void }) {
export function OllamaAdministration(props: { access: OllamaAccessSchema, onClose: () => void }) {
// state
const [sortByPulls, setSortByPulls] = React.useState<boolean>(false);
const [modelName, setModelName] = React.useState<string | null>('llama2');
const [modelTag, setModelTag] = React.useState<string>('');
// external state
const { data: pullable } = apiQuery.llmOllama.adminListPullable.useQuery({ access: props.access }, {
const { data: pullableData } = apiQuery.llmOllama.adminListPullable.useQuery({ access: props.access }, {
staleTime: 1000 * 60,
refetchOnWindowFocus: false,
});
@@ -26,7 +31,11 @@ export function OllamaAdmin(props: { access: OllamaAccessSchema, onClose: () =>
const { isLoading: isDeleting, status: deleteStatus, error: deleteError, mutate: deleteMutate, reset: deleteReset } = apiQuery.llmOllama.adminDelete.useMutation();
// derived state
const pullModelDescription = pullable?.pullable.find(p => p.id === modelName)?.description ?? null;
let pullable = pullableData?.pullable || [];
if (sortByPulls)
pullable = pullable.toSorted((a, b) => b.pulls - a.pulls);
const pullModelDescription = pullable.find(p => p.id === modelName)?.description ?? null;
const handleModelPull = () => {
deleteReset();
@@ -38,6 +47,7 @@ export function OllamaAdmin(props: { access: OllamaAccessSchema, onClose: () =>
modelName && deleteMutate({ access: props.access, name: modelName + (modelTag ? ':' + modelTag : '') });
};
return (
<GoodModal title='Ollama Administration' dividers open onClose={props.onClose}>
@@ -47,25 +57,48 @@ export function OllamaAdmin(props: { access: OllamaAccessSchema, onClose: () =>
However, we provide a way to pull models from the Ollama host, for convenience.
</Typography>
<Box sx={{ display: 'flex', gap: 1 }}>
<FormControl sx={{ flexGrow: 1 }}>
<Box sx={{ display: 'flex', flexFlow: 'row wrap', gap: 1 }}>
<FormControl sx={{ flexGrow: 1, flexBasis: 0.55 }}>
<FormLabelStart title='Name' />
<Select value={modelName || ''} onChange={(_event: any, value: string | null) => setModelName(value)}>
{pullable?.pullable.map(p =>
<Option key={p.id} value={p.id}>
{p.isNew === true && <Chip size='sm' variant='outlined'>New</Chip>} {p.label}
</Option>,
)}
</Select>
<Box sx={{ display: 'flex', gap: 1 }}>
<Select
value={modelName || ''}
onChange={(_event: any, value: string | null) => setModelName(value)}
sx={{ flexGrow: 1 }}
>
{pullable.map(p =>
<Option key={p.id} value={p.id}>
{p.isNew === true && <Chip size='sm' variant='outlined'>NEW</Chip>} {p.label}{sortByPulls && ` (${p.pulls.toLocaleString()})`}
</Option>,
)}
</Select>
<GoodTooltip title='Sort by Downloads'>
<IconButton
variant={sortByPulls ? 'solid' : 'outlined'}
onClick={() => setSortByPulls(!sortByPulls)}
>
<FormatListNumberedRtlIcon />
</IconButton>
</GoodTooltip>
</Box>
</FormControl>
<FormControl sx={{ flexGrow: 1 }}>
<FormControl sx={{ flexGrow: 1, flexBasis: 0.45 }}>
<FormLabelStart title='Tag' />
<Input
variant='outlined' placeholder='latest'
value={modelTag || ''} onChange={event => setModelTag(event.target.value)}
sx={{ minWidth: 100 }}
slotProps={{ input: { size: 10 } }} // halve the min width
/>
<Box sx={{ display: 'flex', gap: 1 }}>
<Input
variant='outlined' placeholder='latest'
value={modelTag || ''} onChange={event => setModelTag(event.target.value)}
sx={{ minWidth: 80, flexGrow: 1 }}
slotProps={{ input: { size: 10 } }} // halve the min width
/>
{!!modelName && (
<IconButton
component={Link} href={`https://ollama.ai/library/${modelName}`} target='_blank'
>
<LaunchIcon />
</IconButton>
)}
</Box>
</FormControl>
</Box>
@@ -85,7 +118,7 @@ export function OllamaAdmin(props: { access: OllamaAccessSchema, onClose: () =>
{pullModelDescription}
</Typography>
<Box sx={{ display: 'flex', gap: 1 }}>
<Box sx={{ display: 'flex', flexWrap: 1, gap: 1 }}>
<Button
variant='outlined'
color={deleteStatus === 'error' ? 'danger' : deleteStatus === 'success' ? 'success' : 'primary'}
+2 -2
View File
@@ -11,7 +11,7 @@ import { asValidURL } from '~/common/util/urlUtils';
import { DModelSourceId, useModelsStore, useSourceSetup } from '../../store-llms';
import { ModelVendorOllama } from './ollama.vendor';
import { OllamaAdmin } from './OllamaAdmin';
import { OllamaAdministration } from './OllamaAdministration';
import { modelDescriptionToDLLM } from '../openai/OpenAISourceSetup';
@@ -63,7 +63,7 @@ export function OllamaSourceSetup(props: { sourceId: DModelSourceId }) {
{isError && <InlineError error={error} />}
{adminOpen && <OllamaAdmin access={access} onClose={() => setAdminOpen(false)} />}
{adminOpen && <OllamaAdministration access={access} onClose={() => setAdminOpen(false)} />}
</>;
}
@@ -1,12 +1,13 @@
import * as React from 'react';
import { Typography } from '@mui/joy';
import { Button, Typography } from '@mui/joy';
import { FormInputKey } from '~/common/components/forms/FormInputKey';
import { InlineError } from '~/common/components/InlineError';
import { Link } from '~/common/components/Link';
import { SetupFormRefetchButton } from '~/common/components/forms/SetupFormRefetchButton';
import { apiQuery } from '~/common/util/trpc.client';
import { getCallbackUrl } from '~/common/app.routes';
import { DModelSourceId, useModelsStore, useSourceSetup } from '../../store-llms';
import { modelDescriptionToDLLM } from '../openai/OpenAISourceSetup';
@@ -38,6 +39,16 @@ export function OpenRouterSourceSetup(props: { sourceId: DModelSourceId }) {
staleTime: Infinity,
});
const handleOpenRouterLogin = () => {
// replace the current page with the OAuth page
const callbackUrl = getCallbackUrl('openrouter');
const oauthUrl = 'https://openrouter.ai/auth?callback_url=' + encodeURIComponent(callbackUrl);
window.open(oauthUrl, '_self');
// ...bye / see you soon at the callback location...
};
return <>
{/*<Box sx={{ display: 'flex', gap: 1, alignItems: 'center' }}>*/}
@@ -53,7 +64,7 @@ export function OpenRouterSourceSetup(props: { sourceId: DModelSourceId }) {
<FormInputKey
id='openrouter-key' label='OpenRouter API Key'
rightLabel={<>{needsUserKey
? !oaiKey && <Link level='body-sm' href='https://openrouter.ai/keys' target='_blank'>create key</Link>
? !oaiKey && <Link level='body-sm' href='https://openrouter.ai/keys' target='_blank'>your keys</Link>
: '✔️ already set in server'
} {oaiKey && keyValid && <Link level='body-sm' href='https://openrouter.ai/activity' target='_blank'>check usage</Link>}
</>}
@@ -62,7 +73,14 @@ export function OpenRouterSourceSetup(props: { sourceId: DModelSourceId }) {
placeholder='sk-or-...'
/>
<SetupFormRefetchButton refetch={refetch} disabled={!shallFetchSucceed || isFetching} error={isError} />
<SetupFormRefetchButton
refetch={refetch} disabled={!shallFetchSucceed || isFetching} error={isError}
leftButton={
<Button color='neutral' variant={(needsUserKey && !keyValid) ? 'solid' : 'outlined'} onClick={handleOpenRouterLogin}>
OpenRouter Login
</Button>
}
/>
{isError && <InlineError error={error} />}