Compare commits

10 commits

| SHA1 |
|---|
| a1ced4d1a9 |
| 02d6698708 |
| 5e17381ea6 |
| 3aecbd71d1 |
| cab2e2c4de |
| cb3f64366b |
| 1dac06af1d |
| 822110ca00 |
| 54eff150d3 |
| 2a4675a139 |
@@ -1,7 +1,9 @@
-# Warning
+# ⚠️ Warning ⚠️
 
-**I strongly suggest against using this feature with a Google account that you care about.** Depending on the content of the prompts people submit, Google may flag the spreadsheet as containing inappropriate content. This seems to prevent you from sharing that spreadsheet _or any others on the account_. This happened with my throwaway account during testing; the existing shared spreadsheet continues to work, but even completely new spreadsheets are flagged and cannot be shared.
+**I strongly suggest against using this feature with a Google account that you care about.** Depending on the content of the prompts people submit (which you obviously have no control over), Google may flag the spreadsheet as containing inappropriate content. If this happens, Google may suspend your ability to share the spreadsheet, block access to Google Sheets, or even suspend your entire Google account (this happened to my throwaway, though it may have been because it was very clearly a throwaway and used a burner SMS number).
 
-I'll be looking into alternative storage backends, but you should not use this implementation with a Google account you care about, or even one remotely connected to your main accounts (as Google has a history of linking accounts together via IPs/browser fingerprinting). Use a VPN and a completely isolated VM to be safe.
+**Be aware that Google has been known to link accounts through device/browser fingerprinting, so even a VPN may not be sufficient; if you must use this feature, do so entirely from an isolated VM and VPN with no other Google accounts logged in.**
+
+There are now other logging options available, so you should use those instead. I'm leaving this here for posterity, but I will not be providing any support for it.
 
 # Configuring Google Sheets Prompt Logging
 
 This proxy can log incoming prompts and model responses to Google Sheets. Some configuration on the Google side is required to enable this feature. The APIs used are free, but you will need a Google account and a Google Cloud Platform project.
@@ -10,7 +12,7 @@ NOTE: Concurrency is not supported. Don't connect two instances of the server to
 
 ## Prerequisites
 
 - A Google account
-  - **USE A THROWAWAY ACCOUNT!**
+  - **⚠️ USE A THROWAWAY ACCOUNT!**
 - A Google Cloud Platform project
 
 ### 0. Create a Google Cloud Platform Project
 
@@ -0,0 +1,41 @@
# Prompt Logging

This proxy supports logging incoming prompts and model responses to different destinations. Currently, Airtable and Google Sheets (not recommended) are supported. You can enable prompt logging by setting the `PROMPT_LOGGING` environment variable to `true` and setting the `PROMPT_LOGGING_BACKEND` environment variable to the desired logging backend.

The included backends are generally designed to work within the limitations of a service's free tier, such as strict API rate limits or maximum record limits. As a result, they may be a little clunky to use and may not be as performant as a dedicated logging solution, but they should be sufficient for low-volume use cases. You can implement your own backend by exporting a module that implements the `PromptLogBackend` interface and wiring it up in `src/prompt-logging/log-queue.ts`.
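As a rough sketch of what a custom backend could look like, here is a hypothetical in-memory implementation. The two type declarations mirror `PromptLogEntry` and `PromptLogBackend` from `src/prompt-logging` so the snippet is self-contained; `MemoryBackend` itself is illustrative and not part of the proxy.

```typescript
// Types mirrored from src/prompt-logging for illustration.
interface PromptLogEntry {
  model: string;
  endpoint: string;
  promptRaw: string;
  promptFlattened: string;
  response: string;
}

interface PromptLogBackend {
  init(onStop: () => void): Promise<void>;
  appendBatch(entries: PromptLogEntry[]): Promise<void>;
}

// Hypothetical backend that simply accumulates entries in memory.
class MemoryBackend implements PromptLogBackend {
  public entries: PromptLogEntry[] = [];

  async init(_onStop: () => void): Promise<void> {
    // Nothing to set up for an in-memory store.
  }

  async appendBatch(batch: PromptLogEntry[]): Promise<void> {
    this.entries.push(...batch);
  }
}
```

A real backend would do its I/O in `appendBatch` and use the `onStop` callback passed to `init` to shut the log queue down if the destination becomes unusable.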
Refer to the list below for the required configuration for each backend.

## Airtable

1. Create an Airtable.com account
2. Create a Personal Access Token
   1. Go to https://airtable.com/create/tokens/new and enter a name for your token
   2. Under **Scopes**, click **Add a scope** and assign the following scopes:
      - `data.records:read`
      - `data.records:write`
      - `schema.bases:read`
      - `schema.bases:write`
   3. Under **Access**, click **Add a base** and assign "All current and future bases in this workspace"
      - Create a new workspace for prompt logging if you don't want to give the script access to all your bases
   4. Click **Create token**
   5. A modal will appear with your token; copy it and set it as the `AIRTABLE_KEY` environment variable
3. Find your workspace ID
   - You can find your workspace ID by going to https://airtable.com/workspaces and selecting **View Workspace** on the workspace you want to use
   - The ID is the text beginning with `wsp` in the URL, after `airtable.com/workspaces/`
   - Set this value as the `AIRTABLE_WORKSPACE_ID` environment variable
4. Set the `PROMPT_LOGGING_BACKEND` environment variable to `airtable`
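After completing the steps above, you can sanity-check the token against Airtable's metadata API (this is the standard "list bases" endpoint; `$AIRTABLE_KEY` is assumed to hold the token from step 2):

```shell
# A valid token returns JSON containing a "bases" array;
# an invalid token returns an error object instead.
curl -s https://api.airtable.com/v0/meta/bases \
  -H "Authorization: Bearer $AIRTABLE_KEY"
```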
The proxy will handle creating and migrating bases for you. The following bases will be created in the workspace you select:

- `oai-proxy-index`
  - Stores metadata about the proxy and the bases it creates
- `oai-proxy-log-*`
  - Stores prompt logs
  - As free bases are limited in size, the proxy will create additional bases as needed

## Google Sheets (deprecated)

**⚠️ This implementation is strongly discouraged** due to the nature of content users may submit, which may violate Google's policies. Google appears to analyze the content of API requests and may suspend your account. Don't use this unless you know what you're doing.

Refer to the dedicated [Google Sheets docs](logging-sheets.md) for detailed instructions on how to set up Google Sheets logging.
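Putting the pieces together, a minimal environment configuration for the Airtable backend might look like the following (the token and workspace ID values are placeholders, not real credentials):

```shell
# .env — enable prompt logging with the Airtable backend
PROMPT_LOGGING=true
PROMPT_LOGGING_BACKEND=airtable
# Personal access token from step 2 (placeholder value)
AIRTABLE_KEY=patXXXXXXXXXXXXXX.XXXXXXXX
# Workspace ID from step 3 (placeholder value)
AIRTABLE_WORKSPACE_ID=wspXXXXXXXXXXXXXX
```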
package-lock.json (generated) +26
@@ -9,6 +9,7 @@
   "version": "1.0.0",
   "license": "MIT",
   "dependencies": {
+    "airtable": "^0.12.1",
     "axios": "^1.3.5",
     "cors": "^2.8.5",
     "dotenv": "^16.0.3",
@@ -934,6 +935,11 @@
       "node": ">=6.5"
     }
   },
+  "node_modules/abortcontroller-polyfill": {
+    "version": "1.7.5",
+    "resolved": "https://registry.npmjs.org/abortcontroller-polyfill/-/abortcontroller-polyfill-1.7.5.tgz",
+    "integrity": "sha512-JMJ5soJWP18htbbxJjG7bG6yuI6pRhgJ0scHHTfkUjf6wjP912xZWvM+A4sJK3gqd9E8fcPbDnOefbA9Th/FIQ=="
+  },
   "node_modules/accepts": {
     "version": "1.3.8",
     "resolved": "https://registry.npmjs.org/accepts/-/accepts-1.3.8.tgz",
@@ -1008,6 +1014,26 @@
     "resolved": "https://registry.npmjs.org/ms/-/ms-2.1.2.tgz",
     "integrity": "sha512-sGkPx+VjMtmA6MX27oA4FBFELFCZZ4S4XqeGOXCv68tT+jb3vk/RyaKWP0PTKyWtmLSM0b+adUTEvbs1PEaH2w=="
   },
+  "node_modules/airtable": {
+    "version": "0.12.1",
+    "resolved": "https://registry.npmjs.org/airtable/-/airtable-0.12.1.tgz",
+    "integrity": "sha512-wS49QIO46YjSUbRIslX6pJaAGsdzOFPtYfaARYsBifsev10TDsyXc5IBYX6b3JQs4SZ8A5+g/vbQ5IfPvbnc+w==",
+    "dependencies": {
+      "@types/node": ">=8.0.0 <15",
+      "abort-controller": "^3.0.0",
+      "abortcontroller-polyfill": "^1.4.0",
+      "lodash": "^4.17.21",
+      "node-fetch": "^2.6.7"
+    },
+    "engines": {
+      "node": ">=8.0.0"
+    }
+  },
+  "node_modules/airtable/node_modules/@types/node": {
+    "version": "14.18.47",
+    "resolved": "https://registry.npmjs.org/@types/node/-/node-14.18.47.tgz",
+    "integrity": "sha512-OuJi8bIng4wYHHA3YpKauL58dZrPxro3d0tabPHyiNF8rKfGKuVfr83oFlPLmKri1cX+Z3cJP39GXmnqkP11Gw=="
+  },
   "node_modules/ansi-regex": {
     "version": "5.0.1",
     "resolved": "https://registry.npmjs.org/ansi-regex/-/ansi-regex-5.0.1.tgz",
@@ -18,6 +18,7 @@
   "author": "",
   "license": "MIT",
   "dependencies": {
+    "airtable": "^0.12.1",
     "axios": "^1.3.5",
     "cors": "^2.8.5",
     "dotenv": "^16.0.3",
+22 −12
@@ -8,7 +8,6 @@ const startupLogger = pino({ level: "debug" }).child({ module: "startup" });
 
 const isDev = process.env.NODE_ENV !== "production";
 
-type PromptLoggingBackend = "google_sheets";
 export type DequeueMode = "fair" | "random" | "none";
 
 type Config = {
@@ -18,8 +17,6 @@ type Config = {
   openaiKey?: string;
   /** Comma-delimited list of Anthropic API keys. */
   anthropicKey?: string;
-  scaleKey?: string;
-  scaleMinDeployments: number;
   /**
    * The proxy key to require for requests. Only applicable if the user
    * management mode is set to 'proxy_key', and required if so.
@@ -28,7 +25,7 @@ type Config = {
   /**
    * The admin key used to access the /admin API. Required if the user
    * management mode is set to 'user_token'.
-   */
+   **/
   adminKey?: string;
   /**
    * Which user management mode to use.
@@ -51,7 +48,7 @@ type Config = {
    *
    * `firebase_rtdb`: Users are stored in a Firebase Realtime Database; requires
    * `firebaseKey` and `firebaseRtdbUrl` to be set.
-   */
+   **/
  gatekeeperStore: "memory" | "firebase_rtdb";
  /** URL of the Firebase Realtime Database if using the Firebase RTDB store. */
  firebaseRtdbUrl?: string;
@@ -77,12 +74,22 @@ type Config = {
   logLevel?: "debug" | "info" | "warn" | "error";
   /** Whether prompts and responses should be logged to persistent storage. */
   promptLogging?: boolean;
-  /** Which prompt logging backend to use. */
-  promptLoggingBackend?: PromptLoggingBackend;
+  /** Which prompt logging backend to use.
+   *
+   * `google_sheets`: Logs prompts and responses to a Google Sheets spreadsheet.
+   * This method is no longer recommended; see docs for more info.
+   *
+   * `airtable`: Logs prompts and responses to an Airtable table.
+   */
+  promptLoggingBackend?: "google_sheets" | "airtable";
   /** Base64-encoded Google Sheets API key. */
   googleSheetsKey?: string;
   /** Google Sheets spreadsheet ID. */
   googleSheetsSpreadsheetId?: string;
+  /** Airtable personal access token. */
+  airtableKey?: string;
+  /** Airtable workspace ID, under which bases will be automatically created. */
+  airtableWorkspaceId?: string;
   /** Whether to periodically check keys for usage and validity. */
   checkKeys?: boolean;
   /**
@@ -129,8 +136,6 @@ export const config: Config = {
   port: getEnvWithDefault("PORT", 7860),
   openaiKey: getEnvWithDefault("OPENAI_KEY", ""),
   anthropicKey: getEnvWithDefault("ANTHROPIC_KEY", ""),
-  scaleKey: getEnvWithDefault("SCALE_KEY", ""),
-  scaleMinDeployments: getEnvWithDefault("SCALE_MIN_DEPLOYMENTS", 0),
   proxyKey: getEnvWithDefault("PROXY_KEY", ""),
   adminKey: getEnvWithDefault("ADMIN_KEY", ""),
   gatekeeper: getEnvWithDefault("GATEKEEPER", "none"),
@@ -154,6 +159,8 @@ export const config: Config = {
   quotaDisplayMode: getEnvWithDefault("QUOTA_DISPLAY_MODE", "partial"),
   promptLogging: getEnvWithDefault("PROMPT_LOGGING", false),
   promptLoggingBackend: getEnvWithDefault("PROMPT_LOGGING_BACKEND", undefined),
+  airtableKey: getEnvWithDefault("AIRTABLE_KEY", undefined),
+  airtableWorkspaceId: getEnvWithDefault("AIRTABLE_WORKSPACE_ID", undefined),
   googleSheetsKey: getEnvWithDefault("GOOGLE_SHEETS_KEY", undefined),
   googleSheetsSpreadsheetId: getEnvWithDefault(
     "GOOGLE_SHEETS_SPREADSHEET_ID",
@@ -238,7 +245,7 @@ export async function assertConfigIsValid() {
   // Ensure forks which add new secret-like config keys don't unwittingly expose
   // them to users.
   for (const key of getKeys(config)) {
-    const maybeSensitive = ["key", "credentials", "secret", "password"].some(
+    const maybeSensitive = ["key", "credential", "secret", "password"].some(
       (sensitive) => key.toLowerCase().includes(sensitive)
     );
     const secured = new Set([...SENSITIVE_KEYS, ...OMITTED_KEYS]);
@@ -255,7 +262,10 @@ export async function assertConfigIsValid() {
 * Config keys that are masked on the info page, but not hidden as their
 * presence may be relevant to the user due to privacy implications.
 */
-export const SENSITIVE_KEYS: (keyof Config)[] = ["googleSheetsSpreadsheetId"];
+export const SENSITIVE_KEYS: (keyof Config)[] = [
+  "googleSheetsSpreadsheetId",
+  "airtableWorkspaceId",
+];
 
/**
 * Config keys that are not displayed on the info page at all, generally because
@@ -266,12 +276,12 @@ export const OMITTED_KEYS: (keyof Config)[] = [
   "logLevel",
   "openaiKey",
   "anthropicKey",
-  "scaleKey",
   "proxyKey",
   "adminKey",
   "checkKeys",
   "quotaDisplayMode",
   "googleSheetsKey",
+  "airtableKey",
   "firebaseKey",
   "firebaseRtdbUrl",
   "gatekeeperStore",
@@ -5,7 +5,7 @@ import {
 } from "./anthropic/provider";
 import { KeyPool } from "./key-pool";
 
-export type AIService = "openai" | "anthropic" | "scale";
+export type AIService = "openai" | "anthropic";
 export type Model = OpenAIModel | AnthropicModel;
 
 export interface Key {
@@ -128,8 +128,8 @@ export class OpenAIKeyProvider implements KeyProvider<OpenAIKey> {
     );
     if (availableKeys.length === 0) {
       let message = needGpt4
-        ? "No GPT-4 keys available. Try selecting a non-GPT-4 model."
-        : "No active OpenAI keys available.";
+        ? "No active OpenAI keys available."
+        : "No GPT-4 keys available. Try selecting a non-GPT-4 model.";
       throw new Error(message);
     }
@@ -1,155 +0,0 @@
import crypto from "crypto";
import { Key, KeyProvider } from "..";
import { config } from "../../config";
import { logger } from "../../logger";

export interface ScaleDeployment extends Key {
  readonly service: "scale";
  deploymentUrl: string;
  createdAt: number;
}

/*
Scale is a bit different from the other providers. It doesn't have set API keys;
instead there are "deployments", which are created in the Scale dashboard and
are accessible via a URL and API key together.

The operator can provide these accounts via the SCALE_KEY environment variable,
but more likely they will want the proxy to just automatically create new
accounts and deployments as older ones reach their usage limits.
*/

export class ScaleKeyProvider implements KeyProvider<ScaleDeployment> {
  readonly service = "scale";

  private deployments: ScaleDeployment[] = [];
  private log = logger.child({ module: "key-provider", service: this.service });
  private churnerEnabled = false;

  constructor() {
    const keyConfig = config.scaleKey?.trim();
    if (!keyConfig) return;
    const initialKeys = [...new Set(keyConfig.split(",").map((k) => k.trim()))];
    for (const keyStr of initialKeys) {
      const [key, deploymentUrl] = keyStr.split("$");
      const newDeployment: ScaleDeployment = {
        key,
        deploymentUrl,
        service: this.service,
        isGpt4: false,
        isTrial: false,
        isDisabled: false,
        promptCount: 0,
        lastUsed: 0,
        createdAt: Date.now(),
        hash: `sca-${crypto
          .createHash("sha256")
          .update(keyStr)
          .digest("hex")
          .slice(0, 8)}`,
        lastChecked: 0,
      };
      this.deployments.push(newDeployment);
    }
    this.log.info(
      { keyCount: this.deployments.length },
      "Loaded initial Scale deployments"
    );
  }

  public init() {
    // TODO: Start account churner
    this.churnerEnabled = true;
  }

  public list() {
    return this.deployments.map((k) => Object.freeze({ ...k, key: undefined }));
  }

  public get(_model: unknown) {
    // Scale doesn't support changing models on the fly
    const availableDeployments = this.deployments.filter((a) => !a.isDisabled);
    const canCreateNewAccounts = config.scaleMinDeployments > 0;
    if (availableDeployments.length === 0) {
      if (canCreateNewAccounts) {
        this.log.warn(
          "Ran out of Scale deployments and the churner is not creating new ones fast enough."
        );
        throw new Error(
          "No Scale deployments available. Try again in a few minutes when the churner has created new deployments."
        );
      } else {
        throw new Error(
          "No Scale deployments available and account churner is disabled (possible IP ban or signup rate limit)."
        );
      }
    }

    // Unlike other providers, Scale doesn't want to rotate keys. Instead, we
    // want to use the same key for as long as possible while building up a
    // reserve of new accounts. Once an account dies there should be a fresh
    // one ready to go.

    const now = Date.now();

    const deploymentsByPriority = availableDeployments.sort((a, b) => {
      return a.createdAt - b.createdAt;
    });

    const selectedKey = deploymentsByPriority[0];
    selectedKey.lastUsed = now;
    return { ...selectedKey };
  }

  public disable(deployment: ScaleDeployment) {
    const deploymentFromPool = this.deployments.find(
      (d) => d.hash === deployment.hash
    );
    if (!deploymentFromPool || deploymentFromPool.isDisabled) return;
    deploymentFromPool.isDisabled = true;
    this.log.warn({ key: deployment.hash }, "Scale deployment disabled");
  }

  public update(hash: string, update: Partial<ScaleDeployment>) {
    const deploymentFromPool = this.deployments.find((d) => d.hash === hash)!;
    Object.assign(deploymentFromPool, update);
  }

  public available() {
    return this.deployments.filter((k) => !k.isDisabled).length;
  }

  // Normally this would return the number of unchecked keys but we will
  // repurpose it to return the number of pending accounts the churner is
  // creating.
  public anyUnchecked() {
    return config.scaleMinDeployments - this.available() > 0;
  }

  public incrementPrompt(hash?: string) {
    const deployment = this.deployments.find((d) => d.hash === hash);
    if (!deployment) return;
    deployment.promptCount++;
  }

  public getLockoutPeriod(_model: unknown) {
    // TODO: Scale doesn't have rate limits but this may need to be repurposed
    // to lock out the request queue if the account churner is enabled but
    // falling behind.
    return 0;
  }

  public markRateLimited(keyHash: string) {
    // Do nothing
  }

  /** Doesn't really mean anything for Scale */
  public remainingQuota() {
    return 1;
  }

  public usageInUsd() {
    return "$0.00 / ∞";
  }
}
@@ -0,0 +1,226 @@
import Airtable from "airtable";
import axios, { AxiosError } from "axios";
import { config } from "../../config";
import { logger } from "../../logger";
import { PromptLogBackend, PromptLogEntry } from "..";

type AirbaseFieldType =
  | "singleLineText"
  | "multilineText"
  | "number"
  | "dateTime";

type IndexRecord = {
  /** Name of the base */
  id: string;
  /** Schema version of the base */
  schema: 1;
  /** Last row number used */
  lastRow: number;
  /** When the base was created. ISO 8601 format. */
  created: string;
};

const INDEX_BASE_NAME = "oai-proxy-index";

export class AirtableBackend implements PromptLogBackend {
  private log = logger.child({ module: "airtable" });
  private airtable: Airtable;
  private indexBase: Airtable.Base | null = null;
  private indexTable: Airtable.Table<IndexRecord> | null = null;
  private activeLogBase: Airtable.Base | null = null;
  private activeLogTable: Airtable.Table<PromptLogEntry> | null = null;

  constructor() {
    this.airtable = new Airtable({
      apiKey: config.airtableKey,
      requestTimeout: 1000 * 60 * 1,
    });
  }

  async init() {
    this.log.info("Initializing Airtable backend...");
    await this.ensureIndexBase();
    await this.ensureLogBase();
  }

  private async ensureIndexBase() {
    const bases = await this.listBases();
    const indexBaseId = bases.find((b) => b.name === INDEX_BASE_NAME)?.id;
    if (!indexBaseId) {
      this.log.info("Creating index base.");
      const result = await this.createBase(INDEX_BASE_NAME, [
        { name: "id", type: "singleLineText" },
        { name: "schema", type: "number" },
        { name: "lastRow", type: "number" },
        { name: "created", type: "dateTime" },
      ]);
      this.log.info("Index base created.");
      this.indexBase = this.airtable.base(result);
      this.indexTable = this.indexBase.table<IndexRecord>(INDEX_BASE_NAME);
    } else {
      this.log.info("Index base already exists.");
      this.indexBase = this.airtable.base(indexBaseId);
      this.indexTable = this.indexBase.table<IndexRecord>(INDEX_BASE_NAME);
    }
  }

  /**
   * Sets the active log base to the newest one in the index, unless there are
   * no bases or the newest one is already full. Creates a new base if needed.
   */
  private async ensureLogBase() {
    const indexRecords = await this.indexTable!.select().all();
    if (indexRecords.length === 0) {
      this.log.info("No log bases found, creating a new one.");
      await this.createLogBase();
    } else {
      const newestBase = indexRecords.reduce((a, b) => {
        const aDate = new Date(a.get("created"));
        const bDate = new Date(b.get("created"));
        return aDate > bDate ? a : b;
      });
      const lastRow = newestBase.get("lastRow");
      if (lastRow >= 1000) {
        this.log.info({ lastRow }, "Last log base is full, creating a new one.");
        await this.createLogBase();
      } else if (this.activeLogBase === null) {
        const newestBaseId = newestBase.get("id");
        this.log.info({ activeLogBase: newestBaseId }, "Setting active log base.");
        this.activeLogBase = this.airtable.base(newestBaseId);
        this.activeLogTable =
          this.activeLogBase.table<PromptLogEntry>(newestBaseId);
      } else {
        this.log.debug("Active log base already set.");
      }
    }
  }

  private async createLogBase() {
    const indexRecords = await this.indexTable!.select().all();
    const baseCount = indexRecords.length;
    const baseName = `oai-proxy-log-${baseCount.toString().padStart(3, "0")}`;
    this.log.info({ baseName }, "Creating new log base.");

    const newBaseId = await this.createBase(baseName, [
      { name: "model", type: "singleLineText" },
      { name: "endpoint", type: "singleLineText" },
      { name: "promptRaw", type: "multilineText" },
      { name: "prompt", type: "multilineText" },
      { name: "response", type: "multilineText" },
    ]);
    this.activeLogBase = this.airtable.base(newBaseId);
    this.activeLogTable = this.activeLogBase.table<PromptLogEntry>(baseName);
    this.log.info({ baseName }, "New log base created and activated.");
    await this.indexTable!.create([
      {
        fields: {
          id: newBaseId,
          schema: 1,
          lastRow: 0,
          created: new Date().toISOString(),
        },
      },
    ]);
    this.log.info({ baseName }, "New log base added to index.");
  }

  /**
   * Appends a batch of entries to the log and updates the index. If the log
   * has reached its maximum size, a new log base will be created.
   */
  async appendBatch(entries: PromptLogEntry[]) {
    if (!this.activeLogBase || !this.activeLogTable) {
      throw new Error("No active log base.");
    }
    // Airtable can only create 10 rows at a time, so we have to chunk it.
    const chunkSize = 10;
    const chunks = [];
    for (let i = 0; i < entries.length; i += chunkSize) {
      chunks.push(entries.slice(i, i + chunkSize));
    }
    this.log.info(
      { batchSize: entries.length, chunks: chunks.length },
      "Appending batch of log entries."
    );
    for (const chunk of chunks) {
      const records = chunk.map((entry) => ({
        fields: {
          model: entry.model,
          endpoint: entry.endpoint,
          promptRaw: entry.promptRaw,
          prompt: entry.promptFlattened,
          response: entry.response,
        },
      }));
      await this.activeLogTable.create(records);
      this.log.info({ count: records.length }, "Submitted chunk of log entries.");
    }
    await this.syncIndex();
    await this.ensureLogBase();
  }

  async syncIndex() {
    if (!this.activeLogBase || !this.activeLogTable) {
      throw new Error("No active log base.");
    }
    const logRecords = await this.activeLogTable.select().all();
    const logCount = logRecords.length;
    // Update the index with the new row count, by the active log base ID.
    const indexRecords = await this.indexTable!.select({
      filterByFormula: `{id} = "${this.activeLogBase.getId()}"`,
    }).all();
    if (indexRecords.length !== 1) {
      throw new Error("Index record not found.");
    }
    const indexRecord = indexRecords[0];
    await this.indexTable!.update([
      { id: indexRecord.id, fields: { lastRow: logCount } },
    ]);
  }

  // The airtable library doesn't support meta operations like listing or
  // creating bases, so we have to do that ourselves.

  /**
   * Lists all bases in the workspace.
   * @returns Array of base objects with `id` and `name` properties.
   */
  private async listBases(): Promise<{ id: string; name: string }[]> {
    // Maximum page size is 1000 but I'm not going to bother with that for now.
    const url = `https://api.airtable.com/v0/meta/bases`;
    const response = await axios.get(url, {
      headers: { Authorization: `Bearer ${config.airtableKey}` },
    });
    return response.data.bases;
  }

  /**
   * Creates a new base with the given name and table schema. Table will be
   * created with the same name as the base.
   * Schema is a list of fields, each of which has a name and type. Only a
   * subset of field types are supported.
   * Returns the id of the new base.
   */
  private async createBase(
    name: string,
    fields: { name: string; type: AirbaseFieldType }[]
  ) {
    const url = `https://api.airtable.com/v0/meta/bases`;
    const response = await axios.post(
      url,
      { name, tables: [{ name, fields }] },
      { headers: { Authorization: `Bearer ${config.airtableKey}` } }
    );
    return response.data.id;
  }
}
@@ -1 +1,19 @@
-export * as sheets from "./sheets";
+import { config } from "../../config";
+import { PromptLogBackend } from "..";
+import { AirtableBackend } from "./airtable";
+import { sheets } from "./sheets";
+
+export const createPromptLogBackend = (
+  backend: NonNullable<typeof config.promptLoggingBackend>
+): PromptLogBackend => {
+  switch (backend) {
+    case "google_sheets":
+      // Sheets backend is just a module, though it has a bunch of state so it
+      // should probably be a class just like the Airtable backend.
+      return sheets;
+    case "airtable":
+      return new AirtableBackend();
+    default:
+      throw new Error(`Unknown log backend: ${backend}`);
+  }
+};
@@ -10,7 +10,7 @@ import type { CredentialBody } from "google-auth-library";
 import type { GaxiosResponse } from "googleapis-common";
 import { config } from "../../config";
 import { logger } from "../../logger";
-import { PromptLogEntry } from "..";
+import { PromptLogBackend, PromptLogEntry } from "..";
 
 // There is always a sheet called __index__ which contains a list of all the
 // other sheets. We use this rather than iterating over all the sheets in case
@@ -240,7 +240,7 @@ const createLogSheet = async () => {
   activeLogSheet = { sheetName, rows: [] };
 };
 
-export const appendBatch = async (batch: PromptLogEntry[]) => {
+const appendBatch = async (batch: PromptLogEntry[]) => {
   if (!activeLogSheet) {
     // Create a new log sheet if we don't have one yet.
     await createLogSheet();
@@ -310,40 +310,7 @@ const finalizeBatch = async () => {
   log.info({ sheetName, rowCount }, "Batch finalized.");
 };
 
-type LoadLogSheetArgs = {
-  sheetName: string;
-  /** The starting row to load. If omitted, loads all rows (expensive). */
-  fromRow?: number;
-};
-
-/** Not currently used. */
-export const loadLogSheet = async ({
-  sheetName,
-  fromRow = 2, // omit header row
-}: LoadLogSheetArgs) => {
-  const client = sheetsClient!;
-  const spreadsheetId = config.googleSheetsSpreadsheetId!;
-
-  const range = `${sheetName}!A${fromRow}:E`;
-  const res = await client.spreadsheets.values.get({
-    spreadsheetId: spreadsheetId,
-    range,
-  });
-  const data = assertData(res);
-  const values = data.values || [];
-  const rows = values.slice(1).map((row) => {
-    return {
-      model: row[0],
-      endpoint: row[1],
-      promptRaw: row[2],
-      promptFlattened: row[3],
-      response: row[4],
-    };
-  });
-  activeLogSheet = { sheetName, rows };
-};
-
-export const init = async (onStop: () => void) => {
+const init = async (onStop: () => void) => {
   if (sheetsClient) {
     return;
   }
@@ -420,3 +387,5 @@ function assertData<T = sheets_v4.Schema$ValueRange>(res: GaxiosResponse<T>) {
   }
   return res.data!;
 }
+
+export const sheets = { init, appendBatch };
@@ -6,7 +6,7 @@ database for now.
 Due to the limitations of Google Sheets, we'll queue up log entries and flush
 them to the API periodically. */
 
-export interface PromptLogEntry {
+export type PromptLogEntry = {
   model: string;
   endpoint: string;
   /** JSON prompt passed to the model */
@@ -15,6 +15,11 @@ export interface PromptLogEntry
   promptFlattened: string;
   response: string;
   // TODO: temperature, top_p, top_k, etc.
 };
+
+export interface PromptLogBackend {
+  init(onStop: () => void): Promise<void>;
+  appendBatch(entries: PromptLogEntry[]): Promise<void>;
+}
 
 export * as logQueue from "./log-queue";
@@ -1,9 +1,9 @@
/* Queues incoming prompts/responses and periodically flushes them to configured
 * logging backend. */

import { config } from "../config";
import { logger } from "../logger";
import { PromptLogEntry } from ".";
import { sheets } from "./backends";
import { PromptLogBackend, PromptLogEntry } from ".";
import { createPromptLogBackend } from "./backends";

const FLUSH_INTERVAL = 1000 * 10;
const MAX_BATCH_SIZE = 25;
@@ -11,11 +11,19 @@ const MAX_BATCH_SIZE = 25;
const queue: PromptLogEntry[] = [];
const log = logger.child({ module: "log-queue" });

let activeBackend: PromptLogBackend | null = null;
let started = false;
let timeoutId: NodeJS.Timeout | null = null;
let retrying = false;
let consecutiveFailedBatches = 0;

const getBackend = () => {
  if (!activeBackend) {
    throw new Error("Log queue not initialized.");
  }
  return activeBackend;
};

export const enqueue = (payload: PromptLogEntry) => {
  if (!started) {
    log.warn("Log queue not started, discarding incoming log entry.");
@@ -34,7 +42,7 @@ export const flush = async () => {
  const nextBatch = queue.splice(0, batchSize);
  log.info({ size: nextBatch.length }, "Submitting new batch.");
  try {
    await sheets.appendBatch(nextBatch);
    await getBackend().appendBatch(nextBatch);
    retrying = false;
    consecutiveFailedBatches = 0;
  } catch (e: any) {
@@ -65,7 +73,13 @@ export const flush = async () => {

export const start = async () => {
  try {
    await sheets.init(() => stop());
    const selectedBackend = config.promptLoggingBackend;
    if (!selectedBackend) {
      throw new Error("No logging backend configured.");
    }

    activeBackend = createPromptLogBackend(selectedBackend);
    await getBackend().init(() => stop());
    log.info("Logging backend initialized.");
    started = true;
  } catch (e) {

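The batching logic above (splice up to `MAX_BATCH_SIZE` entries off the queue and hand them to the backend) can be sketched in isolation; the `flushed` array here is a stand-in for the real backend call:

```typescript
// Standalone sketch of the queue/flush pattern from log-queue. The real code
// awaits getBackend().appendBatch(nextBatch) and reschedules itself on a timer.
const MAX_BATCH_SIZE = 25;
const queue: string[] = [];
const flushed: string[][] = [];

function enqueue(entry: string): void {
  queue.push(entry);
}

async function flush(): Promise<void> {
  if (queue.length === 0) return;
  const batchSize = Math.min(MAX_BATCH_SIZE, queue.length);
  // splice both returns the batch and removes it from the queue
  const nextBatch = queue.splice(0, batchSize);
  flushed.push(nextBatch); // stand-in for the backend call
}
```

Capping the batch size keeps each flush within whatever per-request limits the backing store imposes, at the cost of needing multiple flush cycles to drain a large backlog.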
@@ -9,12 +9,10 @@ import { handleProxyError } from "./middleware/common";
import {
  addKey,
  addAnthropicPreamble,
  blockZoomerOrigins,
  createPreprocessorMiddleware,
  finalizeBody,
  languageFilter,
  limitOutputTokens,
  removeOriginHeaders,
} from "./middleware/request";
import {
  ProxyResHandlerWithBody,
@@ -75,8 +73,6 @@ const rewriteAnthropicRequest = (
  addAnthropicPreamble,
  languageFilter,
  limitOutputTokens,
  blockZoomerOrigins,
  removeOriginHeaders,
  finalizeBody,
];

@@ -2,6 +2,7 @@ import { Request, Response } from "express";
import httpProxy from "http-proxy";
import { ZodError } from "zod";

const OPENAI_CHAT_COMPLETION_ENDPOINT = "/v1/chat/completions";
const ANTHROPIC_COMPLETION_ENDPOINT = "/v1/complete";

@@ -31,14 +32,9 @@ export function writeErrorResponse(
    res.headersSent ||
    res.getHeader("content-type") === "text/event-stream"
  ) {
    const errorContent =
      statusCode === 403
        ? JSON.stringify(errorPayload)
        : JSON.stringify(errorPayload, null, 2);

    const msg = buildFakeSseMessage(
      `${errorSource} error (${statusCode})`,
      errorContent,
      JSON.stringify(errorPayload, null, 2),
      req
    );
    res.write(msg);
@@ -61,7 +57,6 @@ export const handleInternalError = (
) => {
  try {
    const isZod = err instanceof ZodError;
    const isForbidden = err.name === "ForbiddenError";
    if (isZod) {
      writeErrorResponse(req, res, 400, {
        error: {
@@ -72,17 +67,6 @@ export const handleInternalError = (
          message: err.message,
        },
      });
    } else if (isForbidden) {
      // Spoofs a vaguely threatening OpenAI error message. Only invoked by the
      // block-zoomers rewriter to scare off tiktokers.
      writeErrorResponse(req, res, 403, {
        error: {
          type: "organization_account_disabled",
          code: "policy_violation",
          param: null,
          message: err.message,
        },
      });
    } else {
      writeErrorResponse(req, res, 500, {
        error: {
@@ -107,14 +91,10 @@ export function buildFakeSseMessage(
  req: Request
) {
  let fakeEvent;
  const useBackticks = !type.includes("403");
  const msgContent = useBackticks
    ? `\`\`\`\n[${type}: ${string}]\n\`\`\`\n`
    : `[${type}: ${string}]`;

  if (req.inboundApi === "anthropic") {
    fakeEvent = {
      completion: msgContent,
      completion: `\`\`\`\n[${type}: ${string}]\n\`\`\`\n`,
      stop_reason: type,
      truncated: false, // I've never seen this be true
      stop: null,
@@ -129,7 +109,7 @@ export function buildFakeSseMessage(
      model: req.body?.model,
      choices: [
        {
          delta: { content: msgContent },
          delta: { content: `\`\`\`\n[${type}: ${string}]\n\`\`\`\n` },
          index: 0,
          finish_reason: type,
        },

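As a sketch of what `buildFakeSseMessage` produces for the OpenAI-style stream, the fake chunk is serialized as a standard `data:` SSE frame. The field values below are illustrative; the real function also handles the Anthropic event shape and the 403-specific formatting shown above.

```typescript
// Illustrative serialization of a fake OpenAI streaming chunk as an SSE frame.
// The id is a placeholder, not a real completion id.
function fakeSseFrame(type: string, message: string): string {
  const fakeEvent = {
    id: "chatcmpl-0000000000",
    object: "chat.completion.chunk",
    choices: [
      {
        delta: { content: `\`\`\`\n[${type}: ${message}]\n\`\`\`\n` },
        index: 0,
        finish_reason: type,
      },
    ],
  };
  // SSE frames are "data: <payload>" terminated by a blank line.
  return `data: ${JSON.stringify(fakeEvent)}\n\n`;
}
```

Because the payload mimics a normal streaming chunk, a client consuming the event stream renders the error text inline as if the model had generated it.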
@@ -1,34 +0,0 @@
import { isCompletionRequest } from "../common";
import { ProxyRequestMiddleware } from ".";

const DISALLOWED_ORIGIN_SUBSTRINGS = "janitorai.com,janitor.ai".split(",");

class ForbiddenError extends Error {
  constructor(message: string) {
    super(message);
    this.name = "ForbiddenError";
  }
}

/**
 * Blocks requests from Janitor AI users with a fake, scary error message so I
 * stop getting emails asking for tech support.
 */
export const blockZoomerOrigins: ProxyRequestMiddleware = (_proxyReq, req) => {
  if (!isCompletionRequest(req)) {
    return;
  }

  const origin = req.headers.origin || req.headers.referer;
  if (origin && DISALLOWED_ORIGIN_SUBSTRINGS.some((s) => origin.includes(s))) {
    // Venus-derivatives send a test prompt to check if the proxy is working.
    // We don't want to block that just yet.
    if (req.body.messages[0]?.content === "Just say TEST") {
      return;
    }

    throw new ForbiddenError(
      `Your access was terminated due to violation of our policies, please check your email for more information. If you believe this is in error and would like to appeal, please contact us through our help center at help.openai.com.`
    );
  }
};
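The origin check in the removed middleware reduces to a substring match against the Origin/Referer header; a self-contained version of just that predicate:

```typescript
// Mirrors the removed blockZoomerOrigins check: true when an Origin or
// Referer header value matches any disallowed substring.
const DISALLOWED_ORIGIN_SUBSTRINGS = "janitorai.com,janitor.ai".split(",");

function isBlockedOrigin(origin: string | undefined): boolean {
  return !!origin && DISALLOWED_ORIGIN_SUBSTRINGS.some((s) => origin.includes(s));
}
```

Substring matching (rather than exact host comparison) also catches subdomains and full referer URLs, at the cost of blocking any origin that merely contains the string.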
@@ -10,12 +10,10 @@ export { transformOutboundPayload } from "./transform-outbound-payload";
// HPM middleware (runs on onProxyReq, cannot be async)
export { addKey } from "./add-key";
export { addAnthropicPreamble } from "./add-anthropic-preamble";
export { blockZoomerOrigins } from "./block-zoomer-origins";
export { finalizeBody } from "./finalize-body";
export { languageFilter } from "./language-filter";
export { limitCompletions } from "./limit-completions";
export { limitOutputTokens } from "./limit-output-tokens";
export { removeOriginHeaders } from "./remove-origin-headers";
export { transformKoboldPayload } from "./transform-kobold-payload";

/**

@@ -1,10 +0,0 @@
import { ProxyRequestMiddleware } from ".";

/**
 * Removes origin and referer headers before sending the request to the API for
 * privacy reasons.
 **/
export const removeOriginHeaders: ProxyRequestMiddleware = (proxyReq) => {
  proxyReq.setHeader("origin", "");
  proxyReq.setHeader("referer", "");
};
@@ -99,13 +99,6 @@ function openaiToAnthropic(body: any, req: Request) {
    throw result.error;
  }

  // Anthropic has started versioning their API, indicated by an HTTP header
  // `anthropic-version`. The new June 2023 version is not backwards compatible
  // with our OpenAI-to-Anthropic transformations so we need to explicitly
  // request the older version for now. 2023-01-01 will be removed in September.
  // https://docs.anthropic.com/claude/reference/versioning
  req.headers["anthropic-version"] = "2023-01-01";

  const { messages, ...rest } = result.data;
  const prompt =
    result.data.messages

@@ -9,13 +9,11 @@ import { ipLimiter } from "./rate-limit";
import { handleProxyError } from "./middleware/common";
import {
  addKey,
  blockZoomerOrigins,
  createPreprocessorMiddleware,
  finalizeBody,
  languageFilter,
  limitCompletions,
  limitOutputTokens,
  removeOriginHeaders,
} from "./middleware/request";
import {
  createOnProxyResHandler,
@@ -30,19 +28,13 @@ function getModelsResponse() {
  return modelsCache;
}

// https://platform.openai.com/docs/models/overview
const gptVariants = [
  "gpt-4",
  "gpt-4-0613",
  "gpt-4-0314", // EOL 2023-09-13
  "gpt-4-0314",
  "gpt-4-32k",
  "gpt-4-32k-0613",
  "gpt-4-32k-0314", // EOL 2023-09-13
  "gpt-4-32k-0314",
  "gpt-3.5-turbo",
  "gpt-3.5-turbo-0301", // EOL 2023-09-13
  "gpt-3.5-turbo-0613",
  "gpt-3.5-turbo-16k",
  "gpt-3.5-turbo-16k-0613",
  "gpt-3.5-turbo-0301",
];

const gpt4Available = keyPool.list().filter((key) => {
@@ -95,8 +87,6 @@ const rewriteRequest = (
  languageFilter,
  limitOutputTokens,
  limitCompletions,
  blockZoomerOrigins,
  removeOriginHeaders,
  finalizeBody,
];

@@ -197,8 +197,8 @@ async function setBuildInfo() {
    logger.error(
      {
        error,
        stdout: error.stdout?.toString(),
        stderr: error.stderr?.toString(),
        stdout: error.stdout.toString(),
        stderr: error.stderr.toString(),
      },
      "Failed to get commit SHA.",
      error
