Per-user token quotas and automatic quota refreshing (khanon/oai-reverse-proxy!37)
@@ -12,12 +12,12 @@ This repository can be deployed to a [Huggingface Space](https://huggingface.co/
- Provide a name for your Space and select "Docker" as the SDK. Select "Blank" for the template.
- Click "Create Space" and wait for the Space to be created.

### 3. Create an empty Dockerfile
- Once your Space is created, you'll see an option to "Create the Dockerfile in your browser". Click that link.

- Paste the following into the text editor and click "Save".
```dockerfile
FROM node:18-bullseye-slim
@@ -34,7 +34,7 @@ CMD [ "npm", "start" ]
```
- Click "Commit new file to `main`" to save the Dockerfile.

### 4. Set your API key as a secret
- Click the Settings button in the top right corner of your repository.
@@ -82,8 +82,6 @@ MAX_OUTPUT_TOKENS_ANTHROPIC=512
# Block prompts containing disallowed characters
REJECT_DISALLOWED=false
REJECT_MESSAGE="This content violates /aicg/'s acceptable use policy."
# Show exact quota usage on the Server Info page
QUOTA_DISPLAY_MODE=full
```

See `.env.example` for a full list of available settings, or check `config.ts` for details on what each setting does.

@@ -11,6 +11,7 @@ Several of these features require you to set secrets in your environment. If us
- [Memory](#memory)
- [Firebase Realtime Database](#firebase-realtime-database)
- [Firebase setup instructions](#firebase-setup-instructions)
- [User quotas](#user-quotas)

## No user management (`GATEKEEPER=none`)

@@ -59,3 +60,5 @@ To use Firebase Realtime Database to persist user data, set the following enviro
8. Set `GATEKEEPER_STORE` to `firebase_rtdb` in your environment if you haven't already.

The proxy server will attempt to connect to your Firebase Realtime Database at startup and will throw an error if it cannot connect. If you see this error, check that your `FIREBASE_RTDB_URL` and `FIREBASE_KEY` secrets are set correctly.

# User quotas

@@ -0,0 +1,36 @@
# User Quotas

When using `user_token` authentication, you can set per-model token quotas for each user. These quotas are enforced by the proxy server and are separate from the quotas enforced by OpenAI.

You can set the default quotas via environment variables. Quotas are enforced on a per-model basis and count both prompt tokens and completion tokens. By default, all quotas are disabled.

Set the following environment variables to configure the default quotas:
- `TOKEN_QUOTA_TURBO`
- `TOKEN_QUOTA_GPT4`
- `TOKEN_QUOTA_CLAUDE`

Quotas only apply to `normal`-type users; `special`-type users are exempt from quotas. You can change users' types via the REST API.
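
To make the rules above concrete, here is a minimal sketch of a per-model quota check. It is not taken from the proxy's source: the `User` shape, the `hasAvailableQuota` and `recordUsage` helpers, and the convention that a limit of `0` means "quota disabled" are all assumptions made for illustration.

```typescript
// Hypothetical types for illustration; the proxy's real user model may differ.
type ModelFamily = "turbo" | "gpt4" | "claude";
type UserType = "normal" | "special";

interface User {
  type: UserType;
  // Tokens consumed so far (prompt + completion), per model family.
  tokenCounts: Record<ModelFamily, number>;
  // Per-model quota. Assumption: 0 means no quota is enforced.
  tokenLimits: Record<ModelFamily, number>;
}

// Returns true if the user may spend `requestedTokens` on the given model family.
function hasAvailableQuota(
  user: User,
  family: ModelFamily,
  requestedTokens: number
): boolean {
  // `special`-type users are exempt from quotas.
  if (user.type === "special") return true;
  const limit = user.tokenLimits[family];
  // Assumption: a non-positive limit means the quota is disabled.
  if (limit <= 0) return true;
  return user.tokenCounts[family] + requestedTokens <= limit;
}

// Both prompt tokens and completion tokens count against the quota.
function recordUsage(
  user: User,
  family: ModelFamily,
  promptTokens: number,
  completionTokens: number
): void {
  user.tokenCounts[family] += promptTokens + completionTokens;
}
```

In this sketch each model family is tracked independently, so exhausting the `turbo` quota has no effect on `gpt4` or `claude`.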

**Note that changes to these environment variables will only apply to newly created users.** To modify existing users' quotas, use the REST API or the admin UI.

## Automatically refreshing quotas

You can use the `QUOTA_REFRESH_PERIOD` environment variable to automatically refresh users' quotas periodically. This is useful if you want to give users a certain number of tokens per day, for example. The entire quota will be refreshed at the start of the specified period, and any tokens a user has not used will not be carried over.

Quotas for all models and users will be refreshed. If you haven't set `TOKEN_QUOTA_*` for a particular model, quotas for that model will not be refreshed (so any manually set quotas will not be overwritten).
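
The refresh behavior described above could be sketched roughly as follows. This is an illustration under stated assumptions, not the proxy's actual implementation: `QuotaUser`, `DefaultQuotas`, and `refreshQuotas` are hypothetical names, and the sketch assumes that a user's limit is stored as a ceiling on lifetime usage.

```typescript
// Hypothetical types for illustration; not the proxy's real data model.
type QuotaFamily = "turbo" | "gpt4" | "claude";

interface QuotaUser {
  // Lifetime tokens consumed per model family.
  tokenCounts: Record<QuotaFamily, number>;
  // Ceiling on lifetime usage per model family (assumption: 0 = no quota).
  tokenLimits: Record<QuotaFamily, number>;
}

// Defaults as they might be read from TOKEN_QUOTA_TURBO and friends.
// A family that is missing (or 0) is treated as "not configured".
type DefaultQuotas = Partial<Record<QuotaFamily, number>>;

const QUOTA_FAMILIES: QuotaFamily[] = ["turbo", "gpt4", "claude"];

function refreshQuotas(users: QuotaUser[], defaults: DefaultQuotas): void {
  for (const user of users) {
    for (const family of QUOTA_FAMILIES) {
      const quota = defaults[family];
      // No default configured: leave any manually set quota untouched.
      if (quota === undefined || quota <= 0) continue;
      // Grant a full fresh allowance on top of current usage. Because the
      // new limit is derived from usage rather than from the old limit,
      // unused tokens from the previous period are not carried over.
      user.tokenLimits[family] = user.tokenCounts[family] + quota;
    }
  }
}
```

For example, with a default `turbo` quota of 100, a user who has used 70 of their previous 100 tokens ends up with exactly 100 tokens available after the refresh, not 130.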

Set the `QUOTA_REFRESH_PERIOD` environment variable to one of the following values:
- `daily` (at midnight)
- `hourly`
- leave unset to disable automatic refreshing

You can also use a cron expression, for example:
- Every 45 seconds: `"*/45 * * * * *"` (six-field form with a leading seconds field; note that under standard cron step semantics `*/45` matches seconds 0 and 45 of each minute rather than a strict 45-second interval)
- Every 30 minutes: `"*/30 * * * *"`
- Every 6 hours: `"0 */6 * * *"`
- Every 3 days: `"0 0 */3 * *"` (at midnight on days 1, 4, 7, and so on of each month)
- Daily, but at mid-day: `"0 12 * * *"`

Make sure to enclose the cron expression in quotation marks.

All times are in the server's local time zone. Refer to [crontab.guru](https://crontab.guru/) for more examples.
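
As a rough illustration of how the accepted values differ, the sketch below classifies a `QUOTA_REFRESH_PERIOD` value. The `parseRefreshPeriod` function and `RefreshSchedule` type are hypothetical and not part of the proxy. Mapping `hourly` to the top of each hour is an assumption, and the sketch only counts fields; it does not validate the contents of each cron field.

```typescript
// Hypothetical result type for illustration.
type RefreshSchedule =
  | { kind: "disabled" }
  | { kind: "cron"; expression: string; hasSeconds: boolean };

function parseRefreshPeriod(value: string | undefined): RefreshSchedule {
  const trimmed = (value ?? "").trim();
  // Unset (or empty) disables automatic refreshing.
  if (trimmed === "") return { kind: "disabled" };
  // `daily` refreshes at midnight.
  if (trimmed === "daily") {
    return { kind: "cron", expression: "0 0 * * *", hasSeconds: false };
  }
  // Assumption: `hourly` fires at minute 0 of every hour.
  if (trimmed === "hourly") {
    return { kind: "cron", expression: "0 * * * *", hasSeconds: false };
  }
  // Otherwise treat the value as a cron expression. Five fields is the
  // classic form; six fields adds a leading seconds field.
  const fields = trimmed.split(/\s+/);
  if (fields.length !== 5 && fields.length !== 6) {
    throw new Error(`Expected 5 or 6 cron fields, got ${fields.length}`);
  }
  return {
    kind: "cron",
    expression: fields.join(" "),
    hasSeconds: fields.length === 6,
  };
}
```

Of the examples listed above, only `"*/45 * * * * *"` has six fields, so it is the only one this sketch would flag as using the seconds field.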