- Add gpt-5.1-2025-11-12 (GPT-5.1 Thinking) with adaptive reasoning
- Add gpt-5.1-chat-latest (GPT-5.1 Instant) with adaptive reasoning
- Both models include full feature set: chat, vision, function calling, JSON, prompt caching, reasoning, web search
- Pricing set to match GPT-5 (to be updated when official pricing is announced)
- Added models to manual ordering list for proper UI sorting
Co-authored-by: Enrico Ros <enricoros@users.noreply.github.com>
Add userPricing field to DLLM interface following the established pattern
for user overrides (similar to userContextTokens and userMaxOutputTokens).
This enables users to set custom pricing for local models (Ollama, LM Studio, etc.)
to track "what if" costs and compare with cloud models.
Changes:
- Added userPricing field to DLLM interface (llms.types.ts)
- Added getLLMPricing() getter function with override precedence
- Updated store to preserve userPricing during model updates
- Updated all llm.pricing access points to use getLLMPricing()
- Added pricing override UI in LLMOptionsModal (Details section)
- Input price ($/M tokens)
- Output price ($/M tokens)
- Reset buttons for each field
- Cost calculations automatically use user pricing when set
- Existing cost display in tooltips works with user pricing
Resolves#860
Co-authored-by: Enrico Ros <enricoros@users.noreply.github.com>
This refactor allows for low-level looping on the client side.
This can be used for network errors between server<>upstream reported as particles,
as well as for client<>server connections.
One special case of this is the OpenAI system to reattach to detached (background) requests,
or as an alternative to re-fetch them from the server once completed.
This refactor decomposes the chatGeneration procedure into composable blocks.
Allows for instance chatGeneration-like outputs from different inputs,
allowing for instance `resumability` of a background connection.
Moreover this reorganizes the phases of a CG operation, and includes a generic executor
that takes creator functions for Dispatchers.
- Read recognitionState.errorMessage in Telephone component
- Pass error message to CallStatus component
- Display specific error messages instead of generic fallback
- Matches error handling pattern used in Chat/Composer
This ensures users see detailed error messages instead of generic
Browser may not support text.
Fixes#829 by making speech recognition errors visible to users.
Co-authored-by: Enrico Ros <enricoros@users.noreply.github.com>
Fixes#850 - Model visibility was being reset after app updates.
Root cause: User visibility changes were stored in `hidden` field instead of
`userHidden`, but the preservation logic only looked for `userHidden`. This
caused user preferences to be lost during model updates.
Changes:
- Added isLLMHidden() helper to compute effective visibility (userHidden ?? hidden)
- Fixed all write paths to set userHidden instead of hidden (3 files)
- Fixed all read paths to use isLLMHidden() (7 files, 14 locations)
This ensures:
- User preferences persist across updates
- Vendor visibility changes still propagate for untouched models
- Bulk operations work correctly
Co-authored-by: Enrico Ros <enricoros@users.noreply.github.com>
NOTEs: this works without saving the server-side tool invocation and the subsequent responses
to AIX particles, and consequently to DMessageFragments of the opportune type.
-> Shall do it with execution graph fragments.
Implements comprehensive support for Anthropic's web search (web_search_20250305) and web fetch (web_fetch_20250910) tools.
- Add llmVndAntWebSearch and llmVndAntWebFetch parameters with ['auto', 'off'] options
- Enable tools for Claude 4.5, 4.1, 4, 3.7, 3.5 Sonnet/Haiku/Opus models (including thinking variants)
- Inject web_search_20250305 and web_fetch_20250910 tools based on parameter values
- Configure web search with max_uses=5 for progressive searches
- Configure web fetch with max_uses=5 and citations enabled
- Add dynamic beta header injection for web fetch (web-fetch-2025-09-10)
- Add UI controls in model settings for easy parameter configuration
Parser already supports web_search_tool_result and web_fetch_tool_result blocks (no changes needed).
🤖 Generated with [Claude Code](https://claude.com/claude-code)
Co-Authored-By: Enrico Ros <enricoros@users.noreply.github.com>
- Use console.warn for benign removeChild errors
- Skip PostHog reporting for these errors
- More succinct implementation
Co-authored-by: Enrico Ros <enricoros@users.noreply.github.com>
- Added parameter to GPT-5, GPT-5 Pro, GPT-5 Codex, and GPT-5 Mini
- These models require Organization ID verification for streaming
- Kept existing parameter on o3 and o3-pro models
- Did not modify GPT-5 Nano, GPT-5 Chat Latest, GPT-4.1, or GPT-4o models which work fine
Fixes#847
Co-authored-by: Enrico Ros <enricoros@users.noreply.github.com>
- Add nullish coalescing (??) after optional chaining to ensure string return
- Prevents undefined from propagating through the promise chain
- Fixes potential TypeError when calling .startsWith() on undefined
Co-authored-by: Enrico Ros <enricoros@users.noreply.github.com>
Note that display:grid was fitting to contents, but we prefer display:flex (direction:column)
so we had to make the maxWidth property from 700 to adaptive to the screen size.
Azure OpenAI doesn't support the web_search_preview tool, which was causing
"Hosted tool 'web_search_preview' is not supported" errors with GPT-5 models.
## Changes:
- Pass dialect information to aixToOpenAIResponses function
- Skip web_search_preview tool addition when dialect is 'azure'
- Add logging when web search is skipped for Azure
- Document known Azure limitations in implementation guide
## Impact:
- Fixes web browsing errors with Azure GPT-5 models
- Maintains web search functionality for regular OpenAI models
- Provides clear logging for debugging
This is a critical fix for Azure OpenAI compatibility as web search is not
currently supported on Azure's Responses API implementation.
🤖 Generated with [Claude Code](https://claude.ai/code)
Co-Authored-By: Claude <noreply@anthropic.com>
This commit addresses GitHub issue #828 by fixing URL construction for Azure OpenAI's Responses API
and preventing malformed URLs from client configuration issues.
## Problems Fixed:
1. Host normalization: Prevents malformed URLs when client config includes paths/queries
2. API paradigm support: Properly handles Azure's next-gen v1 Responses API
3. API version consistency: Centralizes version management with env overrides
## Key Changes:
- Normalize Azure host URLs to origin only (strip path/query)
- Prefer server environment variables over client-provided hosts
- Add special handling for Responses API (/openai/v1/responses)
- Support both traditional (deployment-based) and v1 API paradigms
- Add configurable API versions via environment variables
- Include debug logging for API paradigm selection
## New Environment Variables:
- AZURE_API_V1: Enable next-gen v1 API explicitly
- AZURE_RESPONSES_API_VERSION: Control Responses API version
- AZURE_CHAT_API_VERSION: Control Chat Completions API version
- AZURE_DEPLOYMENTS_API_VERSION: Control deployments listing API version
## Testing:
Validated with Azure OpenAI endpoint showing:
- List Deployments: ✅ Works
- Chat Completions: ✅ Works (with correct params for GPT-5)
- Responses API (v1): ✅ Works with /openai/v1/responses?api-version=preview
- Responses API (traditional): ❌ 404 (Azure doesn't support this pattern)
The fix defaults to using Azure's recommended next-gen v1 API for Responses
while maintaining backward compatibility for existing deployments.
🤖 Generated with [Claude Code](https://claude.ai/code)
Co-Authored-By: Claude <noreply@anthropic.com>
Warning: older browsers will ignore the entire CSS lines containing round() calls.
However we already introduced top-level layout rounds in 85e4946f (Fix fractional sizes of drawer and pane).
To restore support of old browsers, calls to 'round()' need to be stripped of the round part.
Background: all of a sudden Sharp started not building anymore with the following error message:
```
./public/images/covers/release-cover-v1.12.0.png
Error: Could not load the "sharp" module using the win32-x64 runtime
Possible solutions:
- Ensure optional dependencies can be installed:
npm install --include=optional sharp
- Ensure your package manager supports multi-platform installation:
See https://sharp.pixelplumbing.com/install#cross-platform
- Add platform-specific dependencies:
npm install --os=win32 --cpu=x64 sharp
- Consult the installation documentation:
See https://sharp.pixelplumbing.com/install
at Object.<anonymous> (PATH\node_modules\sharp\lib\sharp.js:113:9)
at Module._compile (node:internal/modules/cjs/loader:1730:14)
at Object..js (node:internal/modules/cjs/loader:1895:10)
at Module.load (node:internal/modules/cjs/loader:1465:32)
at Function._load (node:internal/modules/cjs/loader:1282:12)
at TracingChannel.traceSync (node:diagnostics_channel:322:14)
at wrapModuleLoad (node:internal/modules/cjs/loader:235:24)
at Module.<anonymous> (node:internal/modules/cjs/loader:1487:12)
at mod.require (PATH\node_modules\next\dist\server\require-hook.js:65:28)
at require (node:internal/modules/helpers:135:16)
```
This is without changing anything in the system nor in the build. May be a faulty env detection, and happens across all branches.
Deploying this and trying it out.
Suddenly all Vercel builds experienced exceptions and connection terminations.
On 2025-05-22 around 8PM CET, Vercel servers started to log errors on tRPC calls.
This fix waits 1 extra event loop tick. Shall work around the issue until a proper fix is found.
A new setting to export all the settings in localstorage and IndexedDB into
single 'flash' files for Big-AGI to reload.
This allows to quickly and easily migrate a full installation, including images,
from a v2-dev open installation to another.
This won't likely work across other branches, but it's meant to be forward-compatible.
Introducing the Events module with per-Domain extensibility.
Depends on @Logger.
Required eventemitter3.
A pleasure to extend, and start using both for Subsystems and AGI events.
When this property is set, a re-layout (force reflow) is performed by the browser even with a simple hovering of the separator.
Since we may have very large walls of text/images, we need to minimize relayouts, so for now, we set a min size on the contained scrollable container instead of preventing the resize.
See also this upstream issues: https://github.com/bvaughn/react-resizable-panels/issues/337
To turn it on, either|or:
- server side: aix.router.ts: DEBUG_LOG_PROFILER_ON_SERVER=true
- client side: DEV BUILD + "debug mode" + DEBUG_LOG_PROFILER_ON_CLIENT=true to show on the console
This requires EITHER:
- on the server-side, in aix.router.ts, set DEBUG_LOG_PROFILER=true;
- on the client side, and only for Development builds, this is automatic in "Debug Mode"
What's left:
- figure out how to turn this on/off
- figure out which models can or cannot use it, without having too much to maintain
- figure out the runtime implementation
- parse the annotation ranges
- render the original icons
- figure out how to escape the Vertex rewriting of URLs
Somehow the developer instruction is not enabled for Gemma3-IT, and we got this message:
"Gemini: Bad Request - Developer instruction is not enabled for models/gemma-3-27b-it"
So we convert any System message to a User message instead (see the hotfix)
Note: the ModelAux/reasoning block is only sent if there's a signature or there is redacted data.
We could even further reduce its sending to only Anthropic llms in CGR.
Note: if there aren't enough tokens in the chat, the Anthropic API will throw. Nothing to do there for now.
We will wait for Anthropic to step in and fix the issue before fixing something on our end that's clearly not our issue.
Verified on: Azure, Groq, LMStudio, LocalAI, Mistral, OpenAI, TogetherAI
FC not supported on: Deepseek, Perplexity
FC poorly supported on: OpenRouter (basically only in OpenAI it works)
Corrected the typo from "proyy" to "proxy" in the file `config-feature-browse.md`. This change addresses a small, but significant error in the configuration documentation.
This introduces a pre-build step on Next Build, which hides the files
in the app/api directory when the EXPORT_FRONTEND environment
variable is true-ish.
Hopefully there won't be disruption due to the post-processing step.
Also check https://github.com/vercel/next.js/issues/61213 for
upstream updates.
This is a good feature. The 'Delete All' will always operate on the current selection.
If a Folder restricts to 10 conversasions, and a search narrows it down to 3,
then the 3 will be deleted. This works really
well for quick cleanups.
description: Sync Anthropic API implementation with latest upstream documentation
argument-hint: specific feature to check
---
Please take a look at my API code for Anthropic: message wire types `src/modules/aix/server/dispatch/wiretypes/anthropic.wiretypes.ts`, assembly of the request messages (adapters) `src/modules/aix/server/dispatch/chatGenerate/adapters/anthropic.messageCreate.ts`, and parsing of the response in streaming or not `src/modules/aix/server/dispatch/chatGenerate/parsers/anthropic.parser.ts`.
IMPORTANT: we only support the Messages API (message create). We do NOT support other APIs such as the older Completions API.
We support Anthropic caching natively, and want to make sure tools and state (crafting the history) are also done well.
Then take a look at the newest API information available. Try these sources, and be creative if some are blocked:
- Recent news and announcements: Web Search for "anthropic api changelog" or "new claude api" or "new claude api pricing"
**If all blocked:** Explain what you attempted and ask user to provide documentation manually.
$ARGUMENTS
Check carefully and look if there are any discrepancies in the protocols, the available API surface, the structure of the messages, functionality, logic, etc.
Make sure you look deep in the fields of the requests and responses, especially required fields, streaming event types, and any new response shapes.
Please point out all of the differences in the API whether it's in the final parsing and reassembly of the streaming message, or the protocol changed, etc.
Prioritize breaking changes and new capabilities that would improve the user experience.
description: Sync Google Gemini API implementation with latest upstream documentation
argument-hint: specific feature to check
---
Please take a look at my API code for Google Gemini: message wire types `src/modules/aix/server/dispatch/wiretypes/gemini.wiretypes.ts`, assembly of the request messages (adapters) `src/modules/aix/server/dispatch/chatGenerate/adapters/gemini.generateContent.ts`, and parsing of the response in streaming or not `src/modules/aix/server/dispatch/chatGenerate/parsers/gemini.parser.ts`.
IMPORTANT: we only support the generateContent API, not other Gemini APIs such as embeddings, etc.
Caching is only supported when implicit, we do not explicitly manage Gemini Caches. Same for file uploads and other systems.
Image generation happens through models, i.e. 'Gemini 2.5 Flash - Nano Banana' generates images using AIX from generateContent (chat input).
Then take a look at the newest API information available. Try these sources, and be creative if some are blocked:
**Primary Sources:**
- Docs API 1/2: https://ai.google.dev/api/generate-content
- Docs API 2/2: https://ai.google.dev/api/caching#Content
- Google AI JavaScript SDK: https://github.com/googleapis/js-genai (check latest commits, README, type definitions)
Recent news and announcements: Web Search for "gemini api changelog" or "nwe gemini api updates" or "new gemini api pricing"
**If all blocked:** Explain what you attempted and ask user to provide documentation manually.
$ARGUMENTS
Check carefully and look if there are any discrepancies in the protocols, the available API surface, the structure of the messages, functionality, logic, etc.
Make sure you look deep in the fields of the requests and responses, especially required fields, streaming event types, and any new response shapes.
Please point out all of the differences in the API whether it's in the final parsing and reassembly of the streaming message, or the protocol changed, etc.
Prioritize breaking changes and new capabilities that would improve the user experience.
description: Sync OpenAI API implementation with latest upstream documentation
argument-hint: specific feature to check
---
Please take a look at my API code for OpenAI: message wire types `src/modules/aix/server/dispatch/wiretypes/openai.wiretypes.ts`, assembly of the request messages (adapters) `src/modules/aix/server/dispatch/chatGenerate/adapters/openai.chatCompletions.ts`, and parsing of the response in streaming or not `src/modules/aix/server/dispatch/chatGenerate/parsers/openai.parser.ts`.
IMPORTANT: we prioritize the new Responses API, while Chat Completions is still supported but legacy.
We do NOT support other APIs such as Realtime (incl. websockets), etc.
We also do not support Agentic APIs (Agent SDK, AgentKit, ChatKit, Assistants API etc), as we perform similar functionality in AIX (server or client side).
Then take a look at the newest API information available. Try these sources, and be creative if some are blocked:
**Primary Sources:**
- Responses API (AIX prioritizes it): https://platform.openai.com/docs/api-reference/responses/create
Recent news and announcements: Web Search for "openai api changelog" or "openai new models" or "openai new prices"
**If all blocked:** Explain what you attempted and ask user to provide documentation manually.
$ARGUMENTS
Check carefully and look if there are any discrepancies in the protocols, the available API surface, the structure of the messages, functionality, logic, etc.
Make sure you look deep in the fields of the requests and responses, especially required fields, streaming event types, and any new response shapes.
Please point out all of the differences in the API whether it's in the final parsing and reassembly of the streaming message, or the protocol changed, etc.
Prioritize breaking changes and new capabilities that would improve the user experience.
GOAL: Ensure complete support for OpenRouter's API including advanced features like reasoning/thinking tokens, tool use, search integration, and multi-modal capabilities. OpenRouter is OpenAI-compatible but has important extensions and differences.
Use Task tool with subagent_type=Explore and thoroughness="very thorough" to discover:
1. Map API structure - all endpoints, parameters, capabilities from https://openrouter.ai/docs
2.**Advanced features** - How to use: reasoning/thinking tokens (o1, DeepSeek R1), tool use/function calling, search integration, multi-modal (vision/audio)
3. Changelog location - How does OpenRouter communicate API updates and breaking changes?
4. Model metadata - What capabilities are exposed in the models list API? How to detect feature support?
description: Update Alibaba model definitions with latest pricing and capabilities
---
Update `src/modules/llms/server/openai/models/alibaba.models.ts` with latest model definitions.
Reference `src/modules/llms/server/llm.server.types.ts` and `src/modules/llms/server/models.mappings.ts` for context only. Focus on the model file, do not descend into other code.
- Search "alibaba model studio latest pricing", "alibaba latest models", "qwen models pricing", or search GitHub for latest model prices and context windows
**Important:**
- Review the full model list for additions, removals, and price changes
- Minimize whitespace/comment changes, focus on content
description: Update Anthropic model definitions with latest pricing and capabilities
---
Update `src/modules/llms/server/anthropic/anthropic.models.ts` with latest model definitions.
Reference `src/modules/llms/server/llm.server.types.ts` and `src/modules/llms/server/models.mappings.ts` for context only. Focus on the model file, do not descend into other code.
**Fallbacks if blocked:** Check Anthropic TypeScript SDK at https://github.com/anthropics/anthropic-sdk-typescript, search "anthropic models latest pricing", "anthropic latest models", or search GitHub for latest model prices and context windows
**Important:**
- Review the full model list for additions, removals, and price changes
- Minimize whitespace/comment changes, focus on content
description: Update DeepSeek model definitions with latest pricing and capabilities
---
Update `src/modules/llms/server/openai/models/deepseek.models.ts` with latest model definitions.
Reference `src/modules/llms/server/llm.server.types.ts` and `src/modules/llms/server/models.mappings.ts` for context only. Focus on the model file, do not descend into other code.
- Model List: https://api-docs.deepseek.com/api/list-models
- Release Notes: https://api-docs.deepseek.com/updates (check for version updates like V3.2-Exp)
**Note:** DeepSeek frequently releases new versions with significant pricing changes. Always check release notes first.
**Fallbacks if blocked:** Search "deepseek api latest pricing", "deepseek latest models", "deepseek models list" or search GitHub for latest model prices and context windows
**Important:**
- Review the full model list for additions, removals, and price changes
- Minimize whitespace/comment changes, focus on content
description: Update Gemini model definitions with latest pricing and capabilities
---
Update `src/modules/llms/server/gemini/gemini.models.ts` with latest model definitions.
Reference `src/modules/llms/server/llm.types.ts`, `src/modules/llms/server/llm.server.types.ts`, and `src/modules/llms/server/models.mappings.ts` for context only. Focus on the model file, do not descend into other code.
**Fallbacks if blocked:** Check Google AI JS SDK at https://github.com/googleapis/js-genai, search "gemini models latest pricing", "gemini latest models", or search GitHub for latest model prices and context windows
**Important:**
- Ignore context windows (auto-determined at runtime) and training cutoffs (not supported)
- Review the full model list for additions, removals, and price changes
- Minimize whitespace/comment changes, focus on content
- Preserve comments to make diffs easy to review, do NOT remove comments
description: Update Groq model definitions with latest pricing and capabilities
---
Update `src/modules/llms/server/openai/models/groq.models.ts` with latest model definitions.
Reference `src/modules/llms/server/llm.server.types.ts` and `src/modules/llms/server/models.mappings.ts` for context only. Focus on the model file, do not descend into other code.
**Primary Sources:**
- Models: https://console.groq.com/docs/models
- Pricing: https://groq.com/pricing/
**Fallbacks if blocked:** Search "groq models latest pricing", "groq latest models", "groq api models", or search GitHub for latest model prices and context windows
**Important:**
- Review the full model list for additions, removals, and price changes
- Minimize whitespace/comment changes, focus on content
description: Update Kimi model definitions with latest pricing and capabilities
---
Update `src/modules/llms/server/openai/models/moonshot.models.ts` with latest model definitions.
Reference `src/modules/llms/server/llm.server.types.ts` and `src/modules/llms/server/models.mappings.ts` for context only. Focus on the model file, do not descend into other code.
- API Reference: https://platform.moonshot.ai/docs/api/chat
**Fallbacks if blocked:** Search "moonshot kimi models latest pricing", "kimi k2 models", "moonshot api models", or search GitHub for latest model prices and context windows
**Important:**
- Review the full model list for additions, removals, and price changes
- Minimize whitespace/comment changes, focus on content
description: Update Mistral model definitions with latest pricing and capabilities
---
Update `src/modules/llms/server/openai/models/mistral.models.ts` with latest model definitions.
Reference `src/modules/llms/server/llm.server.types.ts` and `src/modules/llms/server/models.mappings.ts` for context only. Focus on the model file, do not descend into other code.
- Search "mistral [model-name] latest pricing", "mistral api latest pricing", "mistral latest models", or search GitHub for latest model prices and context windows
description: Update Ollama model definitions with latest featured models
---
Update `src/modules/llms/server/ollama/ollama.models.ts` with latest model definitions.
Reference `src/modules/llms/server/llm.server.types.ts` and `src/modules/llms/server/models.mappings.ts` for context only. Focus on the model file, do not descend into other code.
**Fallbacks if blocked:** Check https://github.com/ollama/ollama, search "ollama featured models", "ollama latest models", or search GitHub for latest model info
**Important:**
- Skip models below 50,000 pulls (parser does this automatically)
- Skip embedding models (parser does not do this automatically)
- Sort them in the EXACT same order as the source (featured models)
- Extract tags: 'tools' → hasTools, 'vision' → hasVision, 'embedding' → isEmbeddings (note the 's'), 'thinking' → tags only
- Extract 'b' tags (1.5b, 7b, 32b) to tags field
- Set today's date (YYYYMMDD format) for newly added models only
- Update OLLAMA_LAST_UPDATE constant to today's date
- Do NOT change dates of existing models
- Review the full model list for additions, removals, and changes
- Minimize whitespace/comment changes, focus on content
- Preserve comments and newlines to make diffs easy to review
description: Update OpenAI model definitions with latest pricing and capabilities
---
Update `src/modules/llms/server/openai/models/openai.models.ts` with latest model definitions.
Reference `src/modules/llms/server/llm.server.types.ts` and `src/modules/llms/server/models.mappings.ts` for context only. Focus on the model file, do not descend into other code.
**Manual hint:** For pricing page, expand all tables before copying content.
- Search "openai models latest pricing", "openai latest models" for third-party aggregators, or search GitHub for latest model prices and context windows
- OpenAI Node SDK (https://github.com/openai/openai-node) has limited model metadata only
- As last resort: Use Chrome DevTools MCP to navigate and extract from official docs
**Important:**
- Review the full model list for additions, removals, and price changes
- Minimize whitespace/comment changes, focus on content
description: Update OpenPipe model definitions with latest pricing and capabilities
---
Update `src/modules/llms/server/openai/models/openpipe.models.ts` with latest model definitions.
Reference `src/modules/llms/server/llm.server.types.ts` and `src/modules/llms/server/models.mappings.ts` for context only. Focus on the model file, do not descend into other code.
**Primary Sources:**
- Base Models: https://docs.openpipe.ai/base-models
**Fallbacks if blocked:** Search "openpipe models latest pricing", "openpipe latest models", "openpipe base models", or search GitHub for latest model prices and context windows
**Important:**
- Review the full model list for additions, removals, and price changes
- Minimize whitespace/comment changes, focus on content
description: Update Perplexity model definitions with latest pricing and capabilities
---
Update `src/modules/llms/server/openai/models/perplexity.models.ts` with latest model definitions.
Reference `src/modules/llms/server/llm.server.types.ts` and `src/modules/llms/server/models.mappings.ts` for context only. Focus on the model file, do not descend into other code.
**Fallbacks if blocked:** Search "perplexity api latest pricing", "perplexity latest models", or search GitHub for latest model prices and context windows
**Important:**
- Review the full model list for additions, removals, and price changes
- Minimize whitespace/comment changes, focus on content
description: Update xAI model definitions with latest pricing and capabilities
---
Update `src/modules/llms/server/openai/models/xai.models.ts` with latest model definitions.
Reference `src/modules/llms/server/llm.server.types.ts` and `src/modules/llms/server/models.mappings.ts` for context only. Focus on the model file, do not descend into other code.
- Search "xai grok latest pricing", "xai latest models", "xai api models", or search GitHub for latest model prices and context windows
- Random sites? https://the-rogue-marketing.github.io/grok-api-latest-llms-pricing-october-2025/ (find a newer version), https://langdb.ai/app/providers/xai/ (browse by model, limited coverage)
- As last resort: Use Chrome DevTools MCP to access docs.x.ai
**Important:**
- Review the full model list for additions, removals, and price changes
- Minimize whitespace/comment changes, focus on content
- [ ] Create a temporary tag `git tag v1.2.3 && git push opensource --tags`
- [ ] Create a [New Draft GitHub Release](https://github.com/enricoros/big-agi/releases/new), and generate the automated changelog (for new contributors)
- [ ] Update the release version in package.json, and `npm i`
- [ ] Copy the highlights to the [docs/changelog.md](/docs/changelog.md)
- Release:
@@ -31,7 +32,6 @@ assignees: enricoros
- [ ] verify deployment on Vercel
- [ ] verify container on GitHub Packages
- [ ] update the GitHub release
- [ ] push as stable `git push opensource main:main-stable`
- Announce:
- [ ] Discord announcement
- [ ] Twitter announcement
@@ -79,11 +79,32 @@ I need the following from you:
1. a table summarizing all the new features in 1.2.3 with the following columns: 4 words description (exactly what it is), short description, usefulness (what it does for the user), significance, link to the issue number (not the commit)), which will be used for the artifacts later
2. then double-check the git log to see if there are any features of significance that are not in the table
3. then score each feature in terms of importance for users (1-10), relative impact of the feature (1-10, where 10 applies to the broadest user base), and novelty and uniqueness (1-10, where 10 is truly unique and novel from what exists already)
3. then score each feature in terms of importance for users (1-10), relative impact of the feature (1-10, where 10 applies to the broadest user base), and novelty and uniqueness (1-10, where 10 is truly unique and novel from what exists already)
4. then improve the table, in decreasing order of importance for features, fixing any detail that's missing, in particular check if there are commits of significance from a user or developer point of view, which are not contained in the table
5. then I want you then to update the news.data.tsx for the new release
```
### release name
```markdown
please brainstorm 10 different names for this release. see the former names here: https://big-agi.com/blog
```
You can follow with 'What do you think of Modelmorphic?' or other selected name
### cover images
```markdown
Great, now I need to generate images for this. Before I used the following prompts (2 releases before).
// An image of a capybara sculpted entirely from black cotton candy, set against a minimalist backdrop with splashes of bright, contrasting sparkles. The capybara is using a computer with split screen made of origami, split keyboard and is wearing origami sunglasses with very different split reflections. Split halves are very contrasting. Close up photography, bokeh, white background.
import coverV113 from '../../../public/images/covers/release-cover-v1.13.0.png';
// An image of a capybara sculpted entirely from black cotton candy, set against a minimalist backdrop with splashes of bright, contrasting sparkles. The capybara is calling on a 3D origami old-school pink telephone and the camera is zooming on the telephone. Close up photography, bokeh, white background.
import coverV112 from '../../../public/images/covers/release-cover-v1.12.0.png';
What can I do now as far as images? Give me 4 prompt ideas with the same style as looks as the former, but different scene or action
org.opencontainers.image.description=Big-AGI Open - Multi-model AI workspace for experts who need to think broader, decide smarter, and build with confidence.
This file provides guidance to Claude Code (claude.ai/code) when working with code in this repository.
## Development Commands
```bash
# Targeted Code Quality (safe while dev server runs)
npx tsc --noEmit # Type check without building
npx eslint src/path/to/file.ts # Lint specific file
npm run lint # Lint entire project
```
## Architecture Overview
Big-AGI is a Next.js 15 application with a modular architecture built for advanced AI interactions. The codebase follows a three-layer structure with distinct separation of concerns.
[](https://big-agi.com)
[](https://github.com/enricoros/big-AGI/pkgs/container/big-agi)
[](https://vercel.com/new/clone?repository-url=https://github.com/enricoros/big-agi)
[](https://github.com/enricoros/big-agi/issues/new?template=ai-triage.yml)
[](https://vercel.com/new/clone?repository-url=https%3A%2F%2Fgithub.com%2Fenricoros%2Fbig-agi&env=OPENAI_API_KEY&envDescription=Backend%20API%20keys%2C%20optional%20and%20may%20be%20overridden%20by%20the%20UI.&envLink=https%3A%2F%2Fgithub.com%2Fenricoros%2Fbig-AGI%2Fblob%2Fmain%2Fdocs%2Fenvironment-variables.md&project-name=big-agi)
[//]: # ([](https://stats.uptimerobot.com/59MXcnmjrM))
[//]: # ([](https://x.com/enricoros))
**Intelligence**: with [Beam & Merge](https://big-agi.com/beam) for multi-model de-hallucination, native search, and bleeding-edge AI models like Opus 4.5, Nano Banana, Kimi K2 or GPT 5.1 -
**Control**: with personas, data ownership, requests inspection, unlimited usage with API keys, and *no vendor lock-in* -
and **Speed**: with a local-first, over-powered, zero-latency, madly optimized web app.
Choose Big-AGI because you don't need another clone or slop - you need an AI tool that scales with you.
### Show me a screenshot:
Sure - here is real-world screeengrab as I'm writing this, while running a Beam to extract SVG from an image with Sonnet 4.5, Opus 4.1, GPT 5.1, Gemini 2.5 Pro, Nano Banana, etc.
<img alt="Real-world screen capture as of Nov 15 2025, 2am" src="https://github.com/user-attachments/assets/853f4160-27cb-4ac9-826b-402f1e63d4af" />
\*: **Configuration requires your API keys**. *Big-AGI does not charge for model usage or limit your access*.
**Why Pro?** As an independent project, Pro subscriptions fund all development. Early subscribers shape the roadmap directly.
[](https://big-agi.com)
**Self-host and developers** (full control)
- Develop locally or self-host with Docker on your own infrastructure – [guide](docs/installation.md)
- Or fork & run on Vercel:
[](https://vercel.com/new/clone?repository-url=https%3A%2F%2Fgithub.com%2Fenricoros%2Fbig-AGI&env=OPENAI_API_KEY&envDescription=Backend%20API%20keys%2C%20optional%20and%20may%20be%20overridden%20by%20the%20UI.&envLink=https%3A%2F%2Fgithub.com%2Fenricoros%2Fbig-AGI%2Fblob%2Fmain%2Fdocs%2Fenvironment-variables.md&project-name=big-AGI)
[//]: # (**For the latest Big-AGI:**)
[//]: # (- [**Big-AGI Open**](https://github.com/enricoros/big-AGI/tree/main) - Open Source, latest models and features (main branch))
[//]: # (- [**Big-AGI Pro**](https://big-agi.com) - Hosted with Cloud Sync)
---
## Our Philosophy
We're an independent, non-VC-funded project with a simple belief: **AI should elevate you, not replace you**.
This is why we built Big-AGI to be **local-first**, madly optimized to 0-latency, launched multi-model first to
defeat hallucinations, designed Beam around the **humans in the loop**, re-wrote frameworks and abstractions
so you **are not vendor locked-in**, and obsessed over a powerful UI that works, just works.
NOTE: this is a powerful tool - if you need a toy UI or clone, this ain't it.
---
## Release Notes
👉 **[See the Live Release Notes](https://big-agi.com/changes)**
**For teams and institutions:** Need shared prompts, SSO, or managed deployments? Reach out at enrico@big-agi.com. We're actively collecting requirements from research groups and IT departments.
<details>
<summary>5,000 Commits Milestone</summary>
Hit 5k commits last week. That's a lot of code.
Recent work has been intense:
- Chain of thought reasoning across multiple LLMs: **OpenAI o3** and o1, **DeepSeek R1**, **Gemini 2.0 Flash Thinking**, and more
- Beam is real - ~35% of our users run it daily to compare models
- New AIX framework lets us scale features we couldn't before
- UI is faster than ever. Like, terminal-fast
The new architecture is solid and the speed improvements are real.
<summary>What's New in 1.16.1...1.16.10 · 2024-2025 (patch releases)</summary>
- 1.16.10: OpenRouter models support
- 1.16.9: Docker Gemini fix, R1 models support
- 1.16.8: OpenAI ChatGPT-4o Latest, o1 models support
- 1.16.7: OpenAI support for GPT-4o 2024-08-06
- 1.16.6: Groq support for Llama 3.1 models
- 1.16.5: GPT-4o Mini support
- 1.16.4: 8192 tokens support for Claude 3.5 Sonnet
- 1.16.3: Anthropic Claude 3.5 Sonnet model support
- 1.16.2: Improve web downloads, as text, markdown, or HTML
- 1.16.2: Proper support for Gemini models
- 1.16.2: Added the latest Mistral model
- 1.16.2: Tokenizer support for gpt-4o
- 1.16.2: Updates to Beam
- 1.16.1: Support for the new OpenAI GPT-4o 2024-05-13 model
</details>
<details>
<summary>What's New in 1.16.0 · May 9, 2024 · Crystal Clear</summary>
- [Beam](https://big-agi.com/blog/beam-multi-model-ai-reasoning) core and UX improvements based on user feedback
- Chat cost estimation 💰 (enable it in Labs / hover the token counter)
- Save/load chat files with Ctrl+S / Ctrl+O on desktop
- Major enhancements to the Auto-Diagrams tool
- YouTube Transcriber Persona for chatting with video content, [#500](https://github.com/enricoros/big-AGI/pull/500)
- Improved formula rendering (LaTeX), and dark-mode diagrams, [#508](https://github.com/enricoros/big-AGI/issues/508), [#520](https://github.com/enricoros/big-AGI/issues/520)
- Code soft-wrap, chat text selection toolbar, 3x faster on Apple silicon, and more [#517](https://github.com/enricoros/big-AGI/issues/517), [507](https://github.com/enricoros/big-AGI/pull/507)
</details>
<details>
<summary>3,000 Commits Milestone · April 7, 2024</summary>
- 🥇 Today we <b>celebrate commit 3000</b> in just over one year, and going stronger 🚀
- 📢️ Thanks everyone for your support and words of love for Big-AGI, we are committed to creating the best AI experiences for everyone.
</details>
<details>
<summary>What's New in 1.15.0 · April 1, 2024 · Beam</summary>
- ⚠️ [**Beam**: the multi-model AI chat](https://big-agi.com/blog/beam-multi-model-ai-reasoning). find better answers, faster - a game-changer for brainstorming, decision-making, and creativity. [#443](https://github.com/enricoros/big-AGI/issues/443)
- Managed Deployments **Auto-Configuration**: simplify the UI models setup with backend-set models. [#436](https://github.com/enricoros/big-AGI/issues/436)
- Message **Starring ⭐**: star important messages within chats, to attach them later. [#476](https://github.com/enricoros/big-AGI/issues/476)
- Enhanced the default Persona
- Fixes to Gemini models and SVGs, improvements to UI and icons
- 1.15.1: Support for Gemini Pro 1.5 and OpenAI Turbo models
- Beast release, over 430 commits, 10,000+ lines changed: [release notes](https://github.com/enricoros/big-AGI/releases/tag/v1.15.0), and changes [v1.14.1...v1.15.0](https://github.com/enricoros/big-AGI/compare/v1.14.1...v1.15.0)
</details>
<details>
<summary>What's New in 1.14.1 · March 7, 2024 · Modelmorphic</summary>
- **Anthropic** [Claude-3](https://www.anthropic.com/news/claude-3-family) model family support. [#443](https://github.com/enricoros/big-AGI/issues/443)
- New **[Perplexity](https://www.perplexity.ai/)** and **[Groq](https://groq.com/)** integration (thanks @Penagwin). [#407](https://github.com/enricoros/big-AGI/issues/407), [#427](https://github.com/enricoros/big-AGI/issues/427)
- **[LocalAI](https://localai.io/models/)** deep integration, including support for [model galleries](https://github.com/enricoros/big-AGI/issues/411)
- **Mistral** Large and Google **Gemini 1.5** support
- **Side-by-Side Split Windows**: multitask with parallel conversations. [#208](https://github.com/enricoros/big-AGI/issues/208)
- **Multi-Chat Mode**: message everyone, all at once. [#388](https://github.com/enricoros/big-AGI/issues/388)
- **Export tables as CSV** - big thanks to @aj47. [#392](https://github.com/enricoros/big-AGI/pull/392)
-**Adjustable Text Size**: enjoy denser chats. [#399](https://github.com/enricoros/big-AGI/issues/399)
- **Export tables as CSV**: big thanks to @aj47. [#392](https://github.com/enricoros/big-AGI/pull/392)
- Adjustable text size: customize density. [#399](https://github.com/enricoros/big-AGI/issues/399)
- Dev2 Persona Technology Preview
- Better looking chats with improved spacing, fonts, and menus
- More: new video player, [LM Studio tutorial](https://github.com/enricoros/big-AGI/blob/main/docs/config-lmstudio.md), [MongoDB support](https://github.com/enricoros/big-AGI/blob/main/docs/config-database.md) (thanks @ranfysvalle02), and speedups
- More: new video player, [LM Studio tutorial](https://github.com/enricoros/big-AGI/blob/main/docs/config-local-lmstudio.md) (thanks @aj47), [MongoDB support](https://github.com/enricoros/big-AGI/blob/main/docs/deploy-database.md) (thanks @ranfysvalle02), and speedups
### What's New in 1.12.0 · Jan 26, 2024 · AGI Hotline
</details>
<details>
<summary>What's New in 1.12.0 · Jan 26, 2024 · AGI Hotline</summary>
When you open an issue, our custom AI triage system (powered by [Claude Code](https://github.com/anthropics/claude-code-action) with Big-AGI architecture documentation) analyzes it, searches the codebase, and provides solutions - typically within 30 minutes. We've trained the system on our modules and subsystems so it handles most issues effectively. Your feedback drives development!
Clone this repo, install the dependencies (all locally), and run the development server (which auto-watches the
files for changes):
[](https://github.com/enricoros/big-agi/issues/new?template=ai-triage.yml)
[](https://github.com/users/enricoros/projects/4/views/4)
The _production_ build of the application is optimized for performance and is performed by the `npm run build` command,
after installing the required dependencies.
---
```bash
# .. repeat the steps above up to `npm install`, then:
npm run build
next start --port 3000
```
## License
The app will be running on the specified port, e.g. `http://localhost:3000`.
MIT License · [Third-Party Notices](src/modules/3rdparty/THIRD_PARTY_NOTICES.md)
Want to deploy with username/password? See the [Authentication](docs/deploy-authentication.md) guide.
## 🐳 Deploy with Docker
For more detailed information on deploying with Docker, please refer to the [docker deployment documentation](docs/deploy-docker.md).
Build and run:
```bash
docker build -t big-agi .
docker run -d -p 3000:3000 big-agi
```
Or run the official container:
- manually: `docker run -d -p 3000:3000 ghcr.io/enricoros/big-agi`
- or, with docker-compose: `docker-compose up` or see [the documentation](docs/deploy-docker.md) for a composer file with integrated browsing
## ☁️ Deploy on Cloudflare Pages
Please refer to the [Cloudflare deployment documentation](docs/deploy-cloudflare.md).
## 🚀 Deploy on Vercel
Create your GitHub fork, create a Vercel project over that fork, and deploy it. Or press the button below for convenience.
[](https://vercel.com/new/clone?repository-url=https%3A%2F%2Fgithub.com%2Fenricoros%2Fbig-agi&env=OPENAI_API_KEY&envDescription=Backend%20API%20keys%2C%20optional%20and%20may%20be%20overridden%20by%20the%20UI.&envLink=https%3A%2F%2Fgithub.com%2Fenricoros%2Fbig-AGI%2Fblob%2Fmain%2Fdocs%2Fenvironment-variables.md&project-name=big-agi)
## Integrations:
* Local models: Ollama, Oobabooga, LocalAi, etc.
* [ElevenLabs](https://elevenlabs.io/) Voice Synthesis (bring your own voice too) - Settings > Text To Speech
| | FC options structure | JSON Schema for input | Object with properties | JSON Schema for parameters |
| | FC Force invocation | Via `tool_choice` parameter | Via `toolConfig` parameter | Via `tool_choice` parameter |
| | FC Model invocation | Model generates a `tool_use` block with predicted parameters | Generates a `functionCall` part with predicted parameters | Generates a message.`tool_calls` item with predicted arguments |
| | FC Result injection | Client appends a `user` message with a `tool_result` content block | Client appends a `function` message with `functionResponse` part | Client sends a new `tool` message with `tool_call_id` and `content` |
| | Built-in Code execution | No | **Yes** | No |
| | Tool use with vision | Yes | Yes | Yes |
| **Generation Configuration** |
| | temperature | Yes | Yes | Yes |
| | max_tokens | Yes | Yes | Yes |
| | stop_sequences | Yes | Yes | Yes |
| | top_k | Yes | Yes | **No** |
| | top_p | Yes | Yes | Yes |
| | seed | No | No | **Yes** |
| | Multiple candidates | No | No | Yes (with 'n' parameter, breaks streaming?) |
- work in progress: [big-AGI open roadmap](https://github.com/users/enricoros/projects/4/views/2), [help here](https://github.com/users/enricoros/projects/4/views/4)
### What's New in 2 · Oct 31, 2025 · Open
## What's New in 1.13.0 · Feb 8, 2024 · Multi + Mind
- **Big-AGI Open** is ready and more productive and faster than ever, with:
- **Beam 2**: multi-modal, program-based, follow-ups, save presets
- Top-notch AI models support including **agentic models** and **reasoning models**
- **Image Generation** and editing with Nano Banana and gpt-image-1
- **Web Search** with citations for supported models
- **UI** & Mobile UI overhaul with peeking and side panels
- And all of the [Big-AGI 2 changes](https://github.com/enricoros/big-AGI/issues/567#issuecomment-2262187617) and more
- Built for the future, madly optimized
### What's New in 1.16.1...1.16.9 · Jan 21, 2025 (patch releases)
- 1.16.10: OpenRouter models support
- 1.16.9: Docker Gemini fix, R1 models support
- 1.16.8: OpenAI ChatGPT-4o Latest, o1 models support
- 1.16.7: OpenAI support for GPT-4o 2024-08-06
- 1.16.6: Groq support for Llama 3.1 models
- 1.16.5: GPT-4o Mini support
- 1.16.4: 8192 tokens support for Claude 3.5 Sonnet
- 1.16.3: Anthropic Claude 3.5 Sonnet model support
- 1.16.2: Improve web downloads, as text, markdown, or HTML
- 1.16.2: Proper support for Gemini models
- 1.16.2: Added the latest Mistral model
- 1.16.2: Tokenizer support for gpt-4o
- 1.16.2: Updates to Beam
- 1.16.1: Support for the new OpenAI GPT-4o 2024-05-13 model
### What's New in 1.16.0 · May 9, 2024 · Crystal Clear
- [Beam](https://big-agi.com/blog/beam-multi-model-ai-reasoning) core and UX improvements based on user feedback
- Chat cost estimation 💰 (enable it in Labs / hover the token counter)
- Save/load chat files with Ctrl+S / Ctrl+O on desktop
- Major enhancements to the Auto-Diagrams tool
- YouTube Transcriber Persona for chatting with video content, [#500](https://github.com/enricoros/big-AGI/pull/500)
- Improved formula rendering (LaTeX), and dark-mode diagrams, [#508](https://github.com/enricoros/big-AGI/issues/508), [#520](https://github.com/enricoros/big-AGI/issues/520)
- Code soft-wrap, chat text selection toolbar, 3x faster on Apple silicon, and more [#517](https://github.com/enricoros/big-AGI/issues/517), [507](https://github.com/enricoros/big-AGI/pull/507)
- Developers: update the LLMs data structures
### What's New in 1.15.1 · April 10, 2024 (minor release, models support)
- Support for the newly released Gemini Pro 1.5 models
- Support for the new OpenAI 2024-04-09 Turbo models
- Resilience fixes after the large success of 1.15.0
### What's New in 1.15.0 · April 1, 2024 · Beam
- ⚠️ [**Beam**: the multi-model AI chat](https://big-agi.com/blog/beam-multi-model-ai-reasoning). find better answers, faster - a game-changer for brainstorming, decision-making, and creativity. [#443](https://github.com/enricoros/big-AGI/issues/443)
- Managed Deployments **Auto-Configuration**: simplify the UI models setup with backend-set models. [#436](https://github.com/enricoros/big-AGI/issues/436)
- Message **Starring ⭐**: star important messages within chats, to attach them later. [#476](https://github.com/enricoros/big-AGI/issues/476)
- Enhanced the default Persona
- Fixes to Gemini models and SVGs, improvements to UI and icons
- Beast release, over 430 commits, 10,000+ lines changed: [release notes](https://github.com/enricoros/big-AGI/releases/tag/v1.15.0), and changes [v1.14.1...v1.15.0](https://github.com/enricoros/big-AGI/compare/v1.14.1...v1.15.0)
### What's New in 1.14.1 · March 7, 2024 · Modelmorphic
- **Anthropic** [Claude-3](https://www.anthropic.com/news/claude-3-family) model family support. [#443](https://github.com/enricoros/big-AGI/issues/443)
- New **[Perplexity](https://www.perplexity.ai/)** and **[Groq](https://groq.com/)** integration (thanks @Penagwin). [#407](https://github.com/enricoros/big-AGI/issues/407), [#427](https://github.com/enricoros/big-AGI/issues/427)
- **[LocalAI](https://localai.io/models/)** deep integration, including support for [model galleries](https://github.com/enricoros/big-AGI/issues/411)
- **Mistral** Large and Google **Gemini 1.5** support
- **Side-by-Side Split Windows**: multitask with parallel conversations. [#208](https://github.com/enricoros/big-AGI/issues/208)
- **Multi-Chat Mode**: message everyone, all at once. [#388](https://github.com/enricoros/big-AGI/issues/388)
- **Export tables as CSV** - big thanks to @aj47. [#392](https://github.com/enricoros/big-AGI/pull/392)
-**Adjustable Text Size**: enjoy denser chats. [#399](https://github.com/enricoros/big-AGI/issues/399)
- **Export tables as CSV**: big thanks to @aj47. [#392](https://github.com/enricoros/big-AGI/pull/392)
- Adjustable text size: customize density. [#399](https://github.com/enricoros/big-AGI/issues/399)
- Dev2 Persona Technology Preview
- Better looking chats with improved spacing, fonts, and menus
- More: new video player, [LM Studio tutorial](https://github.com/enricoros/big-AGI/blob/main/docs/config-lmstudio.md), [MongoDB support](https://github.com/enricoros/big-AGI/blob/main/docs/config-database.md) (thanks @ranfysvalle02), and speedups
- More: new video player, [LM Studio tutorial](https://github.com/enricoros/big-AGI/blob/main/docs/config-local-lmstudio.md) (thanks @aj47), [MongoDB support](https://github.com/enricoros/big-AGI/blob/main/docs/deploy-database.md) (thanks @ranfysvalle02), and speedups
## What's New in 1.12.0 · Jan 26, 2024 · AGI Hotline
### What's New in 1.12.0 · Jan 26, 2024 · AGI Hotline
- **Attachments System Overhaul**: Drag, paste, link, snap, text, images, PDFs and more. [#251](https://github.com/enricoros/big-agi/issues/251)
- **Desktop Webcam Capture**: Image capture now available as Labs feature. [#253](https://github.com/enricoros/big-agi/issues/253)
- **Independent Browsing**: Full browsing support with Browserless. [Learn More](https://github.com/enricoros/big-agi/blob/main/docs/config-browse.md)
- **Independent Browsing**: Full browsing support with Browserless. [Learn More](https://github.com/enricoros/big-agi/blob/main/docs/config-feature-browse.md)
- **Overheat LLMs**: Push the creativity with higher LLM temperatures. [#256](https://github.com/enricoros/big-agi/issues/256)
- **Model Options Shortcut**: Quick adjust with `Ctrl+Shift+O`
- Optimized Voice Input and Performance
- Latest Ollama and Oobabooga models
- Latest Ollama models
- For developers: **Password Protection**: HTTP Basic Auth. [Learn How](https://github.com/enricoros/big-agi/blob/main/docs/deploy-authentication.md)
### What's New in 1.6.0 - Nov 28, 2023 · Surf's Up
- **Web Browsing**: Download web pages within chats - [browsing guide](https://github.com/enricoros/big-agi/blob/main/docs/config-browse.md)
- **Web Browsing**: Download web pages within chats - [browsing guide](https://github.com/enricoros/big-agi/blob/main/docs/config-feature-browse.md)
- **Branching Discussions**: Create new conversations from any message
- **Keyboard Navigation**: Swift chat navigation with new shortcuts (e.g. ctrl+alt+left/right)
- **Performance Boost**: Faster rendering for a smoother experience
This enables Azure OpenAI for all users without requiring individual API keys. For more details, see [environment-variables.md](environment-variables.md).
## Azure OpenAI API Versions
Azure OpenAI supports both traditional deployment-based API and the next-generation v1 API:
### Next-Generation v1 API (Default)
- **Enabled by default** for GPT-5-like models (GPT-5, GPT-6, o3, o4, etc.)
- Uses direct `/openai/v1/responses` endpoint without deployment IDs
- Optimized for advanced reasoning models and new features
- Can be disabled by setting `AZURE_OPENAI_DISABLE_V1=true`
- Fill in the required fields (including the subscription ID) and click on **Apply**
Once your application is accepted, you can create OpenAI resources on Azure.
### Step 3: Create Azure OpenAI Resource
### Step 2: Create Azure OpenAI Resource
For more information, see [Azure: Create and deploy OpenAI](https://learn.microsoft.com/en-us/azure/ai-services/openai/how-to/create-resource?pivots=web-portal)
@@ -52,31 +74,32 @@ For more information, see [Azure: Create and deploy OpenAI](https://learn.micros

- Select the subscription
- Select a resource group or create a new one
- Select the region. Note that the region determines the available models.
> For instance, **Canada East** offers GPT-4-32k models, For the full list, see [GPT-4 models](https://learn.microsoft.com/en-us/azure/ai-services/openai/concepts/models)
- Select the region. **Important**: The region determines which models are available.
> Popular regions like **East US**, **West Europe**, and **Australia East** typically have the best model availability. For the latest model availability by region, see [Azure OpenAI Model Availability](https://learn.microsoft.com/en-us/azure/ai-services/openai/concepts/models)
- Name the service (e.g., `your-openai-api-1234`)
- Select a pricing tier (e.g., `S0` for standard)
- Select: "All networks, including the internet, can access this resource."
- Click on **Review + create** and then **Create**
After creating the resource, you can access the API Keys and Endpoints. At any point, you can go to
the OpenAI Service instance page to get this information.
After creating the resource, you can access the API Keys and Endpoints:
- Click on **Go to resource**
- Click on **Develop**
- Copy the `Endpoint`, called "Language API", e.g. 'https://your-openai-api-1234.openai.azure.com/'
-Copy `KEY 1`
1. Click on **Go to resource** (or navigate to your Azure OpenAI resource)
2. In the left sidebar, under **Resource Management**, click on **Keys and Endpoint**
Follow the guide at: https://localai.io/basics/getting_started/
For instance with [Use luna-ai-llama2 with docker compose](https://localai.io/basics/getting_started/#example-use-luna-ai-llama2-model-with-docker-compose):
- verify it works by browsing to [http://localhost:8080/v1/models](http://localhost:8080/v1/models)
(or the IP:Port of the machine, if running remotely) and seeing listed the model(s) you downloaded
listed in the JSON response.
- clone LocalAI
- get the model
- copy the prompt template
- start docker
- -> the server will be listening on `localhost:8080`
- verify it works by going to [http://localhost:8080/v1/models](http://localhost:8080/v1/models) on
your browser and seeing listed the model you downloaded
### Integrating LocalAI with big-AGI
### Integration: chat with LocalAI
- Go to Models > Add a model source of type: **LocalAI**
- Enter the address: `http://localhost:8080` (default)
- If running remotely, replace localhost with the IP of the machine. Make sure to use the **IP:Port** format
- Load the models
-Select model & Chat
- Enter the default address: `http://localhost:8080`, or the address of your localAI cloud instance
a specialized interface that includes a custom variant of the OpenAI API for a smooth integration process.
_Last updated on Dec 7, 2023_
### Components
The implementation of local LLMs involves the following components:
* **text-generation-webui**: A Python application with a Gradio web UI for operating Large Language Models.
* **Local Large Language Models "LLMs"**: Use large language models on your personal computer with consumer-grade GPUs or CPUs.
* **big-AGI**: An LLM UI that offers features such as Personas, OCR, Voice Support, Code Execution, AGI functions, and more.
## Instructions
This guide assumes that **big-AGI** is already installed on your system. Note that the text-generation-webui IP address must be accessible from the server running **big-AGI**.
# Customizing and Creating Derivative Applications
This document outlines how to develop applications derived from big-AGI.
## Manual Customization
Application customization _requires manual code modifications or the use of environment variables_. Currently, **there is no admin panel to "managed" deployment customization** for enterprise use cases.
| - Persona changes<br>- UI theme customization<br>- Feature additions or modifications | - Setting API keys in [environment variables](environment-variables.md)<br>- Toggling features with environment variables |
| Apply these to the source code before building the application | Set these post-build on local machines or cloud deployment, before application launch |
<br/>
## Code Alterations
Start by creating a fork of the [big-AGI repository](https://github.com/enricoros/big-AGI) on GitHub for a personal development space.
Understand the Architecture: big-AGI uses Next.js, React for the front end, and Node.js (Next.js edge functions) for the back end.
### Add Authentication
This necessitates a code change (file renaming) before build initiation, detailed in [deploy-authentication.md](deploy-authentication.md).
### Increase Vercel Functions Timeout
For long-running operations, Vercel allows paid deployments to increase the timeout on Functions.
Note that this applies to old-style Vercel Functions (based on Node.js) and not the new Edge Functions.
At time of writing, big-AGI has only 2 operations that run on Node.js Functions:
browsing (fetching web pages) and sharing. They both can exceed 10 seconds, especially
when fetching large pages or waiting for websites to be completed.
From the Vercel Project > Settings > General > Build & Development Settings,
you can for instance set the build command to:
```bash
next build
```
### Change the Personas (v1.x only)
Edit the `src/data.ts` file to customize personas. This file houses the default personas. You can add, remove, or modify these to meet your project's needs.
- [ ] Modify `src/data.ts` to alter default personas
### Change the UI
Adapt the UI to match your project's aesthetic, incorporate new features, or exclude unnecessary ones.
- [ ] Modify `src/common/app.config.tsx` to alter the application's name
- [ ] Update `src/common/app.nav.tsx` to revise the navigation bar
### Add a Message of the Day
You can display a temporary announcement banner at the top of the app using the `NEXT_PUBLIC_MOTD` environment variable.
- Set this variable in your deployment environment
- The message supports template variables:
-`{{app_build_hash}}`: Current git commit hash
-`{{app_build_pkgver}}`: Package version
-`{{app_build_time}}`: Build timestamp as date
-`{{app_deployment_type}}`: Deployment type (local, docker, vercel, etc.)
- Users can dismiss the message (until next page refresh)
- Use it for version announcements, maintenance notices, or feature highlights
Example: `NEXT_PUBLIC_MOTD=🚀 New features available in {{app_build_pkgver}}! Try the improved Beam.`
## Testing & Deployment
Test your application thoroughly using local development (refer to README.md for local build instructions). Deploy using your preferred hosting service. big-AGI supports deployment on platforms like Vercel, Docker, or any Node.js-compatible service, especially those supporting NextJS's "Edge Runtime."
- [deploy-cloudflare.md](deploy-cloudflare.md): for Cloudflare Workers deployment
- [deploy-docker.md](deploy-docker.md): for Docker deployment instructions and examples
- [deploy-k8s.md](deploy-k8s.md): for Kubernetes deployment instructions and examples
## Debugging
The application includes a client-side logging system. You can view recent logs via the UI (Settings > Tools > Logs).
For deeper debugging during development:
1.**Debug Page**: Access the `/info/debug` page for an overview of the application's environment, configuration, API status, and environment variables available to the client.
2.**Conditional Breakpoints**: To automatically pause execution in your browser's developer tools when critical errors (`error`, `critical`, `DEV` levels) are logged to the console, set the following environment variable in your local `.env.local` file and restart your development server:
```bash
NEXT_PUBLIC_DEBUG_BREAKS=true
```
This allows you to inspect the application state at the exact moment an important error occurs. This feature only works in development mode (`npm run dev`) and requires the environment variable to be explicitly set to `true`.
<br/>
## Community Projects - Share Your Project
After deployment, share your project with the community. We will link to your project to help others discover and learn from your work.
| 🚀 CoolAGI: Where AI meets Imagination<br/> | Code Interpreter, Vision, Mind maps, Web Searches, Advanced Data Analytics, Large Data Handling and more! | [nextgen-user/CoolAGI](https://github.com/nextgen-user/CoolAGI) |
For public projects, update your README.md with your modifications and submit a pull request to add your project to our list, aiding in its discovery.
<br/>
## Best Practices
- **Stay Updated**: Frequently merge updates from the main big-AGI repository to incorporate bug fixes and new features.
- **Keep It Open Source**: Consider maintaining your derivative as open source to foster community contributions.
- **Engage with the Community**: Leverage platforms like GitHub, Discord, or Reddit for feedback, collaboration, and project promotion.
Developing a derivative application is an opportunity to explore new possibilities with AI and share your innovations with the global community. We look forward to seeing your contributions.
- What: page views, page leave events, user interactions, and deployment context
PostHog provides comprehensive product analytics with privacy controls. It helps understand how users interact with big-AGI's features, identify opportunities for improvement, and optimize the user experience.
To enable PostHog, set the `NEXT_PUBLIC_POSTHOG_KEY` environment variable at build time. PostHog is configured with tracking optimization and privacy in mind:
- Uses a proxy endpoint (`/a/ph`) to avoid ad blockers
- Respects user opt-out preferences via local storage
- Tracks only essential information without PII
- Adds deployment context for better segmentation
The implementation follows PostHog's best practices for Next.js applications and includes manual page view tracking for proper single-page application support.
### Vercel Analytics
- Why: understand coarse traction, and identify deployment issues - all without tracking individual users
- What: top pages, top referrers, country of origin, operating system, browser, and page speed metrics
Vercel Analytics and Speed Insights are local API endpoints deployed to your domain, so everything stays within your
domain. Furthermore, the Vercel Analytics service is privacy-friendly, and does not track individual users.
This service is avaialble to system administrators when deploying to Vercel. It is automatically enabled when deploying to Vercel.
The code that activates Vercel Analytics is located in the `src/pages/_app.tsx` file:
| Your **Source** builds of big-AGI | None | **Google Analytics**: set environment variable at build time · **PostHog**: set environment variable at build time · **Vercel**: enable Vercel Analytics from the dashboard |
| Your **Docker** builds of big-AGI | None | (**Vercel**: n/a) · **Google Analytics**: set environment variable at `docker build` time · **PostHog**: set environment variable at `docker build` time. |
| [get.big-agi.com](https://get.big-agi.com) (**Big-AGI 1.x Legacy**) | Vercel + Google + PostHog | The main website ([privacy policy](https://big-agi.com/privacy)) hosted for free for anyone. |
| [prebuilt Docker packages](https://github.com/enricoros/big-AGI/pkgs/container/big-agi) (**Big-AGI 1.x**, 'latest' tag) | Google Analytics | **Vercel**: n/a · **Google Analytics**: set to the big-agi.com Google Analytics for analytics and improvements · **PostHog**: n/a |
Note: this information is updated as of March 3, 2025 and can change at any time.
- **Highly Recommended:** More than a database, it's a data platform. MongoDB Atlas is a robust cloud-based platform that offers scalability, security, and a suite of developer tools. No need for a separate vector database, you can query your vector embeddings right within your operational database!
- **Additional Features:** MongoDB Atlas is packed with unique features designed to streamline the development process such as: Atlas App Services, Atlas search (with vector search), Atlas charts, Data Federation, and more.
- **Highly Recommended:** More than a database, it's a data platform. MongoDB Atlas is a robust cloud-based platform that offers scalability, security, and a suite of developer tools. No need for a separate vector database, you can query your vector embeddings right within your operational database!
- **Additional Features:** MongoDB Atlas is packed with unique features designed to streamline the development process such as: Atlas App Services, Atlas search (with vector search), Atlas charts, Data Federation, and more.
- **Connection String:** Replace placeholders with your Atlas credentials.
| `OPENAI_API_KEY` | API key for OpenAI | Recommended |
| `OPENAI_API_HOST` | Changes the backend host for the OpenAI vendor, to enable platforms such as Helicone and CloudFlare AI Gateway | Optional |
| `OPENAI_API_ORG_ID` | Sets the "OpenAI-Organization" header field to support organization users | Optional |
| `AZURE_OPENAI_API_ENDPOINT` | Azure OpenAI endpoint - host only, without the path| Optional, but if set `AZURE_OPENAI_API_KEY` must also be set |
| `AZURE_OPENAI_API_KEY`| Azure OpenAI API key, see [config-azure-openai.md](config-azure-openai.md) | Optional, but if set `AZURE_OPENAI_API_ENDPOINT` must also be set |
| `ANTHROPIC_API_KEY` | The API key for Anthropic | Optional |
| `ANTHROPIC_API_HOST` | Changes the backend host for the Anthropic vendor, to enable platforms such as [config-aws-bedrock.md](config-aws-bedrock.md) | Optional |
| `GEMINI_API_KEY` | The API key for Google AI's Gemini | Optional |
| `MISTRAL_API_KEY` | The API key for Mistral | Optional |
| `OLLAMA_API_HOST` | Changes the backend host for the Ollama vendor. See [config-ollama.md](config-ollama.md) | |
| `OPENROUTER_API_KEY` | The API key for OpenRouter | Optional |
| `TOGETHERAI_API_KEY` | The API key for Together AI | Optional |
| `OPENAI_API_KEY` | API key for OpenAI | Recommended |
| `OPENAI_API_HOST` | Changes the backend host for the OpenAI vendor, to enable platforms such as Helicone and CloudFlare AI Gateway | Optional |
| `OPENAI_API_ORG_ID` | Sets the "OpenAI-Organization" header field to support organization users | Optional |
| `ALIBABA_API_HOST` | The Alibaba AI OpenAI-compatible endpoint | Optional |
| `ALIBABA_API_KEY` | The API key for Alibaba AI | Optional |
| `AZURE_OPENAI_API_ENDPOINT` | Azure OpenAI endpoint - host only, without the path | Optional, but if set `AZURE_OPENAI_API_KEY` must also be set |
| `AZURE_OPENAI_API_KEY`| Azure OpenAI API key, see [config-azure-openai.md](config-azure-openai.md) | Optional, but if set `AZURE_OPENAI_API_ENDPOINT` must also be set |
| `AZURE_OPENAI_DISABLE_V1` | Disables the next-generation v1 API for GPT-5-like models (set to 'true' to disable) | Optional, defaults to enabled |
| `AZURE_OPENAI_API_VERSION` | API version for traditional deployment-based endpoints | Optional, defaults to '2025-04-01-preview' |
| `AZURE_DEPLOYMENTS_API_VERSION` | API version for the deployments listing endpoint | Optional, defaults to '2023-03-15-preview' |
| `ANTHROPIC_API_KEY` | The API key for Anthropic | Optional |
| `ANTHROPIC_API_HOST` | Changes the backend host for the Anthropic vendor, to enable platforms such as AWS Bedrock | Optional |
| `DEEPSEEK_API_KEY` | The API key for Deepseek AI | Optional |
| `GEMINI_API_KEY` | The API key for Google AI's Gemini | Optional |
| `GROQ_API_KEY` | The API key for Groq Cloud | Optional |
| `LOCALAI_API_HOST` | Sets the URL of the LocalAI server, or defaults to http://127.0.0.1:8080 | Optional |
| `LOCALAI_API_KEY` | The (Optional) API key for LocalAI | Optional |
| `MISTRAL_API_KEY` | The API key for Mistral | Optional |
| `MOONSHOT_API_KEY` | The API key for Moonshot AI | Optional |
| `OLLAMA_API_HOST` | Changes the backend host for the Ollama vendor. See [config-local-ollama.md](config-local-ollama.md) | |
| `OPENPIPE_API_KEY` | The API key for OpenPipe | Optional |
| `OPENROUTER_API_KEY` | The API key for OpenRouter | Optional |
| `PERPLEXITY_API_KEY` | The API key for Perplexity | Optional |
| `TOGETHERAI_API_KEY` | The API key for Together AI | Optional |
| `XAI_API_KEY` | The API key for xAI | Optional |
### Model Observability: Helicone
### LLM Observability: Helicone
Helicone provides observability to your LLM calls. It is a paid service, with a generous free tier.
It is currently supported for:
@@ -96,7 +126,7 @@ It is currently supported for:
|--------------------|--------------------------|
| `HELICONE_API_KEY` | The API key for Helicone |
### Specials
### Features
Enable the app to Talk, Draw, and Google things up.
@@ -109,13 +139,28 @@ Enable the app to Talk, Draw, and Google things up.
| `NEXT_PUBLIC_DEBUG_BREAKS` | (optional, development) When set to 'true', enables automatic debugger breaks on DEV/error/critical logs in development builds |
| `NEXT_PUBLIC_MOTD` | Message of the Day - displays a dismissible banner at the top of the app (see [customizations](customizations.md) for the template variables). Example: 🔔 Welcome to our deployment! Version {{app_build_pkgver}} built on {{app_build_time}}. |
| `NEXT_PUBLIC_GA4_MEASUREMENT_ID` | (optional) The measurement ID for Google Analytics 4. (see [deploy-analytics](deploy-analytics.md)) |
| `NEXT_PUBLIC_POSTHOG_KEY` | (optional) Key for PostHog analytics. (see [deploy-analytics](deploy-analytics.md)) |
| `NEXT_PUBLIC_PLANTUML_SERVER_URL` | The URL of the PlantUML server, used for rendering UML diagrams. Allows using custom local servers. |
> Important: these variables must be set at build time, which is required by Next.js to pass them to the frontend.
> This is in contrast to the backend variables, which can be set when starting the local server/container.
---
For a higher level overview of backend code and environment customization,
see the [big-AGI Customization](customizations.md) guide.
> 🚨 This file is not meant for publication, and it's just been created as a handbook with tips
> and tricks to make Big-AGI more efficient and productive. 🚨
Welcome to the advanced tips and tricks guide for Big-AGI. This document will help you make the most of the platform's existing features.
---
## Hidden Gems
- **Shift + Double-Click** on a chat message to **edit** it.
- **Shift + Trash Icon** to **delete** a chats and messages without confirmation.
- also applies elsewhere: delete Attachments, etc.
- **Shift + Click** on **New Chat** to create an incognito chat.
- Drag a big-AGI saved chat into Big-AGI to load (or attach) it.
## Not-so-obvious Shortcuts
- When sending a message:
- Enter is for newlines
- **Shift + Enter** to send the message.
- **Ctrl + Enter** to **Beam** the message.
- **Alt/Option + Enter** to send the message without an answer.
- When editing a message:
- **Ctrl + Enter** to **Save** the changes.
- **Shift + Ctrl + Enter** to **Save & Regenerate**.
- Scroll between messages:
- **Ctrl + Up/Down** to scroll between **messages** and/or **Beams**.
## Worth the Effort:
- [LiveFile](help-feature-livefile.md) works on **Chrome**: Pair and synchronize your documents and code blocks with files on your local system: refresh, save, update them.
## Best User Hacks:
-
---
Note: this document is just at the beginning. It's here so we can capture
Big-AGI is a **client-first** web application, which means it prioritizes speed and data ownership compared to cloud apps.
Your *API keys*, *chat history*, and *settings* live in your
browser's [local storage](https://developer.mozilla.org/en-US/docs/Web/API/Window/localStorage), not
on cloud servers.
You can use Big-AGI in two ways:
1. Run it yourself (open-source)
2. Use big-agi.com (hosted service)
This guide explains how the open-source version handles your data. You can verify everything in [the source code](https://github.com/enricoros/big-agi).
## Client-Side Storage
Within Big-AGI almost all chat/keys data is handled client-side in your browser using two
standard browser storage mechanisms:
- **Local Storage**: API keys, settings, and configurations ([learn more](https://developer.mozilla.org/en-US/docs/Web/API/Window/localStorage))
- **IndexedDB**: Chat history and larger files ([learn more](https://developer.mozilla.org/en-US/docs/Web/API/IndexedDB_API))
The Big-AGI backend mainly passes requests to AI services (OpenAI, Anthropic, etc.). It doesn't store your data, except for the chat-sharing function if used.
You can see your data in your browser's local storage and IndexedDB - try it yourself:
1. In Chrome: Open DevTools (press F12 on Windows, ⌘ + ⌥ + I on Mac)
2. Click 'Application' > 'Local Storage'
3. See your settings and API keys

### What This Means For You
Storing data in your browser means:
- Your data stays on **one device/browser only**
- Clearing browser data **erases your chats** - make backups
- Anyone using your browser can see your chats and keys
- Running your own server needs technical skills
### Local Device Identifier
Big-AGI generates a _device identifier_ that combines timestamp and random components, stored only on your device. This identifier:
- Is used only for the **optional sync functionality** between your devices (not yet ready)
- Helps maintain data consistency when using Big-AGI across multiple devices
- Remains completely local unless you explicitly enable sync
- Is not used for tracking, analytics, or telemetry
- Can be deleted anytime by clearing local storage
- Is fully transparent - see the implementation in `src/common/stores/store-client.ts`
## How Data Flows
AI interactions in Big-AGI, such as chats, AI titles, text to speech, browsing, flow through three components:
1.**Browser** (client/installed App) - Stores your keys & data locally
2.**Backend** (routing server) - Passes requests to AI services
3.**AI Services** - Where the actual AI processing happens
### Self-Deployed Version: Your Infrastructure
You run the server. Your data only leaves when making AI requests.
The keys and chats are under your control and pass through your code, and are sent to
# LiveFile: Synchronize Your Documents with Local Files
## Introduction
**LiveFile** is a powerful feature in big-AGI that allows you to **pair and synchronize
your documents and code blocks** with files on your local system.
This feature enables a **two-way connection between big-AGI and your local files on disk**,
saving you time and effort.
With LiveFile, you can:
- **Pair** documents and code blocks with local files.
- **Monitor** changes in local files and update content in big-AGI.
- **Refresh** chat attachments with the latest content.
- **Save** edits made in big-AGI back to your local files.
- **Store** AI-generated code and content.
---
## Requirements
- **Supported Browsers:**
- **Google Chrome** (desktop)
- **Microsoft Edge** (desktop)
- **Operating Systems:**
- **Desktop platforms only**
- **Note:** Mobile devices (iOS and Android) are **not supported** due to browser limitations.
- **File Types:**
- Designed for **text-based files** (e.g., `.txt`, `.md`, `.js`, `.py`).
- **Performance:**
- Can handle **dozens of files efficiently**.
- **Limitations:**
- **File Size Limit**:
- Supports text files up to **10 MB**.
- **Pairing Persistence:**
- LiveFile connections **do not persist across sessions**.
- After reloading the page, you will need to re-pair your files.
- **Saving Overwrites:**
- Saving changes in big-AGI will **overwrite the entire file**.
- Use external tools for version control or incremental backups.
---
## Enabling LiveFile
LiveFile can be enabled automatically or manually in your Big-AGI workflow.
### Automatic Pairing
When you:
- **Attach**, **drop**, or **paste** a file into a chat message,
LiveFile is **automatically enabled** for that attachment. This means you can start
monitoring and reloading changes without any additional setup.
### Manual Pairing
For existing attachments or code blocks that:
- **Do not have LiveFile enabled** (e.g., created on other devices),
- **Are AI-generated code snippets without an associated file**,
You can manually pair them with a local file.
#### Pairing Attachments
1. **Select the Attachment:**
- Click on the attachment in the chat to view it in the previewer.
2. **Initiate Pairing:**
- Click on **"Pair File"** (🔗).
- If you have open LiveFiles, they will be listed for easy selection.
- Alternatively, you can select a new file from your local system.
3. **Grant Permissions**
- When prompted, allow big-AGI to access the file.
#### Pairing Code Blocks
1. **Access Code Block Options:**
- Click on the code block to reveal the header with options.
2. **Initiate Pairing:**
- Click the **"Pair File"** button (🔗).
- Select from your open LiveFiles or choose a new file.
3. **Confirm Pairing:**
- Grant permission when prompted.
---
## Using LiveFile
### Monitoring Changes
- **Automatic Monitoring:**
- LiveFile watches for changes in your paired local files.
- If the file is modified outside of big-AGI, you'll be shown the changes in the LiveFile bar.
- There is also a **"Replace with File"** option to manually load the latest content and see the changes.
- **Refreshing Content:**
- Click **"Replace with File"** (🔄) to load the latest content from the paired file into big-AGI.
### Saving Edits Back to Paired Files
- **Editing Attachments or Code Blocks:**
- Modify the content directly within big-AGI.
- Attachments: Click on the attachment to open the previewer and click on "Edit" to make changes.
- Code Blocks: Select "Edit" on the chat message to update code blocks.
- **Saving Changes:**
- Click **"Save to File"** (💾) to overwrite the local file with your changes.
- **Note:** This action overwrites the entire file. Ensure this is what you want before proceeding.
---
## Best Practices
- **Monitor External Changes:**
- Refresh content in big-AGI if the local file has been modified outside the application.
- **Use a Version Control System:**
- For critical files, consider using Git or other version control systems to track and monitor changes, authorship, and history.
---
## Troubleshooting
- **LiveFile Options Not Visible:**
- Ensure you are using a **supported desktop browser**.
- Check that you have the latest version of big-AGI.
- **Permission Issues:**
- Confirm that you granted big-AGI permission to access your files.
- Check your browser's settings to ensure file access is allowed.
---
## Technical Details
LiveFile uses the [File System Access API](https://developer.mozilla.org/en-US/docs/Web/API/File_System_Access_API) to
interact with your local files securely. It leverages the [browser-fs-access](https://github.com/GoogleChromeLabs/browser-fs-access) library,
an open-source project by Google Chrome Labs, which provides an easy interface to the File System Access API with fallbacks for broader browser support.
- **Security:**
- Access to files requires explicit user permission.
- **Performance:**
- Designed to handle dozens of files efficiently (tested on hundreds).
- Works with the Big-AGI attachment system to recursively add directories.
- **Browser Support:**
- Fully supported on **Google Chrome** and **Microsoft Edge** desktop versions.
---
## Another Big-AGI First!
You can significantly boost your productivity and streamline your workflow within big-AGI
by understanding how to utilize LiveFile's features fully.
This Feature is in Beta as there are a few limitations and improvements to be made.
Join us in enjoying and enhancing this feature on [big-AGI.com](https://big-agi.com), or
[GitHub](https://github.com/enricoros/big-AGI) for support and [Discord](https://discord.gg/MkH4qj2Jp9)
# Enabling Microphone Access for Speech Recognition
This guide explains how to enable microphone access for speech recognition in various browsers and mobile devices.
Ensuring microphone access is essential for using voice features in applications like big-AGI.
## Desktop Browsers
### Google Chrome (All Platforms, recommended)
1. Open the website (e.g., big-AGI) in Chrome.
2. Click the **lock icon** in the address bar.
3. In the dropdown, find **"Microphone"**.
- Set it to **"Allow"**.
4. If "Microphone" isn't listed:
- Click on **"Site settings"**.
- Find **"Microphone"** in the permissions list.
- Change the setting to **"Allow"**.
5. **Refresh** the page.
### Safari (macOS)
**[Watch the video tutorial: How to enable Speech Recognition in Safari](https://vimeo.com/1010342201)**
If you're seeing a "Speech Recognition permission denied" error, follow these steps:
1. Open **System Settings**.
- Go to **Privacy & Security** > **Speech Recognition**.
- Enable Safari in the list of allowed applications.
- Quit & Open Safari.
2. Click **Safari** in the top menu bar.
- Select **Settings**.
- Go to the **Websites** tab.
- Select **Microphone** from the sidebar.
- Find big-AGI (or localhost for developers) in the list and set it to **Allow**.
- Close the Settings window.
3. **Refresh** the page.
This quick and simple fix should get essential voice input working in big-AGI on your Mac.
### Microsoft Edge (Windows)
1. Open the website in Edge.
2. Click the **lock icon** in the address bar.
3. Click **"Permissions for this site"**.
4. Find **"Microphone"**.
- Set it to **"Allow"**.
5. **Refresh** the page.
### Firefox (All Platforms)
> **Note:** The Speech Recognition API is **not supported** in Firefox. If you're using Firefox, please switch to a supported browser to use speech recognition
> features.
## Mobile Devices
### Android (Chrome)
1. Open the website in Chrome.
2. Tap the **lock icon** in the address bar.
3. Tap **"Permissions"**.
4. Find **"Microphone"**.
- Set it to **"Allow"**.
5. **Refresh** the page.
### iOS (Safari)
1. Open the **Settings** app on your device.
2. Scroll down and tap **"Safari"**.
3. Tap **"Microphone"**.
4. Ensure **"Ask"** or **"Allow"** is selected.
5. Return to Safari and open the website.
6. If prompted, allow microphone access.
7. **Refresh** the page.
### iOS (Chrome)
> **Note:** Chrome on iOS uses Safari's engine due to system limitations. Microphone permissions are managed through iOS settings.
1. Open the **Settings** app.
2. Scroll down and tap **"Chrome"**.
3. Ensure **"Microphone"** is toggled **on**.
4. Open Chrome and navigate to the website.
5. If prompted, allow microphone access.
6. **Refresh** the page.
## Troubleshooting
If you're still experiencing issues after enabling microphone access:
**Check System Permissions (macOS):**
- Open **System Settings**.
- Go to **"Privacy & Security"**.
- Select the **"Privacy"** tab.
- Click **"Microphone"** in the sidebar.
- Ensure your browser (e.g., Chrome, Safari) is checked.
- You may need to unlock the settings by clicking the lock icon at the bottom.
**Check Microphone Access (Windows):**
- Open **Settings**.
- Go to **"Privacy"** > **"Microphone"**.
- Ensure **"Allow apps to access your microphone"** is **on**.
- Scroll down and make sure your browser is allowed.
**Close Other Applications:**
- Close any applications that might be using the microphone.
**Restart the Browser:**
- Close all browser windows and reopen.
**Update Your Browser:**
- Ensure you're using the latest version.
**Check for Browser Extensions:**
- Disable extensions that might block access to the microphone.
For persistent issues, consult your browser's official support resources or contact big-AGI support.
## Technical Details
Big-AGI uses the [Web Speech API (SpeechRecognition)](https://developer.mozilla.org/en-US/docs/Web/API/SpeechRecognition)
to transcribe spoken words into text. This API provides real-time transcription with live previews and works on most
Your big-AGI instance is now running at `http://localhost:3000`.
### Local Production build
The production build is optimized for performance and follows
the same steps 1 and 2 as for [local development](#local-development).
3. Build the production version:
```bash
# .. repeat the steps above up to `npm install`, then:
npm run build
```
4. Start the production server (`npx` may be optional):
```bash
npx next start --port 3000
```
Your big-AGI production instance is on `http://localhost:3000`.
### Advanced Customization
Want to pre-enable models, customize the interface, or deploy with username/password or alter code to your needs?
Check out the [Customizations Guide](README.md) for detailed instructions.
## ☁️ Cloud Deployment Options
To deploy big-AGI on a public server, you have several options. Choose the one that best fits your needs.
### Deploy on Vercel
Install big-AGI on Vercel with just a few clicks.
Create your GitHub fork, create a Vercel project over that fork, and deploy it. Or press the button below for convenience.
[](https://vercel.com/new/clone?repository-url=https%3A%2F%2Fgithub.com%2Fenricoros%2Fbig-AGI&env=OPENAI_API_KEY&envDescription=Backend%20API%20keys%2C%20optional%20and%20may%20be%20overridden%20by%20the%20UI.&envLink=https%3A%2F%2Fgithub.com%2Fenricoros%2Fbig-AGI%2Fblob%2Fmain%2Fdocs%2Fenvironment-variables.md&project-name=big-AGI)
### Deploy on Cloudflare
Deploy on Cloudflare's global network by installing big-AGI on
Cloudflare Pages. Check out the [Cloudflare Installation Guide](deploy-cloudflare.md)
for step-by-step instructions.
### Docker Deployments
Containerize your big-AGI installation using Docker for portability and scalability.
Our [Docker Deployment Guide](deploy-docker.md) will walk you through the process,
or follow the steps below for a quick start.
1. (optional) Build the Docker image - if you do not want to use the [pre-built Docker images](https://github.com/enricoros/big-AGI/pkgs/container/big-agi):
```bash
docker build -t big-agi .
```
2. Run the Docker container with either:
```bash
# 2A. if you built the image yourself:
docker run -d -p 3000:3000 big-agi
# 2B. or use the pre-built image:
docker run -d -p 3000:3000 ghcr.io/enricoros/big-agi
# 2C. or use docker-compose:
docker-compose up
```
Access your big-AGI instance at `http://localhost:3000`.
If you deploy big-AGI behind a reverse proxy, you may want to check out the [Reverse Proxy Configuration Guide](deploy-reverse-proxy.md).
### Kubernetes Deployment
Deploy big-AGI on a Kubernetes cluster for enhanced scalability and management. Follow these steps for a Kubernetes deployment:
Your big-AGI instance is now accessible at `http://localhost:3000`.
For more detailed instructions on Kubernetes deployment, including updating and troubleshooting, refer to our [Kubernetes Deployment Guide](deploy-k8s.md).
### Midori AI Subsystem for Docker Deployment
Follow the instructions found on [Midori AI Subsystem Site](https://io.midori-ai.xyz/subsystem/manager/)
for your host OS. After completing the setup process, install the Big-AGI docker backend to the Midori AI Subsystem.
## Enterprise-Grade Installation
For businesses seeking a fully-managed, scalable solution, consider our managed installations.
Enjoy all the features of big-AGI without the hassle of infrastructure management. [hello@big-agi.com](mailto:hello@big-agi.com) to learn more.
## Support
Join our vibrant community of developers, researchers, and AI enthusiasts. Share your projects, get help, and collaborate with others.
| Chat | Just type and send | **Pre-trained knowledge only** | Only shows final response | Quick answers, general knowledge queries |
| ReAct | Type "/react" before the question | **Web loads, Web searches, Wikipedia, calculations** | Shows step-by-step thought process | Complex, multi-step, or research-based questions |
Example of ReAct in action, taking a question about current events, googling results, opening a page, and summarizing the information:
- **browse**: loads web pages (URLs) and extracts information, using a correctly configured `Tools > Browsing` API
- **search**: searches the web to produce page URLs, using a correctly configured `Tools > Google Search` ([Google Programmable Search Engine](https://programmablesearchengine.google.com/about/)) API
- **wikipedia**: looks up information on Wikipedia pages
- **calculate**: performs mathematical calculations by executing typescript code
- warning: (!) unsafe and dangerous, do not use for untrusted code/LLMs
## How to Use ReAct in Big-AGI
1. **Invoking ReAct**: Type "/react" followed by your question in the chat.
2. **What to Expect**:
- An ephemeral space will show the AI's thought process and actions, showing all the steps taken.
- The final answer will appear in the main chat.
3. **Available Actions**: Web searches, Wikipedia lookups, calculations, and optionally web browsing.
## Good to know:
- **ReAct operates in isolation** from the main chat history.
- It **will take longer than standard responses** due to multiple steps.
- Web searches and browsing may have privacy implications, and require **tool configuration** in the UI.
- Errors or limitations in accessing external resources may affect results.
- ReAct does not use the [Tool or Function Calling](https://platform.openai.com/docs/guides/function-calling) feature of AI models, rather uses the old school approach of parsing and executing actions.
- **[AIX-callers-analysis.md](modules/AIX-callers-analysis.md)** - Analysis of AIX entry points, call chains, common and different rendering, error handling, etc.
#### CSF - Client-Side Fetch
- **[CSF.md](systems/client-side-fetch.md)** - Direct browser-to-API communication for LLM requests
### Systems Documentation
#### Core Platform Systems
- **[app-routing.md](systems/app-routing.md)** - Next.js routing, provider stack, and display state hierarchy
- **[LLM-parameters-system.md](systems/LLM-parameters-system.md)** - Language model parameter flow across the system
## Guidelines
### Writing Style
- **Direct and factual** - No marketing language
- **Present tense** - "AIX handles streaming" not "AIX will handle"
- **Active voice** - "The system processes" not "Processing is done by"
- **Concrete examples** - Show actual code/config when helpful, briefly
- Use `messageWasInterruptedAtStart()` for consistent detection
- Only remove messages with no content that were client-aborted
- Consider UI context when choosing removal vs. clearing strategy
### Error Handling
- **Fragment-level**: Use `messageEditor.fragmentReplace()` with error fragments
- **Message-level**: Use `messageEditor.removeMessage()` or array removal
- **Status-level**: Update component state for UI feedback
### Placeholder Management
- Initialize with descriptive placeholders using `createPlaceholderVoidFragment()`
- Replace during streaming updates
- Clean up on error to prevent stale content
## Architectural Insights
1. **Layered Error Handling**: Sophistication increases closer to UI
2. **Context Specialization**: Different contexts for different use cases
3. **Streaming vs Non-Streaming**: Conversation functions stream, utilities typically don't
4. **Message vs Fragment Management**: Different strategies for different UI needs
The most sophisticated handling is in **Beam modules** and **Chat Persona** with comprehensive removal logic, while simpler callers rely on upstream error handling.
- 1: Continuation marks: a. sends reason=max-tokens (streaming/non-streaming), b. TBA
- 2: Ollama has not been ported to AIX yet due to the custom APIs.
## 1. System Architecture
The subsystem comprises three main components:
1. **Client (e.g. Next.js Frontend)**
- Initiates requests
- Renders AI-generated content in real-time
- Reconstructs streamed data
2. **Server (e.g. Next.js Backend)**
- Acts as an intermediary between client and AI providers
- Handles request preparation, dispatching, and response processing
- Streams responses back to the client
3. **Upstream AI Providers**
- Generate AI content based on requests
### ChatGenerate workflow:
1. Request Initialization: AIX Client prepares and sends request (systemInstruction, messages=AixWire_Parts[], etc.) to AIX Server
2. Dispatch Preparation: AIX Server prepares for upstream communication
3. AI Provider Interaction: AIX Server communicates with AI Provider (streaming or non-streaming)
4. Data Decoding, Transformation and Transmission: AIX Server sends AixWire_Particles to AIX Client
5. Client-side Processing: Client's ContentReassembler processes AixWire_Particles into a list (likely a single) of multi-fragment (DMessageContentFragment[]) messages
6. Completion: AIX Server sends 'done' control message, AIX Client finalizes data update
7. Error Handling: AIX Server sends specific error messages when necessary
## 2. Files and Folders
AIX is organized into the following files and folders:
1. Client-Side (`/client/`):
- `aix.client.ts`: Main client-side entry point for AIX operations.
- `aix.client.chatGenerateRequest.ts`: Handles conversion of chat messages to AIX-compatible format (AixWire_Content, AixWire_Parts, etc.).
2. Server-Side (`/server/`):
- API (`/server/api/`) - Client to Server communication:
- `aix.router.ts`: Defines the tRPC router for AIX operations.
- `aix.wiretypes.ts`: Contains Zod schemas for types and calls incoming from the client (AixWire_Parts, AixWire_Content, AixWire_Tooling, AixWire_API, ...), and outgoing (AixWire_Particles)
- Dispatch (`/server/dispatch/`) - Server to AI Provider communication:
- `/server/dispatch/chatGenerate/`: Content Generation with chat-style inputs:
- `./adapters/`: Adapters for creating API requests for different AI protocols (Anthropic, Gemini, OpenAI).
- `./parsers/`: Parsers for parsing streaming/non-streamin responses from different AI protocols (same 3).
- `chatGenerate.dispatch.ts`: Creates a pipeline to execute Chat Generation to a specific provider.
- `ChatGenerateTransmitter.ts`: Used to serialize and transmit AixWire_Particles to the client.
- `/server/dispatch/wiretypes/`: AI provider Wire Types:
- Type definitions for different AI providers/protocols (Anthropic, Gemini, OpenAI).
- `stream.demuxers.ts`: Handles demuxing of different stream formats.
This document describes how parameters flow through Big-AGI's LLM parameters system, from definition to API invocation.
## System Overview
The LLM parameters system operates across five layers that transform parameters from global definitions to vendor-specific API calls. Each layer serves a specific purpose in the parameter resolution pipeline.
The `DModelParameterRegistry` defines all available parameters with their constraints and metadata. Each parameter includes type information, validation rules, and default behavior.
**Example**: `llmVndOaiReasoningEffort4` defines a 4-value enum with 'medium' as the required fallback.
**Default Value System**: The registry supports multiple default mechanisms:
- `initialValue` - Parameter's base default (e.g., `llmVndOaiRestoreMarkdown: true`)
- `requiredFallback` - Fallback for required parameters (e.g., `llmTemperature: 0.5`)
- `nullable` - Parameters that can be explicitly null to skip API transmission
{ paramId: 'llmVndGeminiThinkingBudget', rangeOverride: [0, 8192] }, // Custom range
]
```
**Parameter Visibility**: The `hidden` flag removes parameters from the UI while keeping them functional. Models can also mark parameters as `required`.
### Layer 3: Client Configuration
The system provides two UI configurators with different scopes:
**Hidden Parameters**: Parameters like `llmRef` are marked `hidden: true` in the registry and never appear in the UI, but remain functional for system use.
**Nullable Parameters**: Parameters with `nullable` configuration can be explicitly set to `null` to prevent transmission to the API, distinct from being undefined.
**Range Overrides**: Models can override parameter ranges (e.g., different Gemini models support different thinking budget ranges).
**Parameter Interactions**: The UI implements business logic like disabling web search when reasoning effort is 'minimal'.
## Type Safety Mechanisms
The system maintains type safety through:
- `DModelParameterId` union from registry keys
- `DModelParameterValue<T>` conditional types for values
- `DModelParameterSpec<T>` interfaces for specifications
- Runtime validation via Zod schemas at API boundaries
## Model Variant Pattern
Some vendors use model variants to enable features, for instance:
- **Anthropic**: Creates separate `idVariant: 'thinking'` entries forcing value of hidden parameters
- **Google/OpenAI**: Parameters directly on base models
## Migration and Compatibility
The architecture supports parameter evolution:
- **Version Coexistence**: Both `llmVndOaiReasoningEffort` and `llmVndOaiReasoningEffort4` exist simultaneously
- **Precedence Rules**: Newer parameters take priority during AIX translation
Client-Side Fetch (CSF) enables direct browser-to-API communication, bypassing the server for LLM requests. When enabled, the browser makes requests directly to vendor APIs (e.g., `api.openai.com`, `api.groq.com`) instead of routing through the Next.js server. This reduces latency, decreases server load, and is particularly useful for local models where the browser can communicate directly with Ollama or LM Studio.
## Implementation
CSF is implemented as an opt-in setting stored as `csf: boolean` in each vendor's service settings. The vendor interface exposes `csfAvailable?: (setup) => boolean` to determine if CSF can be enabled (typically checking if an API key or host is configured). The actual execution happens in `aix.client.direct-chatGenerate.ts` which dynamically imports when CSF is active, making direct fetch calls using the same wire protocols as the server.
All 16 supported vendors (OpenAI, Anthropic, Gemini, Ollama, LocalAI, Deepseek, Groq, Mistral, xAI, OpenRouter, Perplexity, Together AI, Alibaba, Moonshot, OpenPipe, LM Studio) support CSF. Cloud vendors require CORS support from the API provider (all tested vendors return `access-control-allow-origin: *`). Local vendors (Ollama, LocalAI, LM Studio) require CORS to be enabled on the local server.
## UI
The CSF toggle appears in each vendor's setup panel under "Advanced" settings, labeled "Direct Connection". It becomes visible when the prerequisites are met (API key present for cloud vendors, host configured for local vendors). The setting is managed through `useModelServiceClientSideFetch` hook which provides `csfAvailable`, `csfActive`, `csfToggle`, and `csfReset` for UI consumption.
process.env.NEXT_PUBLIC_DEPLOYMENT_TYPE=process.env.NEXT_PUBLIC_DEPLOYMENT_TYPE||(process.env.VERCEL_ENV?`vercel-${process.env.VERCEL_ENV}`:'local');// Docker or custom, Vercel
Some files were not shown because too many files have changed in this diff
Show More
Reference in New Issue
Block a user
Blocking a user prevents them from interacting with repositories, such as opening or commenting on pull requests or issues. Learn more about blocking a user.