diff --git a/README.md b/README.md
index a24590e9f..5c4d806d7 100644
--- a/README.md
+++ b/README.md
@@ -144,7 +144,7 @@ NOTE: this is a powerful tool - if you need a toy UI or clone, this ain't it.
 ## Release Notes
 
 👉 **[See the Live Release Notes](https://big-agi.com/changes)**
-- Open 2.0.1: **Opus 4.5** full support, **Gemini 3 Pro** w/ code exec, **Nano Banana Pro**, **Grok 4.1**, **GPT-5.1**, **Kimi K2 Thinking** + 280 fixes
+- Open 2.0.2: **Speex** multi-vendor speech synthesis, **Opus 4.5**, **Gemini 3 Pro**, **Nano Banana Pro**, **Grok 4.1**, **GPT-5.1**, **Kimi K2** + 280 fixes
 
 ### What's New in 2.0 · Oct 31, 2025 · Open
 
@@ -332,7 +332,7 @@ Configure 100s of AI models from 18+ providers:
 | Multimodal services | [Azure](https://azure.microsoft.com/en-us/products/ai-services/openai-service) · [Anthropic](https://anthropic.com) · [Google Gemini](https://ai.google.dev/) · [OpenAI](https://platform.openai.com/docs/overview) |
 | LLM services | [Alibaba](https://www.alibabacloud.com/en/product/modelstudio) · [DeepSeek](https://deepseek.com) · [Groq](https://wow.groq.com/) · [Mistral](https://mistral.ai/) · [Moonshot](https://www.moonshot.cn/) · [OpenPipe](https://openpipe.ai/) · [OpenRouter](https://openrouter.ai/) · [Perplexity](https://www.perplexity.ai/) · [Together AI](https://www.together.ai/) · [xAI](https://x.ai/) |
 | Image services | OpenAI · Google Gemini |
-| Speech services | [ElevenLabs](https://elevenlabs.io) (Voice synthesis / cloning) |
+| Speech services | [ElevenLabs](https://elevenlabs.io) · [OpenAI TTS](https://platform.openai.com/docs/guides/text-to-speech) · LocalAI · Browser (Web Speech API) |
 
 ### Additional Integrations
 
diff --git a/docs/README.md b/docs/README.md
index acda68da0..c0e19476f 100644
--- a/docs/README.md
+++ b/docs/README.md
@@ -43,7 +43,7 @@ How to set up AI models and features in big-AGI.
 - **[Web Browsing](config-feature-browse.md)**: Enable web page download through third-party services or your own cloud
 - **Web Search**: Google Search API (see '[Environment Variables](environment-variables.md)')
 - **Image Generation**: GPT Image (gpt-image-1), DALL·E 3 and 2
-- **Voice Synthesis**: ElevenLabs API for voice generation
+- **Voice Synthesis**: ElevenLabs, OpenAI TTS, LocalAI, or browser Web Speech API
 
 ## Deployment & Customization
 
diff --git a/docs/environment-variables.md b/docs/environment-variables.md
index d8b53df1e..0208a7a1d 100644
--- a/docs/environment-variables.md
+++ b/docs/environment-variables.md
@@ -132,10 +132,11 @@ Enable the app to Talk, Draw, and Google things up.
 
 | Variable | Description |
 |:---------------------------|:------------------------------------------------------------------------------------------------------------------------|
-| **Text-To-Speech** | [ElevenLabs](https://elevenlabs.io/) is a high quality speech synthesis service |
+| **Text-To-Speech** | ElevenLabs, OpenAI TTS, LocalAI, and browser Web Speech API are supported |
 | `ELEVENLABS_API_KEY` | ElevenLabs API Key - used for calls, etc. |
 | `ELEVENLABS_API_HOST` | Custom host for ElevenLabs |
 | `ELEVENLABS_VOICE_ID` | Default voice ID for ElevenLabs |
+| | *Note: OpenAI TTS and LocalAI TTS reuse credentials from your configured LLM services (no separate env vars needed)* |
 | **Google Custom Search** | [Google Programmable Search Engine](https://programmablesearchengine.google.com/about/) produces links to pages |
 | `GOOGLE_CLOUD_API_KEY` | Google Cloud API Key, used with the '/react' command - [Link to GCP](https://console.cloud.google.com/apis/credentials) |
 | `GOOGLE_CSE_ID` | Google Custom/Programmable Search Engine ID - [Link to PSE](https://programmablesearchengine.google.com/) |