Cleaned up the code a bit (thx gpt4), with:
- dynamic module loading: the PDF-handling JS is split into a chunk and deferred until a PDF is actually loaded, which speeds up every session that never touches a PDF
- unified path for drag/drop and 'load file' (shall call it "magic drop"), so PDFs are text'ified upon drag/drop as well
- fixed "not being able to load the same doc twice" (thx gpt4)
- using the minified worker; since it's loaded dynamically, we save ~50% bandwidth
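The deferred-loading idea above can be sketched as a once-only dynamic import. This is a minimal illustration, not the app's actual code: the helper name is made up, and in practice the factory would be a dynamic `import()` of the PDF chunk.

```typescript
// Memoize an async factory so the chunk is fetched at most once,
// and only when a PDF is first dropped (never for PDF-free sessions).
function lazyOnce<T>(factory: () => Promise<T>): () => Promise<T> {
  let cached: Promise<T> | null = null;
  return () => (cached ??= factory());
}

// In the app this would be something like:
//   const loadPdfModule = lazyOnce(() => import('...'));  // hypothetical chunk
const loadPdfModule = lazyOnce(async () => ({ extractText: (s: string) => s }));
```

Bundlers (webpack, Next.js, Vite) split any `import('...')` call into its own chunk automatically, which is where the bandwidth saving comes from.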
This will be turned on by default soon, once the CSS and rendering are
of higher quality (right now, making it the default would make
non-Markdown chats look quite different).
Thanks. Great stuff Nils (@nilshulth)
Chats are exported to paste.gg; they are unlisted by default and expire
in 30 days by default. The user is also given a deletion key, shown
only at creation time, which is required to take down the paste.
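For reference, a request body matching the behavior above could look like the following. This is a sketch against my reading of the paste.gg v1 API (field names like `visibility`, `expires`, and the `files` shape should be verified against its docs); it only builds the payload, it does not perform the upload.

```typescript
// Build a paste.gg-style payload: unlisted visibility, ISO-8601 expiry.
// Field names are assumptions based on the public paste.gg v1 API.
function buildPastePayload(markdown: string, expiresIso: string) {
  return {
    name: 'Chat export',
    visibility: 'unlisted',   // unlisted by default, as described above
    expires: expiresIso,      // e.g. 30 days from creation
    files: [
      { name: 'chat.md', content: { format: 'text', value: markdown } },
    ],
  };
}
```

The deletion key comes back in the creation response for anonymous pastes, which is why it can only be shown once.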
Rendering looks great, including code blocks and chat turns.
Closes #32. Enables users and deployments to change the host to which
OpenAI API calls are directed. This enables projects like
[Helicone](https://www.helicone.ai/) (observability for LLM ops)
to track prompt/response quality in real time.
Configuration:
- User: App > Settings > Advanced > API host (e.g. "oai.hconeai.com")
- Deployment: set the 'API_API_HOST=...' environment variable
The user setting takes precedence over the deployment setting, which
takes precedence over the default api.openai.com. Real-time
switching in chat apps works well.
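The precedence rule can be sketched as a small resolver; the function and parameter names are illustrative, not the app's actual settings keys.

```typescript
const DEFAULT_API_HOST = 'api.openai.com';

// Resolve the effective API host: user setting > deployment env var > default.
// Tolerates values entered with a scheme or a trailing slash.
function effectiveApiHost(userSetting?: string, deploymentEnv?: string): string {
  const host = userSetting?.trim() || deploymentEnv?.trim() || DEFAULT_API_HOST;
  return host.replace(/^https?:\/\//, '').replace(/\/$/, '');
}
```

Normalizing out the scheme keeps "oai.hconeai.com" and "https://oai.hconeai.com/" equivalent, so the setting is forgiving of how users paste it in.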
Note: the Helicone team is fixing dashboard reporting for 'streaming'
over the /v1/chat/completions endpoint.
Note: some older chats will lose the name of the model,
but they'll regain it quickly. The system purpose
that originated each message is now captured, as well as
the typing indication.