Files
big-agi/kb/modules/LLM-gemini-interactions.md
T
2026-05-05 03:12:13 -07:00

7.0 KiB

Gemini Interactions API

The Interactions API powers Gemini's agent runs (Deep Research today, more agent types planned). This doc is the source of truth for protocol shape, failure modes, and the recovery model — code comments link here instead of repeating the rationale.

References

  • GH #1088 — Auto-resume for Deep Research; Recover button
  • GH #1095 — Visualizations toggle (agent_config.visualization)
  • Google forum 143098 — 10-min SSE cut
  • Google forum 143099 — Streaming resume re-cuts
  • Upstream specs_upstream/gemini.interactions.spec.md, gemini.interactions.guide.md, gemini.deep-research.guide.md

Endpoints

Verb Path Purpose
POST /v1beta/interactions Start a run. We always send stream:true, background:true, store:true
GET /v1beta/interactions/{id}?stream=true Reattach via SSE replay (full event sequence from start)
GET /v1beta/interactions/{id} Fetch the resource as JSON (one-shot)
POST /v1beta/interactions/{id}/cancel Stop a background run
DELETE /v1beta/interactions/{id} Remove the stored record (does NOT cancel an in-flight run)

Retention: 1 day free, 55 days paid.

Status taxonomy

Status Meaning Handling
in_progress Live run or zombie (see C) Surface diagnostics; offer Resume/Recover/Stop
completed Done with content in outputs[] Emit fragments, tokenStopReason='ok'
failed Server-side failure Terminating issue
cancelled We or another client cancelled Close as cg-issue
incomplete Stopped early (token limit) — partial outputs Note + tokenStopReason='out-of-tokens'
requires_action Not expected for Deep Research Fail loudly so we notice

Two retrieval paths

Path Endpoint Parser Use case
SSE replay GET ?stream=true createGeminiInteractionsParserSSE Canonical resume; live deltas
JSON GET (recovery) GET (no stream) createGeminiInteractionsParserNS Recover when SSE is broken

Both replay from the start — ContentReassembler REPLACES content on reattach, so partial replay (last_event_id) is intentionally NOT used. The NS parser walks outputs[] (thoughts, text, images, audio) and emits the same particles the SSE parser would, in one batch.

Failure modes

A. 10-minute SSE cut (forum 143098)

The SSE connection gets cut at exactly 600 s, regardless of activity. The cut is malformed (JSON error array instead of clean SSE close) and we treat it as stream-closed-early. The run typically continues server-side and reaches completed. Recover (JSON GET) retrieves the full report.

B. Streaming resume re-cuts (forum 143099)

A fresh SSE replay can re-cut at the same 10-minute boundary on long runs, so Resume alone never reaches interaction.complete. Recover is the fallback.

C. Zombie interactions (#1088)

Resource sits in status: in_progress for days with outputs: [] — the generator crashed but the status never transitioned. Not recoverable (no data was ever produced). The NS parser surfaces created, updated, output count, and a "stuck for over an hour" hint so the user can decide to delete and retry.

D. Connection drop mid-run

Network blip; resource is fine. Resume (SSE replay) picks up cleanly.

UI

BlockOpUpstreamResume renders up to three buttons:

Button Action Shown when
Resume SSE replay onResume provided
Recover JSON GET (one-shot) upstreamHandle.uht_NS_RECOVER_UHTS
Stop Cancel + delete upstream resource onDelete provided

The Recover gate is an inline uht === 'vnd.gem.interactions' check in BlockOpUpstreamResume.tsx — extend when another vendor needs the same fallback. Stop is intentionally NOT gated by Resume/Recover busy state — it's the escape hatch for hung resumes.

Visualization control (#1095)

Deep Research accepts agent_config.visualization: 'auto' | 'off'. Exposed as llmVndGeminiAgentViz (label "Visualizations"). Forwarded only when explicitly 'off' so the upstream 'auto' default stays untouched. Useful when merging multiple reports — image fragments break Beam fusion.

Code map

File Role
aix/server/dispatch/wiretypes/gemini.interactions.wiretypes.ts Zod schemas (RequestBody, Interaction, StreamEvent)
aix/server/dispatch/chatGenerate/adapters/gemini.interactionsCreate.ts POST body (input + agent_config)
aix/server/dispatch/chatGenerate/parsers/gemini.interactions.parser.ts SSE parser + NS parser
aix/server/dispatch/chatGenerate/chatGenerate.dispatch.ts (gemini case) Resume dispatch: SSE vs JSON branch
apps/chat/components/message/BlockOpUpstreamResume.tsx Resume / Recover / Stop UI
apps/chat/components/ChatMessageList.tsx (handleMessageUpstreamResume) Wires click handler to aixReattachContent_DMessage_orThrow