> ./openspec/changes/live-agent-chat/spec.md
# Spec: Live Agent Chat with Document Tools
## Problem
Today `DocxReviewer` operates on a **static `Document` model** — you parse a DOCX, the agent reads/comments/proposes, you serialize back to DOCX. There's no connection to the live editor. The `bridge.ts` placeholder exists but is unimplemented.
The goal: a **chat panel next to the document** where an AI agent can read the document content, add comments, suggest changes, and highlight text — all happening live in the editor UI, not just in a serialized file.
## User Experience
```
┌────────────────────────────────────────┬─────────────────────────┐
│                                        │                         │
│  DOCX Editor                           │  Agent Chat             │
│                                        │                         │
│  ┌──────────────────────────────┐      │  User: Review section 3 │
│  │ Section 3: Payment Terms     │      │  for legal issues       │
│  │                              │      │                         │
│  │ The buyer shall pay $50k ←──────────── Agent: I found 2       │
│  │ [💬 Agent: Liability cap...] │      │  issues in section 3:   │
│  │                              │      │                         │
│  │ within 30 days of ←─────────────────── 1. Liability cap at    │
│  │ [💬 Agent: No late fee...]   │      │  $50k seems low for     │
│  │                              │      │  this deal size         │
│  └──────────────────────────────┘      │                         │
│                                        │  2. No late payment     │
│  ┌─ Comments Sidebar ───────────┐      │  clause specified       │
│  │ 💬 Agent: Liability cap      │      │                         │
│  │ at $50k is low...            │      │  I've added comments    │
│  │                              │      │  to both paragraphs.    │
│  │ 💬 Agent: No late fee        │      │                         │
│  │ clause specified...          │      │  [Apply suggested fix]  │
│  └──────────────────────────────┘      │                         │
│                                        │  User: Fix the first    │
│                                        │  one, change to $500k   │
│                                        │                         │
│                                        │  Agent: Done. Created   │
│                                        │  a tracked change:      │
│                                        │  $50k → $500k           │
└────────────────────────────────────────┴─────────────────────────┘
```
The agent's comments and tracked changes appear **instantly** in the editor — same as if a human collaborator added them. The existing `CommentsSidebar` renders them. The user can accept/reject tracked changes through the normal UI.
## Architecture
### Three layers
```
┌──────────────────────────────────────────────────────────────┐
│  1. CHAT UI (React component)                                │
│     - Message list, input box, tool call display             │
│     - Lives in packages/react                                │
│     - Pure presentation, no AI logic                         │
└────────────┬─────────────────────────────────────────────────┘
             │ calls
┌────────────▼─────────────────────────────────────────────────┐
│  2. AGENT TOOLS (tool definitions + handlers)                │
│     - Tool schemas the AI can call                           │
│     - Handlers that call into EditorBridge                   │
│     - Lives in packages/agents                               │
└────────────┬─────────────────────────────────────────────────┘
             │ calls
┌────────────▼─────────────────────────────────────────────────┐
│  3. EDITOR BRIDGE (client-side adapter)                      │
│     - Connects agent tools → live editor state               │
│     - Reads from ProseMirror doc + Document model            │
│     - Writes comments/changes into editor state              │
│     - Lives in packages/agents/bridge + packages/react       │
└──────────────────────────────────────────────────────────────┘
```
### Key constraint: AI-provider agnostic
The spec defines **tool schemas and a bridge API**. It does NOT include any AI SDK, API calls, or model-specific logic. The consumer (app developer) brings their own AI provider and wires tool calls through the bridge.
This means the chat component receives messages and tool results as props — it doesn't make API calls itself.
---
## Layer 3: Editor Bridge (`packages/agents/src/bridge.ts`)
The bridge connects agent tool handlers to the live editor. It wraps a `DocxEditorRef` and exposes the same operations as `DocxReviewer`, but operating on the **live editor state** instead of a static Document.
### Interface
```ts
// packages/agents/src/bridge.ts
import type { DocxEditorRef } from '@eigenpal/docx-editor-react';

export interface EditorBridge {
  // ── READ ──────────────────────────────────────────────────
  /** Get document content as indexed text lines (same format as DocxReviewer.getContentAsText) */
  getContentAsText(options?: GetContentOptions): string;
  /** Get structured content blocks */
  getContent(options?: GetContentOptions): ContentBlock[];
  /** Get existing comments */
  getComments(): ReviewComment[];
  /** Get existing tracked changes */
  getChanges(): ReviewChange[];
  /** Get text around the user's current cursor/selection */
  getSelectionContext(): SelectionContext | null;

  // ── COMMENT ───────────────────────────────────────────────
  /** Add a comment anchored to a paragraph (optionally to specific text within it) */
  addComment(options: AddCommentOptions): number;
  /** Reply to an existing comment */
  replyTo(commentId: number, options: ReplyOptions): number;
  /** Resolve a comment */
  resolveComment(commentId: number): void;

  // ── SUGGEST CHANGES ───────────────────────────────────────
  /** Replace text, creating a tracked change visible in the editor */
  replace(options: ProposeReplacementOptions): void;
  /** Insert text as a tracked change */
  proposeInsertion(options: ProposeInsertionOptions): void;
  /** Delete text as a tracked change */
  proposeDeletion(options: ProposeDeletionOptions): void;

  // ── HIGHLIGHT ─────────────────────────────────────────────
  /** Temporarily highlight a paragraph or text range (visual only, not persisted) */
  highlight(paragraphIndex: number, options?: HighlightOptions): HighlightHandle;

  // ── NAVIGATE ──────────────────────────────────────────────
  /** Scroll to and optionally select a paragraph */
  scrollTo(paragraphIndex: number): void;
}

export interface SelectionContext {
  /** Currently selected text (empty string if cursor only) */
  selectedText: string;
  /** Paragraph index of the selection start */
  paragraphIndex: number;
  /** Full text of the paragraph containing the selection */
  paragraphText: string;
  /** Formatting at the selection */
  formatting: TextFormatting;
}

export interface HighlightOptions {
  /** Color of the highlight. Default: 'yellow' */
  color?: string;
  /** Optional: highlight only this text within the paragraph */
  search?: string;
  /** Auto-remove after N milliseconds. Default: no auto-remove */
  duration?: number;
}

export interface HighlightHandle {
  /** Remove the highlight */
  remove(): void;
}

/** Create a bridge from a DocxEditor ref */
export declare function createEditorBridge(editorRef: DocxEditorRef, author?: string): EditorBridge;
```
### Implementation strategy
The bridge reads from the editor's internal state:
- Read operations: Extract content from the ProseMirror document (same logic as `DocxReviewer` but reading from `editorRef.getDocument()` or the live PM state)
- Comment operations: Call `editorRef`'s existing comment APIs (already wired in `DocxEditor.tsx`; `setComments` and `addComment` handlers exist)
- Change operations: Dispatch ProseMirror transactions that create tracked changes (insertion/deletion marks with author metadata)
- Highlight: Add a temporary decoration to the ProseMirror view (a `Decoration.inline` or `Decoration.node`, removed when the handle's `remove()` is called)
- Navigate: Use `editorRef.scrollToIndex(paragraphIndex)` or dispatch a selection + scrollIntoView
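The highlight lifecycle in the list above (a temporary decoration that goes away when the handle's `remove()` is called, or after `duration` milliseconds) could look roughly like the sketch below. `DecorationStore` and `createHighlightHandle` are hypothetical names used only for illustration; a real implementation would add and remove ProseMirror decorations instead of entries in a `Map`.

```ts
// Hypothetical stand-in for the editor's decoration set: id -> highlight info.
type HighlightRecord = { paragraphIndex: number; color: string; search?: string };

class DecorationStore {
  private nextId = 1;
  readonly active = new Map<number, HighlightRecord>();

  add(record: HighlightRecord): number {
    const id = this.nextId++;
    this.active.set(id, record);
    return id;
  }

  delete(id: number): void {
    this.active.delete(id);
  }
}

interface HighlightHandle {
  remove(): void;
}

function createHighlightHandle(
  store: DecorationStore,
  paragraphIndex: number,
  options: { color?: string; search?: string; duration?: number } = {}
): HighlightHandle {
  const id = store.add({
    paragraphIndex,
    color: options.color ?? 'yellow', // default taken from HighlightOptions
    search: options.search,
  });

  let timer: ReturnType<typeof setTimeout> | undefined;
  let removed = false;

  const remove = (): void => {
    if (removed) return; // calling remove() twice is harmless
    removed = true;
    if (timer !== undefined) clearTimeout(timer);
    store.delete(id);
  };

  // Auto-remove only when a duration was requested.
  if (options.duration !== undefined) {
    timer = setTimeout(remove, options.duration);
  }
  return { remove };
}
```

Making `remove()` idempotent means the chat UI and the auto-remove timer can both call it without coordinating.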
### What needs to be added to `DocxEditorRef`
The existing `DocxEditorRef` needs a few new methods:
```ts
interface DocxEditorRef {
  // ... existing methods ...

  /** Get the current Document model (already exists as getDocument()) */
  getDocument(): Document;

  /** Add a comment programmatically (needs to be exposed) */
  addComment(options: {
    paragraphIndex: number;
    text: string;
    author: string;
    search?: string;
  }): number;

  /** Reply to a comment */
  replyToComment(commentId: number, text: string, author: string): number;

  /** Resolve a comment */
  resolveComment(commentId: number): void;

  /** Create a tracked change (replacement) */
  proposeReplacement(options: {
    paragraphIndex: number;
    search: string;
    replaceWith: string;
    author: string;
  }): void;

  /** Add a temporary highlight decoration */
  addHighlight(
    paragraphIndex: number,
    options?: { search?: string; color?: string }
  ): { remove(): void };

  /** Scroll to a paragraph index */
  scrollToIndex(paragraphIndex: number): void;
}
```
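As a rough illustration of how the bridge's comment operations might delegate to these ref methods while filling in the author, here is a minimal sketch. `MinimalEditorRef`, `CommentBridge`, `createCommentBridge`, and `createFakeRef` are hypothetical names. The option shapes stand in for `AddCommentOptions` and `ReplyOptions`, which the spec does not define, and the `'Agent'` default author is an assumption.

```ts
// Subset of the proposed DocxEditorRef methods, redeclared so the sketch is self-contained.
interface MinimalEditorRef {
  addComment(options: { paragraphIndex: number; text: string; author: string; search?: string }): number;
  replyToComment(commentId: number, text: string, author: string): number;
  resolveComment(commentId: number): void;
}

// Subset of the EditorBridge comment API (option shapes are assumed).
interface CommentBridge {
  addComment(options: { paragraphIndex: number; text: string; search?: string }): number;
  replyTo(commentId: number, options: { text: string }): number;
  resolveComment(commentId: number): void;
}

// The bridge injects the author so tool handlers never have to pass it.
function createCommentBridge(ref: MinimalEditorRef, author = 'Agent'): CommentBridge {
  return {
    addComment: (options) => ref.addComment({ ...options, author }),
    replyTo: (commentId, options) => ref.replyToComment(commentId, options.text, author),
    resolveComment: (commentId) => ref.resolveComment(commentId),
  };
}

// In-memory fake ref that records calls; useful for trying the bridge without an editor.
function createFakeRef(): { ref: MinimalEditorRef; calls: string[] } {
  const calls: string[] = [];
  let nextId = 1;
  const ref: MinimalEditorRef = {
    addComment: (o) => {
      calls.push(`add:${o.paragraphIndex}:${o.author}:${o.text}`);
      return nextId++;
    },
    replyToComment: (id, text, author) => {
      calls.push(`reply:${id}:${author}:${text}`);
      return nextId++;
    },
    resolveComment: (id) => {
      calls.push(`resolve:${id}`);
    },
  };
  return { ref, calls };
}
```

Keeping the author inside the bridge means the tool schemas exposed to the model do not need an `author` parameter.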
## Layer 2: Agent Tool Definitions (`packages/agents/src/tools/`)
Tools are defined as JSON schemas (compatible with Anthropic, OpenAI, and Vercel AI SDK tool formats). Each tool has a schema + a handler function that calls into the EditorBridge.
### Tool catalog
| Tool Name | Description | Parameters |
|---|---|---|
| `read_document` | Read document content as indexed text | `{ fromIndex?, toIndex? }` |
| `read_selection` | Get text/context at the user's current cursor position | `{}` |
| `read_comments` | List all comments in the document | `{ author? }` |
| `read_changes` | List all tracked changes | `{ author?, type? }` |
| `add_comment` | Add a comment on a paragraph | `{ paragraphIndex, text, search? }` |
| `reply_to_comment` | Reply to an existing comment | `{ commentId, text }` |
| `resolve_comment` | Mark a comment as resolved | `{ commentId }` |
| `suggest_replacement` | Replace text (creates tracked change) | `{ paragraphIndex, search, replaceWith }` |
| `suggest_insertion` | Insert text (creates tracked change) | `{ paragraphIndex, text, position?, search? }` |
| `suggest_deletion` | Delete text (creates tracked change) | `{ paragraphIndex, search }` |
| `highlight_text` | Temporarily highlight text to draw user attention | `{ paragraphIndex, search?, color?, duration? }` |
| `scroll_to` | Scroll document to a paragraph | `{ paragraphIndex }` |
### Tool definition format
```ts
// packages/agents/src/tools/types.ts
export interface AgentToolDefinition<TInput = unknown> {
  /** Tool name (used in tool_use blocks) */
  name: string;
  /** Human-readable description for the LLM */
  description: string;
  /** JSON Schema for the input parameters */
  inputSchema: Record<string, unknown>;
  /** Handler: receives parsed input + bridge, returns result for the LLM */
  handler: (input: TInput, bridge: EditorBridge) => AgentToolResult;
}

export interface AgentToolResult {
  /** Whether the operation succeeded */
  success: boolean;
  /** Data to return to the LLM (will be JSON.stringified) */
  data?: unknown;
  /** Error message if failed */
  error?: string;
}
```
### Example tool definition
```ts
// packages/agents/src/tools/readDocument.ts
export const readDocumentTool: AgentToolDefinition<{ fromIndex?: number; toIndex?: number }> = {
  name: 'read_document',
  description:
    'Read the document content. Returns indexed text lines like "[0] First paragraph", ' +
    '"[1] Second paragraph". Use fromIndex/toIndex to read a specific range. ' +
    'Always read the document before commenting or suggesting changes.',
  inputSchema: {
    type: 'object',
    properties: {
      fromIndex: {
        type: 'number',
        description: 'Start reading from this paragraph index (inclusive). Default: 0',
      },
      toIndex: {
        type: 'number',
        description: 'Stop reading at this paragraph index (inclusive). Default: end of document',
      },
    },
  },
  handler: (input, bridge) => {
    const text = bridge.getContentAsText({
      fromIndex: input.fromIndex,
      toIndex: input.toIndex,
    });
    return { success: true, data: text };
  },
};
```
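The handler above depends on `getContentAsText` returning lines such as `[0] First paragraph`. A minimal sketch of that formatting with an inclusive `fromIndex`/`toIndex` range follows, assuming the paragraphs are already available as plain strings. `formatIndexedText` is a hypothetical helper, and clamping out-of-range indices is an assumption rather than specified behavior.

```ts
interface RangeOptions {
  fromIndex?: number;
  toIndex?: number;
}

// Format paragraphs as "[index] text" lines, honoring an inclusive range.
function formatIndexedText(paragraphs: string[], options: RangeOptions = {}): string {
  const last = paragraphs.length - 1;
  // Assumption: out-of-range indices are clamped instead of raising an error.
  const from = Math.max(0, options.fromIndex ?? 0);
  const to = Math.min(last, options.toIndex ?? last);

  const lines: string[] = [];
  for (let i = from; i <= to; i++) {
    lines.push(`[${i}] ${paragraphs[i]}`);
  }
  return lines.join('\n');
}
```

Because the index printed in each line is the original paragraph index, a ranged read still yields indices the agent can pass straight to `add_comment` or `suggest_replacement`.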
### Registry + helpers
```ts
// packages/agents/src/tools/index.ts

/** All built-in tools */
export declare const agentTools: AgentToolDefinition[];

/** Get tool schemas in Anthropic format */
export declare function getAnthropicTools(): AnthropicToolSchema[];

/** Get tool schemas in OpenAI format */
export declare function getOpenAITools(): OpenAIToolSchema[];

/** Execute a tool call against an EditorBridge */
export declare function executeToolCall(
  toolName: string,
  input: unknown,
  bridge: EditorBridge
): AgentToolResult;
```
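One possible shape for the dispatcher and format helpers declared above is sketched here. The types are redeclared locally so the block stands alone, and `dispatchToolCall`, `toAnthropicTools`, and `toOpenAITools` are hypothetical names. Returning a failed result for unknown tools and thrown errors (instead of rethrowing) is an assumption; it keeps the error visible to the LLM.

```ts
interface ToolResult {
  success: boolean;
  data?: unknown;
  error?: string;
}

interface ToolDefinition<TBridge> {
  name: string;
  description: string;
  inputSchema: Record<string, unknown>;
  handler: (input: unknown, bridge: TBridge) => ToolResult;
}

// Look up the tool by name and run its handler, converting failures into results.
function dispatchToolCall<TBridge>(
  tools: ToolDefinition<TBridge>[],
  toolName: string,
  input: unknown,
  bridge: TBridge
): ToolResult {
  const tool = tools.find((t) => t.name === toolName);
  if (!tool) {
    return { success: false, error: `Unknown tool: ${toolName}` };
  }
  try {
    return tool.handler(input, bridge);
  } catch (err) {
    return { success: false, error: err instanceof Error ? err.message : String(err) };
  }
}

// Anthropic's Messages API takes the JSON Schema under `input_schema`.
function toAnthropicTools<TBridge>(tools: ToolDefinition<TBridge>[]) {
  return tools.map(({ name, description, inputSchema }) => ({
    name,
    description,
    input_schema: inputSchema,
  }));
}

// OpenAI's chat tools wrap the schema as { type: 'function', function: { ..., parameters } }.
function toOpenAITools<TBridge>(tools: ToolDefinition<TBridge>[]) {
  return tools.map(({ name, description, inputSchema }) => ({
    type: 'function' as const,
    function: { name, description, parameters: inputSchema },
  }));
}
```

Neither converter includes `handler`, so the provider payload carries only serializable fields.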
## Layer 1: Chat UI (`packages/react/src/components/AgentChat/`)
### Components
```
AgentChat/
├── AgentChatPanel.tsx    - Main panel (message list + input)
├── ChatMessage.tsx       - Single message bubble
├── ChatToolCall.tsx      - Inline tool call display (collapsible)
├── ChatInput.tsx         - Text input + send button
├── types.ts              - Chat message types
└── useAgentChat.ts       - Hook that wires tools to the bridge
```
### Props: Provider-agnostic
```ts
// AgentChatPanel.tsx
export interface AgentChatPanelProps {
  /** Messages to display */
  messages: ChatMessage[];
  /** Whether the agent is currently generating */
  isLoading?: boolean;
  /** Called when the user sends a message. The consumer handles AI calls. */
  onSendMessage: (text: string) => void;
  /** Called when a tool call needs execution. Returns the result. */
  onToolCall?: (toolName: string, input: unknown) => Promise<AgentToolResult>;
  /** Optional: pre-built bridge for automatic tool execution */
  bridge?: EditorBridge;
  /** Agent display name. Default: 'Agent' */
  agentName?: string;
  /** Width of the panel. Default: 360px */
  width?: number;
  /** Whether the panel is open */
  isOpen: boolean;
  /** Called when the user closes the panel */
  onClose: () => void;
}
```
### Message types
```ts
// types.ts
export type ChatMessage = UserMessage | AgentMessage | ToolCallMessage | ToolResultMessage;

export interface UserMessage {
  role: 'user';
  id: string;
  content: string;
  timestamp: number;
}

export interface AgentMessage {
  role: 'agent';
  id: string;
  content: string;
  timestamp: number;
}

export interface ToolCallMessage {
  role: 'tool_call';
  id: string;
  toolName: string;
  input: unknown;
  timestamp: number;
}

export interface ToolResultMessage {
  role: 'tool_result';
  id: string;
  toolCallId: string;
  result: AgentToolResult;
  timestamp: number;
}
```
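Since a `ToolResultMessage` points back at its call through `toolCallId`, a renderer will probably want to pair the two before drawing a tool call card. The sketch below shows one way to do that. The message types are redeclared in simplified form, and `pairToolCalls` is a hypothetical helper that is not part of the spec.

```ts
interface ToolResult {
  success: boolean;
  data?: unknown;
  error?: string;
}

interface TextMessage {
  role: 'user' | 'agent';
  id: string;
  content: string;
  timestamp: number;
}

interface ToolCallMessage {
  role: 'tool_call';
  id: string;
  toolName: string;
  input: unknown;
  timestamp: number;
}

interface ToolResultMessage {
  role: 'tool_result';
  id: string;
  toolCallId: string;
  result: ToolResult;
  timestamp: number;
}

type Message = TextMessage | ToolCallMessage | ToolResultMessage;

interface PairedToolCall {
  call: ToolCallMessage;
  /** Undefined while the tool is still running. */
  result?: ToolResultMessage;
}

// Join each tool call with its result (if any), preserving call order.
function pairToolCalls(messages: Message[]): PairedToolCall[] {
  const resultsByCallId = new Map<string, ToolResultMessage>();
  for (const m of messages) {
    if (m.role === 'tool_result') resultsByCallId.set(m.toolCallId, m);
  }
  return messages
    .filter((m): m is ToolCallMessage => m.role === 'tool_call')
    .map((call) => ({ call, result: resultsByCallId.get(call.id) }));
}
```

A call with no matching result can be rendered as pending, which also covers the case where results arrive in a later state update.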
### `useAgentChat` hook
Convenience hook that wires everything together:
```ts
export declare function useAgentChat(options: {
  editorRef: React.RefObject<DocxEditorRef>;
  author?: string;
}): {
  /** The bridge instance (stable ref) */
  bridge: EditorBridge;
  /** Execute a tool call through the bridge */
  executeToolCall: (toolName: string, input: unknown) => AgentToolResult;
  /** Get tool schemas for your AI provider */
  getToolSchemas: () => AgentToolDefinition[];
  /** System prompt snippet describing the document context */
  getSystemContext: () => string;
};
```
### Chat UI behavior
- Tool calls: When the agent response includes tool calls, they appear as collapsible cards in the chat. The card shows the tool name, a human-readable summary of what it did, and the result (collapsed by default).
- Comments: When `add_comment` is called, a comment appears instantly in the `CommentsSidebar`. The chat shows "Added comment on paragraph 5" with a clickable link that scrolls to the paragraph.
- Changes: When `suggest_replacement` is called, a tracked change appears in the editor. The chat shows a mini-diff ("$50k → $500k").
- Highlights: When `highlight_text` is called, the paragraph briefly glows in the editor to draw attention.
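The human-readable summaries described in the list above could come from a small pure function like the one sketched here. `summarizeToolCall` is a hypothetical helper. Only the `add_comment` and `suggest_replacement` wordings come from the spec; the other strings are placeholders.

```ts
// Build the one-line summary shown on a tool call card.
function summarizeToolCall(toolName: string, input: Record<string, unknown>): string {
  switch (toolName) {
    case 'add_comment':
      return `Added comment on paragraph ${String(input.paragraphIndex)}`;
    case 'suggest_replacement':
      // Mini-diff: original text, arrow, replacement text.
      return `${String(input.search)} → ${String(input.replaceWith)}`;
    case 'highlight_text':
      return `Highlighted paragraph ${String(input.paragraphIndex)}`; // placeholder wording
    default:
      return `Called ${toolName}`; // placeholder wording for tools without a custom summary
  }
}
```

Keeping this as a pure function of the tool name and input means the card can render its summary before the tool result arrives.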
## Integration Example (Consumer Code)
```tsx
// Example: App using the editor + chat with Anthropic SDK
import { useRef, useState } from 'react';
import { nanoid } from 'nanoid';
import { DocxEditor, type DocxEditorRef } from '@eigenpal/docx-editor-react';
// ChatMessage is assumed to be exported alongside the chat components
import { AgentChatPanel, useAgentChat, type ChatMessage } from '@eigenpal/docx-editor-react/ui';
import Anthropic from '@anthropic-ai/sdk';

// `buffer` (the DOCX file contents) is supplied by the host app
function App({ buffer }: { buffer: ArrayBuffer }) {
  const editorRef = useRef<DocxEditorRef>(null);
  const [messages, setMessages] = useState<ChatMessage[]>([]);
  const [isLoading, setIsLoading] = useState(false);
  const [chatOpen, setChatOpen] = useState(true);

  const { bridge, executeToolCall, getToolSchemas, getSystemContext } = useAgentChat({
    editorRef,
    author: 'Claude',
  });

  const handleSendMessage = async (text: string) => {
    // Add user message
    const userMessage: ChatMessage = {
      role: 'user',
      id: nanoid(),
      content: text,
      timestamp: Date.now(),
    };
    setMessages((prev) => [...prev, userMessage]);
    setIsLoading(true);

    try {
      // Call your AI provider
      const client = new Anthropic();
      let response = await client.messages.create({
        model: 'claude-sonnet-4-20250514',
        max_tokens: 4096, // required by the Messages API
        system: `You are a document review assistant. ${getSystemContext()}`,
        // State has not updated yet, so append the new message; send text turns only
        messages: [...messages, userMessage].flatMap((m) =>
          m.role === 'user' || m.role === 'agent'
            ? [{ role: m.role === 'user' ? ('user' as const) : ('assistant' as const), content: m.content }]
            : []
        ),
        // ← tools from the bridge, mapped to Anthropic's `input_schema` field
        tools: getToolSchemas().map((t) => ({
          name: t.name,
          description: t.description,
          input_schema: t.inputSchema as Anthropic.Tool['input_schema'],
        })),
      });

      // Handle tool calls in a loop
      while (response.stop_reason === 'tool_use') {
        for (const block of response.content) {
          if (block.type === 'tool_use') {
            const result = executeToolCall(block.name, block.input);
            setMessages((prev) => [
              ...prev,
              {
                role: 'tool_call',
                id: block.id,
                toolName: block.name,
                input: block.input,
                timestamp: Date.now(),
              },
              {
                role: 'tool_result',
                id: nanoid(),
                toolCallId: block.id,
                result,
                timestamp: Date.now(),
              },
            ]);
          }
        }
        // Continue the conversation with tool results
        response = await client.messages.create({
          /* ... */
        });
      }

      // Add final agent message
      const textBlock = response.content.find((b) => b.type === 'text');
      if (textBlock && textBlock.type === 'text') {
        setMessages((prev) => [
          ...prev,
          { role: 'agent', id: nanoid(), content: textBlock.text, timestamp: Date.now() },
        ]);
      }
    } finally {
      // Clear the loading state even if the provider call throws
      setIsLoading(false);
    }
  };

  return (
    <div style={{ display: 'flex' }}>
      <DocxEditor ref={editorRef} documentBuffer={buffer} style={{ flex: 1 }} />
      <AgentChatPanel
        messages={messages}
        isLoading={isLoading}
        onSendMessage={handleSendMessage}
        bridge={bridge}
        isOpen={chatOpen}
        onClose={() => setChatOpen(false)}
      />
    </div>
  );
}
```
## Implementation Plan
### Phase 1: Editor Bridge (`packages/agents` + `packages/react`)
Goal: Make `createEditorBridge()` work against a live `DocxEditorRef`.
- Expose missing methods on `DocxEditorRef` (`packages/react`)
  - `addComment()`, `replyToComment()`, `resolveComment()`: wire existing comment state handlers to the ref
  - `proposeReplacement()`: dispatch PM transaction with tracked change marks
  - `addHighlight()`: add/remove ProseMirror `Decoration`
  - `scrollToIndex()`: scroll to paragraph by index
- Implement `createEditorBridge()` (`packages/agents/bridge.ts`)
  - Read ops: call `editorRef.getDocument()` → pass body to existing `DocxReviewer` content/discovery functions
  - Write ops: call the new `DocxEditorRef` methods above
  - Selection: read from ProseMirror selection state
### Phase 2: Tool Definitions (`packages/agents`)
Goal: Define all 12 tools with schemas and handlers.
- Create `src/tools/` directory with one file per tool + index
- Add format helpers: `getAnthropicTools()`, `getOpenAITools()`
- Add `executeToolCall()` dispatcher
- Tests: unit test each tool handler against a mock bridge
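For the "unit test each tool handler against a mock bridge" item, a test might look like the sketch below. No test framework is assumed, so the checks are plain `throw` statements. `addCommentHandler`, `createMockBridge`, and `testAddCommentHandler` are hypothetical names, and the input validation inside the handler is an assumption about how the real `add_comment` handler might behave.

```ts
interface ToolResult {
  success: boolean;
  data?: unknown;
  error?: string;
}

interface AddCommentInput {
  paragraphIndex: number;
  text: string;
  search?: string;
}

// Only the part of EditorBridge this handler touches.
interface CommentOnlyBridge {
  addComment(options: AddCommentInput): number;
}

// Hypothetical add_comment handler: validates input, then delegates to the bridge.
function addCommentHandler(input: AddCommentInput, bridge: CommentOnlyBridge): ToolResult {
  if (!Number.isInteger(input.paragraphIndex) || input.paragraphIndex < 0) {
    return { success: false, error: 'paragraphIndex must be a non-negative integer' };
  }
  const commentId = bridge.addComment(input);
  return { success: true, data: { commentId } };
}

// Mock bridge that records every comment it is asked to add.
function createMockBridge(): CommentOnlyBridge & { added: AddCommentInput[] } {
  const added: AddCommentInput[] = [];
  return {
    added,
    addComment(options) {
      added.push(options);
      return added.length; // ids start at 1
    },
  };
}

function testAddCommentHandler(): void {
  const bridge = createMockBridge();

  const ok = addCommentHandler({ paragraphIndex: 5, text: 'No late fee clause' }, bridge);
  if (!ok.success) throw new Error('expected success');
  if (bridge.added.length !== 1 || bridge.added[0].paragraphIndex !== 5) {
    throw new Error('bridge.addComment was not called with the input');
  }

  const bad = addCommentHandler({ paragraphIndex: -1, text: 'x' }, bridge);
  if (bad.success) throw new Error('expected failure for a negative index');
  if (bridge.added.length !== 1) throw new Error('bridge must not be called on invalid input');
}
```

Because handlers take the bridge as a parameter, tests like this need neither a DOM nor a ProseMirror instance.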
### Phase 3: Chat UI (`packages/react`)
Goal: Ship the `AgentChatPanel` component and `useAgentChat` hook.
- `useAgentChat` hook: creates bridge from ref, exposes tool execution
- `AgentChatPanel`: message list, input, tool call cards
- `ChatMessage` / `ChatToolCall`: rendering components
- Styling: scoped within `.ep-root`, consistent with editor design
### Phase 4: Polish
- System prompt builder: `getSystemContext()` generates a prompt snippet with document summary, available tools, and instructions
- Streaming support: `AgentChatPanel` accepts streaming text via a `streamingContent` prop
- Documentation: README with integration examples for Anthropic, OpenAI, Vercel AI SDK
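As a rough idea of what the system prompt builder could produce, here is a sketch that combines a document summary, the tool list, and usage instructions. `buildSystemContext` and the `DocumentStats` shape are hypothetical, and the exact prompt wording is an assumption; only the "read before commenting" instruction is taken from the `read_document` tool description.

```ts
interface DocumentStats {
  paragraphCount: number;
  commentCount: number;
  changeCount: number;
}

interface ToolSummary {
  name: string;
  description: string;
}

// Assemble a prompt snippet: document summary, available tools, then instructions.
function buildSystemContext(stats: DocumentStats, tools: ToolSummary[]): string {
  const lines = [
    `The document has ${stats.paragraphCount} paragraphs, ` +
      `${stats.commentCount} comments, and ${stats.changeCount} tracked changes.`,
    'Paragraphs are addressed by the index shown in read_document output, e.g. "[0] First paragraph".',
    'Available tools:',
    ...tools.map((t) => `- ${t.name}: ${t.description}`),
    'Always read the document before commenting or suggesting changes.',
  ];
  return lines.join('\n');
}
```

Usage would be along the lines of `buildSystemContext({ paragraphCount: 42, commentCount: 3, changeCount: 1 }, agentTools)`, with the stats derived from the bridge's read operations.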
## Scope Boundaries
### In scope
- EditorBridge API connecting agent tools to live editor
- Tool definitions (schemas + handlers): 12 tools
- Chat UI components (presentation only)
- `useAgentChat` hook
- Format helpers for Anthropic/OpenAI tool schemas
### Out of scope
- AI provider integration (consumer brings their own)
- Authentication / API key management
- Chat message persistence
- Multi-user / real-time collaboration
- Custom tool registration (v2)
- Voice input
- File attachment in chat
## Open Questions
- Should the bridge also support headless mode? Today `DocxReviewer` is headless-only. The bridge is editor-only. Should there be a unified interface that works in both modes? (Probably yes: `DocxReviewer` could implement `EditorBridge` for headless use, making tools portable.)
- Tool granularity: Is `read_document` sufficient or do we need `read_paragraph(index)` for large documents? (Probably sufficient: the `fromIndex`/`toIndex` params we already have cover ranged reads.)
- Streaming tool calls: Some AI providers stream tool calls incrementally. Should the chat UI render tool calls as they stream in, or wait for completion? (Start with wait-for-completion, add streaming later.)
- Highlight persistence: Should highlights survive document edits or be purely ephemeral? (Ephemeral: they're for drawing attention, not annotation.)