# Phase 1 — Sidecar Production-Ready
> **Status:** blocked on Phase 0 verdict.
> **Owners:** parallelisable across multiple agents — see task table.
> **Branch base:** `main` after Phase 0 merge. Feature branches per work
> stream (see `SYNC.md`).
> **Spec reference:** [`reference/SPEC-ios-app.md`](./reference/SPEC-ios-app.md) §4.
## Goal
The `pi-remote-control` extension is extended into a full sidecar that can
serve the iOS app. End state: a single Node process, started alongside pi
(or as a system service), that exposes a WebSocket API for:
- Stream attach/detach with reconnect.
- Send-keys input.
- Multi-session lifecycle (spawn, list, rename, kill).
- Snapshot and disk-buffered replay.
- State side-channel and slash-command registry.
- QR-based pairing, bearer-token auth, self-signed TLS with pinning.
- Health endpoint.

After Phase 1, everything can be driven from `wscat` or a small Web UI.
The iOS app is **not** required to validate Phase 1.
## Acceptance Criteria
Each S-feature listed below must be implemented, manually exercised, and
covered by at least a basic smoke test. In addition:
- `pi-remote pair` prints a working QR.
- Two parallel sessions can be spawned, switched between, and one can be
killed without disturbing the other.
- WebSocket-level integration smoke test: a script that opens a stream,
sends keys, receives output, drops the connection, reconnects with
`lastSeq`, observes a clean delta.
- `wss://` works against the self-signed cert; the fingerprint matches the
QR contents.
- Sidecar survives restart and reattaches to all existing tmux sessions
without losing state.
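
The "clean delta" criterion above could be checked in the smoke script with a small pure helper. This is a minimal sketch, not the planned harness: `checkCleanDelta` and `DeltaCheck` are hypothetical names, and it assumes that `seq` increases by exactly 1 per chunk.

```typescript
// Hypothetical smoke-test helper: verifies that the chunks received after a
// reconnect continue directly from lastSeq, with no gaps and no replays.
type DeltaCheck = { ok: true } | { ok: false; reason: string };

function checkCleanDelta(
  lastSeq: number | null,
  received: readonly number[],
): DeltaCheck {
  // Assumption: seq increases by exactly 1 per chunk. With lastSeq === null
  // (fresh attach) the first received chunk sets the baseline.
  let expected: number | null = lastSeq === null ? null : lastSeq + 1;
  for (const seq of received) {
    if (expected !== null && seq < expected) {
      return { ok: false, reason: `replayed seq ${seq}, expected ${expected}` };
    }
    if (expected !== null && seq > expected) {
      return { ok: false, reason: `gap: expected ${expected}, got ${seq}` };
    }
    expected = seq + 1;
  }
  return { ok: true };
}
```

In the harness, the script would record the last `seq` seen before dropping the connection, collect the sequence numbers that arrive after `resume`, and fail the run if the helper reports a gap or a replay.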
## Architecture Sketch
```
extensions/remote-control/
├── index.ts — extension entry point (existing, extended)
├── server/ — NEW: HTTP/WS server, split into route modules
│ ├── server.ts — bootstrap, TLS, middleware
│ ├── routes/
│ │ ├── stream.ts — S-02 binary stream + S-04 sequence + S-05 snapshot
│ │ ├── input.ts — S-03 send-keys
│ │ ├── sessions.ts — S-09 multi-session CRUD
│ │ ├── commands.ts — S-08 slash-command registry
│ │ ├── side.ts — S-07 state side-channel
│ │ └── health.ts — S-12 health
│ └── upgrade.ts — WS upgrade routing per session/topic
├── tmux/ — NEW: tmux wrapper
│ ├── manager.ts — spawn/list/kill, metadata via @options
│ ├── pipe.ts — pipe-pane, FIFO read, byte streaming
│ ├── input.ts — send-keys translation
│ └── snapshot.ts — capture-pane wrapper
├── buffer/ — NEW: disk ringbuffer per session
│ ├── writer.ts — append, cap enforcement, watchdog
│ └── reader.ts — range read for snapshot fallback
├── sequence.ts — NEW: monotonic chunk numbering shared by stream + buffer
├── auth/ — auth/pairing module
│ ├── tokens.ts — bearer-token CRUD (extends existing auth.ts)
│ ├── pairing.ts — pi-remote pair, QR rendering, exchange
│ └── tls.ts — self-signed cert generation + fingerprint
├── pi/ — adapter to pi ExtensionAPI
│ ├── events.ts — subscribe agent_start/end, tool_*, session_*
│ ├── commands.ts — pi.getCommands() wrapper
│ └── autoname.ts — S-09a, spawn pi -p subprocess
└── cli/ — CLI entrypoints (pi-remote attach/pair/auth/health)
  └── index.ts
```
`html.ts`, `messages.ts`, the existing `server.ts` and `config.ts` remain
for the legacy HTML client during the transition; they are tagged as
*legacy* in code comments. They will be retired after Phase 2 ships.
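
To illustrate how `sequence.ts` and `buffer/` in the tree above might cooperate, here is a rough in-memory stand-in for the disk ringbuffer. `ReplayBuffer`, `append`, and `since` are hypothetical names; the real modules write to disk and may be structured differently.

```typescript
// In-memory sketch of a capped replay buffer with monotonic chunk numbering.
interface Chunk {
  seq: number;
  data: Uint8Array;
}

class ReplayBuffer {
  private chunks: Chunk[] = [];
  private bytes = 0;
  private nextSeq = 0;
  private readonly capBytes: number;

  constructor(capBytes: number) {
    this.capBytes = capBytes;
  }

  // Assigns the next sequence number and evicts the oldest chunks once the
  // byte cap is exceeded (the newest chunk is always kept).
  append(data: Uint8Array): number {
    const seq = this.nextSeq++;
    this.chunks.push({ seq, data });
    this.bytes += data.length;
    while (this.bytes > this.capBytes && this.chunks.length > 1) {
      const dropped = this.chunks.shift();
      if (dropped) this.bytes -= dropped.data.length;
    }
    return seq;
  }

  // Returns every chunk after lastSeq, or null when part of that range has
  // already been evicted and the caller has to fall back to a snapshot.
  since(lastSeq: number | null): Chunk[] | null {
    const first = this.chunks[0];
    const oldest = first ? first.seq : this.nextSeq;
    const from = lastSeq === null ? 0 : lastSeq + 1;
    if (from < oldest) return null;
    return this.chunks.filter((c) => c.seq >= from);
  }
}
```

Returning `null` rather than a partial range keeps the "replay or snapshot" decision explicit for the stream route.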
## Task Breakdown
Tasks are numbered `T-1.<n>`. The "Parallel With" column shows which other
tasks can be in flight simultaneously without merge pain. The "Touches"
column lists the files an agent may modify.
| ID | Task | Touches | Depends on | Parallel With |
|---|---|---|---|---|
| T-1.0 | **Server refactor scaffold.** Carve `server.ts` into the `server/` and route modules above; existing HTML behaviour must still work; CI green. | `extensions/remote-control/server/**`, minimal edit of `index.ts` | — | none — must land first |
| T-1.1 | **tmux/manager + tmux/pipe + tmux/snapshot.** Spawn, list, kill, metadata via `@description`. Pipe-pane FIFO reader. Snapshot via `capture-pane`. | `tmux/**` | T-1.0 | T-1.2, T-1.3, T-1.4, T-1.5, T-1.6 |
| T-1.2 | **Sequence module + buffer/writer + buffer/reader.** Monotonic chunk IDs, disk ringbuffer with caps (100 MB/session, 1 GB global, free-space watchdog), idle cleanup. | `sequence.ts`, `buffer/**` | T-1.0 | T-1.1, T-1.3, T-1.4, T-1.5, T-1.6 |
| T-1.3 | **Auth: tokens + pairing + TLS.** Self-signed cert generation, fingerprint, bearer-token CRUD, `pi-remote pair` CLI + QR rendering, `pi-remote auth list/revoke/name`. | `auth/**`, `cli/index.ts` (subcommands only) | T-1.0 | T-1.1, T-1.2, T-1.4, T-1.5, T-1.6 |
| T-1.4 | **pi adapter.** Subscribe ExtensionAPI events, expose `getCommands`, implement `autoname.ts` spawning `pi -p`. | `pi/**`, edits in `index.ts` to wire subscriptions | T-1.0 | T-1.1, T-1.2, T-1.3, T-1.5, T-1.6 |
| T-1.5 | **Stream + input + snapshot routes (S-02/S-03/S-04/S-05).** WS upgrade routing, binary stream, sequence cursor resume, send-keys with bracketed-paste. | `server/routes/stream.ts`, `server/routes/input.ts`, `server/upgrade.ts` | T-1.0, T-1.1, T-1.2 | T-1.6, T-1.7 |
| T-1.6 | **Side-channel + commands + sessions routes (S-07/S-08/S-09).** | `server/routes/side.ts`, `server/routes/commands.ts`, `server/routes/sessions.ts` | T-1.0, T-1.1, T-1.4 | T-1.5, T-1.7 |
| T-1.7 | **Health endpoint + config + watchdog (S-12).** Disk watchdog ties buffer caps to global state. | `server/routes/health.ts`, new `config.toml` schema in `config.ts` | T-1.0, T-1.2 | T-1.5, T-1.6 |
| T-1.8 | **Integration smoke harness.** Node script under `scripts/smoke/` that spawns a sidecar, opens a stream, sends keys, drops + reconnects, verifies delta. | `scripts/smoke/**` | T-1.5, T-1.6 | none |
| T-1.9 | **Docs: operator guide.** README section "Running pi-remote as a sidecar", config sample, troubleshooting. | `README.md`, optionally `docs/reference/OPERATOR.md` | T-1.5, T-1.6, T-1.7 | T-1.8 |
| T-1.10 | **APNs scaffold (deferred but cheap).** `apns/` module: config schema, JWT generation, push primitive. Stub the device-token registry — flesh out in Phase 2 when iOS app provides tokens. | `apns/**`, edits in `auth/tokens.ts` to store device-tokens | T-1.3 | T-1.5..T-1.7 |
## Interface Contracts (lock early to enable parallelism)
These are the contracts that downstream tasks depend on. They must be
agreed and frozen at the start of Phase 1 — see `SYNC.md` for the freeze
protocol.
### IC-1 — WebSocket frames
```ts
// binary frame : 8-byte big-endian seq header, then raw ANSI stream bytes (output direction only).
// text frame : JSON, type-discriminated.
type ClientToServer =
| { type: "resume"; lastSeq: number | null }
| { type: "key"; name: string } // "escape" | "tab" | "up" | "down" | "left" | "right" | "enter" | "shift-enter"
| { type: "keys"; data: string } // literal text, sent via send-keys -l
| { type: "paste"; data: string } // wrapped in bracketed-paste
| { type: "snapshot-request" };
type ServerToClient =
| { type: "state"; value: "thinking" | "tool" | "idle" | "awaiting-input"; tool?: string; ts: number }
| { type: "tree"; nodes: TreeNode[]; current: string } // optional, read-only
| { type: "snapshot"; seq: number; data: string } // base64 ANSI snapshot
| { type: "session-meta"; name: string; description?: string; createdAt: string }
| { type: "error"; code: string; message: string };
```
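
Because text frames arrive as untrusted JSON, the `ClientToServer` union above probably needs a runtime check before dispatch. The following is a sketch only: `parseClientMessage` is a hypothetical name, and requiring `lastSeq` to be a non-negative integer is an assumption.

```typescript
// Runtime validation of incoming text frames against the IC-1 union.
type ClientToServer =
  | { type: "resume"; lastSeq: number | null }
  | { type: "key"; name: string }
  | { type: "keys"; data: string }
  | { type: "paste"; data: string }
  | { type: "snapshot-request" };

function parseClientMessage(text: string): ClientToServer | null {
  let raw: unknown;
  try {
    raw = JSON.parse(text);
  } catch {
    return null; // not JSON at all
  }
  if (typeof raw !== "object" || raw === null) return null;
  const { type, name, data, lastSeq } = raw as Record<string, unknown>;
  switch (type) {
    case "resume":
      if (lastSeq === null) return { type: "resume", lastSeq: null };
      // Assumption: sequence numbers are non-negative integers.
      if (typeof lastSeq === "number" && Number.isInteger(lastSeq) && lastSeq >= 0) {
        return { type: "resume", lastSeq };
      }
      return null;
    case "key":
      return typeof name === "string" ? { type: "key", name } : null;
    case "keys":
      return typeof data === "string" ? { type: "keys", data } : null;
    case "paste":
      return typeof data === "string" ? { type: "paste", data } : null;
    case "snapshot-request":
      return { type: "snapshot-request" };
    default:
      return null; // unknown discriminator
  }
}
```

A `null` result would typically be answered with an `error` frame instead of closing the socket, though that policy is not specified here.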
Each binary frame starts with an 8-byte big-endian header that carries its
`seq`; the stream bytes follow. Owner: T-1.5.
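
As a sketch of this framing, the helpers below write and read the 8-byte header as two big-endian 32-bit halves, which keeps `seq` a plain `number` as in IC-1. `encodeFrame` and `decodeFrame` are hypothetical names, and treating `seq` as unsigned and limited to the safe-integer range is an assumption.

```typescript
// Binary frame layout assumed here: [8-byte big-endian seq][payload bytes].
const SEQ_HEADER_BYTES = 8;
const TWO_POW_32 = 0x100000000;

function encodeFrame(seq: number, payload: Uint8Array): Uint8Array {
  if (!Number.isSafeInteger(seq) || seq < 0) {
    throw new RangeError("seq must be a non-negative safe integer");
  }
  const frame = new Uint8Array(SEQ_HEADER_BYTES + payload.length);
  const view = new DataView(frame.buffer);
  // DataView writes big-endian unless littleEndian is passed as true.
  view.setUint32(0, Math.floor(seq / TWO_POW_32)); // high 32 bits
  view.setUint32(4, seq % TWO_POW_32); // low 32 bits
  frame.set(payload, SEQ_HEADER_BYTES);
  return frame;
}

function decodeFrame(frame: Uint8Array): { seq: number; payload: Uint8Array } {
  if (frame.length < SEQ_HEADER_BYTES) {
    throw new RangeError("frame shorter than the seq header");
  }
  const view = new DataView(frame.buffer, frame.byteOffset, frame.byteLength);
  const seq = view.getUint32(0) * TWO_POW_32 + view.getUint32(4);
  if (!Number.isSafeInteger(seq)) {
    throw new RangeError("seq exceeds the safe integer range");
  }
  return { seq, payload: frame.subarray(SEQ_HEADER_BYTES) };
}
```

Splitting the header into two 32-bit words avoids `BigInt` on the hot path; a client that needs the full 64-bit range would read the same bytes with `getBigUint64`.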
### IC-2 — HTTP REST shape
```
GET /health → { ok, sessions, bufferBytes, ... }
POST /sessions → { id, name }
GET /sessions → [{ id, name, description, state, lastOutputAt }, …]
PATCH /sessions/:id → updates @description
DELETE /sessions/:id → kills tmux session, optionally clears buffer
GET /sessions/:id/commands → [{ name, description, args }]
GET /sessions/:id/thumbnail → text/plain capture-pane (40×12)
```
All endpoints require a bearer token. Responses are `application/json`
unless noted. Owner: T-1.5..T-1.7.
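
The bearer-token requirement could be enforced with a check along these lines. This is a sketch under assumptions: `extractBearer`, `tokenMatches`, and `isAuthorized` are hypothetical names, tokens are held in a plain list for illustration, and the standard `Authorization: Bearer <token>` header form is assumed.

```typescript
import { createHash, timingSafeEqual } from "node:crypto";

// Pulls the token out of an "Authorization: Bearer <token>" header value.
function extractBearer(header: string | undefined): string | null {
  if (!header) return null;
  const match = /^Bearer\s+(\S+)$/i.exec(header.trim());
  return match?.[1] ?? null;
}

// Hashing both sides first gives equal-length buffers, which
// timingSafeEqual requires, and avoids leaking the token length.
function tokenMatches(presented: string, stored: string): boolean {
  const a = createHash("sha256").update(presented).digest();
  const b = createHash("sha256").update(stored).digest();
  return timingSafeEqual(a, b);
}

function isAuthorized(
  header: string | undefined,
  validTokens: readonly string[],
): boolean {
  const presented = extractBearer(header);
  if (presented === null) return false;
  // Compare against every token without an early exit.
  let ok = false;
  for (const token of validTokens) {
    if (tokenMatches(presented, token)) ok = true;
  }
  return ok;
}
```

A route handler would pass `req.headers.authorization` and respond with 401 when the check fails.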
### IC-3 — Pairing payload
The QR code encodes a `pi-remote://` URL:
```
pi-remote://<host>:<port>?pair=<pairing-token>&fp=<sha256-hex>&name=<sidecar-name>
```
Pairing exchange: client `POST /pair` with `{ pairingToken, deviceToken?, environment?, deviceName? }` → server replies `{ bearerToken, sidecarId }`. Owner: T-1.3.
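
A sketch of building and parsing the pairing URL follows. `PairingPayload`, `buildPairingUrl`, and `parsePairingUrl` are hypothetical names; the parser assumes a hostname or IPv4 host (no bracketed IPv6), treats all three query parameters as required, and expects `fp` to be 64 hex characters.

```typescript
interface PairingPayload {
  host: string;
  port: number;
  pairingToken: string;
  fingerprint: string; // SHA-256 as lowercase hex
  name: string;
}

function buildPairingUrl(p: PairingPayload): string {
  const query = [
    `pair=${encodeURIComponent(p.pairingToken)}`,
    `fp=${encodeURIComponent(p.fingerprint)}`,
    `name=${encodeURIComponent(p.name)}`,
  ].join("&");
  return `pi-remote://${p.host}:${p.port}?${query}`;
}

function parsePairingUrl(text: string): PairingPayload | null {
  // Assumption: host is a hostname or IPv4 address, never bracketed IPv6.
  const m = /^pi-remote:\/\/([^/?#:@]+):(\d{1,5})\?([^#]*)$/.exec(text);
  if (!m) return null;
  const host = m[1] ?? "";
  const port = Number(m[2]);
  const params = new URLSearchParams(m[3] ?? "");
  const pairingToken = params.get("pair");
  const fingerprint = params.get("fp");
  const name = params.get("name");
  if (port < 1 || port > 65535) return null;
  if (!pairingToken || !name) return null;
  if (!fingerprint || !/^[0-9a-f]{64}$/i.test(fingerprint)) return null;
  return {
    host,
    port,
    pairingToken,
    fingerprint: fingerprint.toLowerCase(),
    name,
  };
}
```

Rejecting a malformed `fp` at parse time means the client never attempts a TLS connection it could not pin.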
### IC-4 — Config schema (TOML)
```toml
[server]
host = "0.0.0.0"
port = 7777
state_dir = "~/.local/share/pi-remote"
[buffer]
per_session_mb = 100
global_gb = 1
free_min_gb = 1
idle_days = 30
[tmux]
default_width = 120
default_height = 40
[apns]
team_id = "..."
key_id = "..."
key_path = "..."
bundle_id = "..."
[autoname]
enabled = true
trigger_after = 3 # user messages
model = "claude-haiku-4-5"
```
Owner: T-1.7.
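
Once the TOML is parsed (the parser itself is out of scope here), the schema could be mirrored in a typed object with defaults and a few sanity checks. This is a sketch: `SidecarConfig`, `withDefaults`, and `validateConfig` are hypothetical names, `[apns]` is omitted because its values are elided above, and the cross-check between `per_session_mb` and `global_gb` assumes 1 GB = 1024 MB.

```typescript
interface SidecarConfig {
  server: { host: string; port: number; state_dir: string };
  buffer: { per_session_mb: number; global_gb: number; free_min_gb: number; idle_days: number };
  tmux: { default_width: number; default_height: number };
  autoname: { enabled: boolean; trigger_after: number; model: string };
}

type PartialConfig = { [K in keyof SidecarConfig]?: Partial<SidecarConfig[K]> };

// Defaults copied from the TOML sample above.
const DEFAULTS: SidecarConfig = {
  server: { host: "0.0.0.0", port: 7777, state_dir: "~/.local/share/pi-remote" },
  buffer: { per_session_mb: 100, global_gb: 1, free_min_gb: 1, idle_days: 30 },
  tmux: { default_width: 120, default_height: 40 },
  autoname: { enabled: true, trigger_after: 3, model: "claude-haiku-4-5" },
};

// Section-wise merge: any key missing from the input keeps its default.
function withDefaults(input: PartialConfig): SidecarConfig {
  return {
    server: { ...DEFAULTS.server, ...input.server },
    buffer: { ...DEFAULTS.buffer, ...input.buffer },
    tmux: { ...DEFAULTS.tmux, ...input.tmux },
    autoname: { ...DEFAULTS.autoname, ...input.autoname },
  };
}

function validateConfig(cfg: SidecarConfig): string[] {
  const errors: string[] = [];
  const isPositiveInt = (n: number): boolean => Number.isInteger(n) && n > 0;
  if (!isPositiveInt(cfg.server.port) || cfg.server.port > 65535) {
    errors.push("server.port must be an integer in 1..65535");
  }
  if (!(cfg.buffer.per_session_mb > 0) || !(cfg.buffer.global_gb > 0)) {
    errors.push("buffer caps must be positive");
  }
  // Assumption: 1 GB = 1024 MB for this comparison.
  if (cfg.buffer.per_session_mb > cfg.buffer.global_gb * 1024) {
    errors.push("buffer.per_session_mb exceeds buffer.global_gb");
  }
  if (!isPositiveInt(cfg.tmux.default_width) || !isPositiveInt(cfg.tmux.default_height)) {
    errors.push("tmux dimensions must be positive integers");
  }
  if (!isPositiveInt(cfg.autoname.trigger_after)) {
    errors.push("autoname.trigger_after must be a positive integer");
  }
  return errors;
}
```

Returning a list of messages instead of throwing on the first problem lets `pi-remote health` or the startup log report every misconfiguration at once.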
## Branching Strategy
- Each task is a feature branch off `main`, named `feat/p1-<task-id>-<slug>`,
e.g. `feat/p1-t1-1-tmux-manager`.
- Open a PR as soon as a task is ready for review. Squash-merge.
- T-1.0 (refactor) lands first; T-1.1..T-1.4 can then run fully in parallel.
- T-1.5..T-1.7 each consume one or more of the lower-layer modules; each
starts as soon as its dependency PRs are in `main`.
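
The branch naming convention above can be derived mechanically, which may help agents that create branches from the task table. `branchName` is a hypothetical helper; the slug normalisation (lowercase, non-alphanumerics collapsed to `-`) is an assumption beyond the single example given.

```typescript
// Builds "feat/p1-<task-id>-<slug>" from a task ID such as "T-1.1".
function branchName(taskId: string, slug: string): string {
  const m = /^T-1\.(\d+)$/.exec(taskId);
  if (!m) throw new Error(`not a Phase 1 task ID: ${taskId}`);
  const cleanSlug = slug
    .toLowerCase()
    .replace(/[^a-z0-9]+/g, "-") // collapse anything else into dashes
    .replace(/^-+|-+$/g, ""); // trim leading/trailing dashes
  return `feat/p1-t1-${m[1]}-${cleanSlug}`;
}
```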
## Test Strategy
- Unit: per-module pure-logic tests under `extensions/remote-control/**/__tests__/`.
- Integration smoke: T-1.8 script, runnable locally and in CI.
- Manual: each task PR lists manual-verification steps.
- No iOS testing in this phase.
## Risks
- **R1.** Per-session buffer-cap accounting can race with the global
watchdog. Mitigation: serialise buffer writes through a single async queue
per session, and guard the global cap with a mutex.
- **R2.** ExtensionAPI event names might shift in future pi versions.
Mitigation: pin pi version range in `package.json`, isolate adapter in
`pi/events.ts`.
- **R3.** `pi -p` auto-name calls cost money. Mitigation: gate behind
`[autoname] enabled`, debounce, skip if user already named the session.
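
The R1 mitigation could look roughly like the sketch below: one promise-chain queue per session, plus a second queue used as a mutex around the global byte counter. `SerialQueue`, `CapLedger`, `queueFor`, and `appendChunk` are hypothetical names, and the `write` callback stands in for the real disk writer.

```typescript
// Runs async tasks strictly one after another.
class SerialQueue {
  private tail: Promise<void> = Promise.resolve();

  run<T>(task: () => Promise<T>): Promise<T> {
    const result = this.tail.then(task);
    // Keep the chain alive even when a task rejects.
    this.tail = result.then(() => undefined, () => undefined);
    return result;
  }
}

// Global cap accounting; the inner queue acts as the mutex, so the
// check-and-reserve step stays atomic even if it later awaits I/O.
class CapLedger {
  private used = 0;
  private readonly gate = new SerialQueue();
  private readonly capBytes: number;

  constructor(capBytes: number) {
    this.capBytes = capBytes;
  }

  reserve(bytes: number): Promise<boolean> {
    return this.gate.run(async () => {
      if (this.used + bytes > this.capBytes) return false;
      this.used += bytes;
      return true;
    });
  }

  release(bytes: number): Promise<void> {
    return this.gate.run(async () => {
      this.used = Math.max(0, this.used - bytes);
    });
  }

  get usedBytes(): number {
    return this.used;
  }
}

const sessionQueues = new Map<string, SerialQueue>();

function queueFor(sessionId: string): SerialQueue {
  let queue = sessionQueues.get(sessionId);
  if (!queue) {
    queue = new SerialQueue();
    sessionQueues.set(sessionId, queue);
  }
  return queue;
}

// Resolves false when the global cap would be exceeded; the reservation is
// handed back if the write itself fails.
function appendChunk(
  sessionId: string,
  data: Uint8Array,
  ledger: CapLedger,
  write: (data: Uint8Array) => Promise<void>,
): Promise<boolean> {
  return queueFor(sessionId).run(async () => {
    if (!(await ledger.reserve(data.length))) return false;
    try {
      await write(data);
      return true;
    } catch (err) {
      await ledger.release(data.length);
      throw err;
    }
  });
}
```

Writes for different sessions still run concurrently; only the cap check is funnelled through the shared gate, so the watchdog and the writers never see a half-updated total.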
## Exit / Handover
- All T-1.x merged.
- Smoke harness passes locally and in CI.
- Operator guide complete.
- A short `docs/reference/PHASE-1-report.md` summarising deviations from
the plan, especially anything that affects Phase 2 contracts.
- Update `SYNC.md` to unblock Phase 2.