# Phase 1 — Sidecar Production-Ready

> **Status:** blocked on Phase 0 verdict.
> **Owners:** parallelisable across multiple agents; see the task table.
> **Branch base:** `main` after Phase 0 merge. Feature branches per work
> stream (see `SYNC.md`).
> **Spec reference:** [`reference/SPEC-ios-app.md`](./reference/SPEC-ios-app.md) §4.

> **Streaming primitive (decided in Phase 0.5):** tmux **control mode**
> (`tmux -C attach`). Pane output is delivered via parsed `%output`
> events, an approach that is robust across alternate-screen transitions,
> unlike `pipe-pane` (which Phase 0 found unreliable). See
> `reference/PHASE-0.5-report.md` and the spike code in branch
> `feat/spike-tmux-cc`.

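To make the streaming primitive concrete, here is a minimal sketch of decoding a single control-mode `%output` line. It assumes the line has already been split off the control stream and has the shape `%output %<pane-id> <payload>`, with bytes below ASCII 32 and the backslash octal-escaped as `\NNN`. The names `parseOutputLine` and `decodeOctalEscapes` are hypothetical and are not taken from `spike-cc.ts`.

```typescript
import { Buffer } from "node:buffer";

const BACKSLASH = 0x5c;

// True for the ASCII digits '0'..'7'.
const isOctalDigit = (b: number | undefined): b is number =>
  b !== undefined && b >= 0x30 && b <= 0x37;

// Replace every `\NNN` octal escape with the byte it encodes.
// All other bytes (including raw UTF-8 bytes >= 128) pass through unchanged.
function decodeOctalEscapes(payload: Buffer): Buffer {
  const out = Buffer.alloc(payload.length); // decoded data is never longer
  let w = 0;
  for (let r = 0; r < payload.length; r++) {
    const b = payload[r];
    const d1 = payload[r + 1];
    const d2 = payload[r + 2];
    const d3 = payload[r + 3];
    if (b === BACKSLASH && isOctalDigit(d1) && isOctalDigit(d2) && isOctalDigit(d3)) {
      out[w++] = (((d1 - 0x30) << 6) | ((d2 - 0x30) << 3) | (d3 - 0x30)) & 0xff;
      r += 3; // skip the three digits just consumed
    } else {
      out[w++] = b;
    }
  }
  return out.subarray(0, w);
}

interface OutputEvent {
  paneId: string;
  data: Buffer;
}

// Parse one control-mode line; returns null for anything that is not `%output`.
function parseOutputLine(line: Buffer): OutputEvent | null {
  const prefix = "%output ";
  if (line.subarray(0, prefix.length).toString("latin1") !== prefix) return null;
  const sep = line.indexOf(0x20, prefix.length); // space after the pane id
  if (sep === -1) return null;
  return {
    paneId: line.subarray(prefix.length, sep).toString("latin1"),
    data: decodeOctalEscapes(line.subarray(sep + 1)),
  };
}
```

Working on `Buffer` rather than strings keeps multi-byte UTF-8 sequences intact when they are split across chunks.
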
## Goal

Phase 1 extends the `pi-remote-control` extension into a full sidecar that can
serve the iOS app. End state: a single Node process, started alongside pi
(or as a system service), that exposes a WebSocket API for:

- Stream attach/detach with reconnect.
- Send-keys input.
- Multi-session lifecycle (spawn, list, rename, kill).
- Snapshot and disk-buffered replay.
- State and slash-command-registry side-channel.
- QR-based pairing, bearer-token auth, self-signed TLS with pinning.
- Health endpoint.

After Phase 1 we can drive everything from `wscat` or a small Web UI.
The iOS app is **not** required to validate Phase 1.

## Acceptance Criteria

Each S-feature listed below must be implemented, manually exercised, and
covered by a basic test (smoke test minimum). In addition:

- `pi-remote pair` prints a working QR code.
- Two parallel sessions can be spawned, switched between, and one can be
  killed without disturbing the other.
- WebSocket-level integration smoke test: a script that opens a stream,
  sends keys, receives output, drops the connection, reconnects with
  `lastSeq`, and observes a clean delta.
- `wss://` works against the self-signed cert; the fingerprint matches the
  QR contents.
- Sidecar survives restart and reattaches to all existing tmux sessions
  without losing state.

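One possible reading of "clean delta" in the smoke-test criterion is that the sequence numbers received after a reconnect continue exactly where `lastSeq` left off, with no gaps and no duplicates. The sketch below encodes that reading. It assumes chunk numbers increase by exactly one per chunk, which this plan does not state explicitly, and the helper name `isCleanDelta` is hypothetical.

```typescript
// Returns true when `received` continues directly after `lastSeq`
// with consecutive sequence numbers (assumption: seq increments by 1).
function isCleanDelta(lastSeq: number | null, received: readonly number[]): boolean {
  // With no cursor (first attach) any starting point is acceptable.
  let expected: number | null = lastSeq === null ? null : lastSeq + 1;
  for (const seq of received) {
    if (expected !== null && seq !== expected) return false; // gap or duplicate
    expected = seq + 1;
  }
  return true;
}
```

A smoke script could collect the `seq` of every binary frame after sending `resume` and assert on this predicate.
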
## Architecture Sketch

```
extensions/remote-control/
├── index.ts              — extension entry point (existing, extended)
├── server/               — NEW: HTTP/WS server, split into route modules
│   ├── server.ts         — bootstrap, TLS, middleware
│   ├── routes/
│   │   ├── stream.ts     — S-02 binary stream + S-04 sequence + S-05 snapshot
│   │   ├── input.ts      — S-03 send-keys
│   │   ├── sessions.ts   — S-09 multi-session CRUD
│   │   ├── commands.ts   — S-08 slash-command registry
│   │   ├── side.ts       — S-07 state side-channel
│   │   └── health.ts     — S-12 health
│   └── upgrade.ts        — WS upgrade routing per session/topic
├── tmux/                 — NEW: tmux wrapper (control-mode client)
│   ├── manager.ts        — spawn/list/kill sessions, metadata via @options
│   ├── control.ts        — `tmux -C` control-mode client, %output parser, byte streaming
│   ├── input.ts          — send-keys translation (key names → tmux send-keys)
│   └── snapshot.ts       — capture-pane wrapper
├── buffer/               — NEW: disk ringbuffer per session
│   ├── writer.ts         — append, cap enforcement, watchdog
│   └── reader.ts         — range read for snapshot fallback
├── sequence.ts           — NEW: monotonic chunk numbering shared by stream + buffer
├── auth/                 — auth/pairing module
│   ├── tokens.ts         — bearer-token CRUD (extends existing auth.ts)
│   ├── pairing.ts        — pi-remote pair, QR rendering, exchange
│   └── tls.ts            — self-signed cert generation + fingerprint
├── pi/                   — adapter to pi ExtensionAPI
│   ├── events.ts         — subscribe agent_start/end, tool_*, session_*
│   ├── commands.ts       — pi.getCommands() wrapper
│   └── autoname.ts       — S-09a, spawn pi -p subprocess
└── cli/                  — CLI entrypoints (pi-remote attach/pair/auth/health)
    └── index.ts
```

`html.ts`, `messages.ts`, and the existing `server.ts` and `config.ts` remain
in place for the legacy HTML client during the transition. They are tagged as
*legacy* in code comments and will be retired after Phase 2 ships.

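The `tmux/input.ts` entry in the tree describes a translation from key names to `tmux send-keys` invocations. A minimal sketch of that mapping follows, assuming the sidecar shells out to `tmux` with an argument array. The function name `sendKeysArgs` and the table `TMUX_KEY_NAMES` are hypothetical, and `shift-enter` is deliberately left unmapped because this plan does not say how it should be sent.

```typescript
// Key names accepted on the wire, mapped to tmux key names.
// A Map avoids accidental hits on Object.prototype members such as "constructor".
const TMUX_KEY_NAMES = new Map<string, string>([
  ["escape", "Escape"],
  ["tab", "Tab"],
  ["up", "Up"],
  ["down", "Down"],
  ["left", "Left"],
  ["right", "Right"],
  ["enter", "Enter"],
]);

type InputMessage =
  | { type: "key"; name: string }
  | { type: "keys"; data: string };

// Build the argv for `tmux` (without the leading "tmux").
function sendKeysArgs(target: string, msg: InputMessage): string[] {
  if (msg.type === "keys") {
    // -l sends the text literally; "--" stops option parsing so that
    // text beginning with "-" is not mistaken for a flag.
    return ["send-keys", "-t", target, "-l", "--", msg.data];
  }
  const key = TMUX_KEY_NAMES.get(msg.name);
  if (key === undefined) {
    throw new Error(`unmapped key name: ${msg.name}`);
  }
  return ["send-keys", "-t", target, key];
}
```

Passing the result to `execFile("tmux", args)` rather than building a shell string avoids quoting problems with user-supplied text.
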
## Task Breakdown

Tasks are numbered `T-1.<n>`. The "Parallel With" column shows which other
tasks can be in flight simultaneously without merge pain. The "Touches"
column lists the files an agent may modify.

| ID | Task | Touches | Depends on | Parallel With |
|---|---|---|---|---|
| T-1.0 | **Server refactor scaffold.** Carve `server.ts` into the `server/` and route modules above; existing HTML behaviour must still work; CI green. | `extensions/remote-control/server/**`, minimal edit of `index.ts` | none | none (must land first) |
| T-1.1 | **tmux/manager + tmux/control + tmux/snapshot.** Spawn, list, kill, metadata via `@description`. **Control-mode client** (`tmux -C attach`), `%output` parser with octal-escape decoder, broadcast bytes to subscribers. Snapshot via `capture-pane`. Reference: `feat/spike-tmux-cc` branch (`spike-cc.ts`). | `tmux/**` | T-1.0 | T-1.2, T-1.3, T-1.4, T-1.5, T-1.6 |
| T-1.2 | **Sequence module + buffer/writer + buffer/reader.** Monotonic chunk IDs, disk ringbuffer with caps (100 MB/session, 1 GB global, free-space watchdog), idle-cleanup. | `sequence.ts`, `buffer/**` | T-1.0 | T-1.1, T-1.3, T-1.4, T-1.5, T-1.6 |
| T-1.3 | **Auth: tokens + pairing + TLS.** Self-signed cert generation, fingerprint, bearer-token CRUD, `pi-remote pair` CLI + QR rendering, `pi-remote auth list/revoke/name`. | `auth/**`, `cli/index.ts` (subcommands only) | T-1.0 | T-1.1, T-1.2, T-1.4, T-1.5, T-1.6 |
| T-1.4 | **pi adapter.** Subscribe ExtensionAPI events, expose `getCommands`, implement `autoname.ts` spawning `pi -p`. | `pi/**`, edits in `index.ts` to wire subscriptions | T-1.0 | T-1.1, T-1.2, T-1.3, T-1.5, T-1.6 |
| T-1.5 | **Stream + input + snapshot routes (S-02/S-03/S-04/S-05).** WS upgrade routing, binary stream, sequence cursor resume, send-keys with bracketed-paste. | `server/routes/stream.ts`, `server/routes/input.ts`, `server/upgrade.ts` | T-1.0, T-1.1, T-1.2 | T-1.6, T-1.7 |
| T-1.6 | **Side-channel + commands + sessions routes (S-07/S-08/S-09).** | `server/routes/side.ts`, `server/routes/commands.ts`, `server/routes/sessions.ts` | T-1.0, T-1.1, T-1.4 | T-1.5, T-1.7 |
| T-1.7 | **Health endpoint + config + watchdog (S-12).** Disk watchdog ties buffer caps to global state. | `server/routes/health.ts`, new `config.toml` schema in `config.ts` | T-1.0, T-1.2 | T-1.5, T-1.6 |
| T-1.8 | **Integration smoke harness.** Node script under `scripts/smoke/` that spawns a sidecar, opens a stream, sends keys, drops + reconnects, verifies delta. | `scripts/smoke/**` | T-1.5, T-1.6 | none |
| T-1.9 | **Docs: operator guide.** README section "Running pi-remote as a sidecar", config sample, troubleshooting. | `README.md`, optionally `docs/reference/OPERATOR.md` | T-1.5, T-1.6, T-1.7 | parallel with T-1.8 |
| T-1.10 | **APNs scaffold (deferred but cheap).** `apns/` module: config schema, JWT generation, push primitive. Stub the device-token registry and flesh it out in Phase 2 when the iOS app provides tokens. | `apns/**`, edits in `auth/tokens.ts` to store device-tokens | T-1.3 | T-1.5..T-1.7 |

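As an illustration of what T-1.2 has to provide, the sketch below models a per-session ring of numbered chunks with a byte cap and a cursor-based read. It is an in-memory stand-in, not the disk-backed implementation, and the class name `SessionRing` and its methods are hypothetical. Returning `null` from `since` is one assumed way to signal that the requested range has been evicted and the caller should fall back to a snapshot.

```typescript
interface Chunk {
  seq: number;
  data: Uint8Array;
}

class SessionRing {
  private chunks: Chunk[] = [];
  private bytes = 0;
  private nextSeq = 0;
  private readonly capBytes: number;

  constructor(capBytes: number) {
    this.capBytes = capBytes;
  }

  // Append a chunk, assign it the next sequence number, and evict the
  // oldest chunks while the cap is exceeded (the newest chunk is always kept).
  append(data: Uint8Array): number {
    const seq = this.nextSeq++;
    this.chunks.push({ seq, data });
    this.bytes += data.length;
    while (this.bytes > this.capBytes && this.chunks.length > 1) {
      const evicted = this.chunks.shift() as Chunk;
      this.bytes -= evicted.data.length;
    }
    return seq;
  }

  // Chunks newer than lastSeq, or null when part of that range was evicted.
  since(lastSeq: number): Chunk[] | null {
    const oldest = this.chunks[0];
    if (oldest !== undefined && lastSeq + 1 < oldest.seq) return null;
    return this.chunks.filter((c) => c.seq > lastSeq);
  }
}
```

Keeping sequence assignment inside `append` means the stream route and the buffer can never disagree about which number a chunk received.
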
## Interface Contracts (lock early to enable parallelism)

These are the contracts that downstream tasks depend on. They must be
agreed and frozen at the start of Phase 1; see `SYNC.md` for the freeze
protocol.

### IC-1 — WebSocket frames

```ts
// binary frame : raw ANSI stream bytes (output direction only).
// text frame   : JSON, type-discriminated.

type ClientToServer =
  | { type: "resume"; lastSeq: number | null }
  | { type: "key"; name: string }   // "escape" | "tab" | "up" | "down" | "left" | "right" | "enter" | "shift-enter"
  | { type: "keys"; data: string }  // literal text, sent via send-keys -l
  | { type: "paste"; data: string } // wrapped in bracketed-paste
  | { type: "snapshot-request" };

type ServerToClient =
  | { type: "state"; value: "thinking" | "tool" | "idle" | "awaiting-input"; tool?: string; ts: number }
  | { type: "tree"; nodes: TreeNode[]; current: string } // optional, read-only
  | { type: "snapshot"; seq: number; data: string }      // base64 ANSI snapshot
  | { type: "session-meta"; name: string; description?: string; createdAt: string }
  | { type: "error"; code: string; message: string };
```

Each binary frame carries its `seq` outside the ANSI payload, in a leading
8-byte big-endian header. Owner: T-1.5.

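The binary frame layout can be sketched as a pair of helpers. This assumes `seq` is an unsigned integer that fits in a JavaScript safe integer and that the header is immediately followed by the raw payload; the names `encodeFrame` and `decodeFrame` are hypothetical.

```typescript
import { Buffer } from "node:buffer";

const HEADER_BYTES = 8;
const TWO_POW_32 = 0x100000000;

// Prefix the payload with seq as an 8-byte big-endian unsigned integer.
function encodeFrame(seq: number, payload: Uint8Array): Buffer {
  if (!Number.isSafeInteger(seq) || seq < 0) {
    throw new RangeError(`seq out of range: ${seq}`);
  }
  const frame = Buffer.alloc(HEADER_BYTES + payload.length);
  frame.writeUInt32BE(Math.floor(seq / TWO_POW_32), 0); // high 32 bits
  frame.writeUInt32BE(seq >>> 0, 4);                    // low 32 bits
  frame.set(payload, HEADER_BYTES);
  return frame;
}

// Split a received frame back into seq and payload (payload is a view, not a copy).
function decodeFrame(frame: Buffer): { seq: number; payload: Buffer } {
  if (frame.length < HEADER_BYTES) {
    throw new RangeError("frame shorter than the 8-byte header");
  }
  const seq = frame.readUInt32BE(0) * TWO_POW_32 + frame.readUInt32BE(4);
  if (!Number.isSafeInteger(seq)) {
    throw new RangeError("seq exceeds the safe integer range");
  }
  return { seq, payload: frame.subarray(HEADER_BYTES) };
}
```

Writing the header as two 32-bit halves avoids a dependency on `BigInt` while still covering sequence numbers up to 2^53 - 1.
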
### IC-2 — HTTP REST shape

```
GET    /health                  → { ok, sessions, bufferBytes, ... }
POST   /sessions                → { id, name }
GET    /sessions                → [{ id, name, description, state, lastOutputAt }, …]
PATCH  /sessions/:id            → updates @description
DELETE /sessions/:id            → kills tmux session, optionally clears buffer
GET    /sessions/:id/commands   → [{ name, description, args }]
GET    /sessions/:id/thumbnail  → text/plain capture-pane (40×12)
```

All endpoints require a bearer token; all responses are `application/json`
unless noted. Owner: T-1.5..T-1.7.

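A minimal sketch of the bearer-token gate for these endpoints is shown below. It assumes tokens arrive in the standard `Authorization: Bearer <token>` header and that the set of valid tokens is available in memory; the helper names `extractBearer`, `tokenMatches`, and `isAuthorized` are hypothetical and say nothing about how `auth/tokens.ts` actually stores tokens.

```typescript
import { createHash, timingSafeEqual } from "node:crypto";

// Pull the token out of an Authorization header, or return null if malformed.
function extractBearer(header: string | undefined): string | null {
  if (header === undefined) return null;
  const match = /^Bearer ([A-Za-z0-9\-._~+\/]+=*)$/i.exec(header);
  return match ? match[1] : null;
}

// Compare two tokens in constant time. Hashing first gives timingSafeEqual
// the equal-length inputs it requires, regardless of token length.
function tokenMatches(presented: string, stored: string): boolean {
  const a = createHash("sha256").update(presented).digest();
  const b = createHash("sha256").update(stored).digest();
  return timingSafeEqual(a, b);
}

// True when the header carries one of the valid tokens.
function isAuthorized(header: string | undefined, validTokens: readonly string[]): boolean {
  const token = extractBearer(header);
  if (token === null) return false;
  let ok = false;
  for (const candidate of validTokens) {
    // No early exit, so timing does not reveal which entry matched.
    if (tokenMatches(token, candidate)) ok = true;
  }
  return ok;
}
```

In a route handler this would typically run before any session lookup, answering `401` when it returns `false`.
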
### IC-3 — Pairing payload

The QR code encodes a `pi-remote://` URL:

```
pi-remote://<host>:<port>?pair=<pairing-token>&fp=<sha256-hex>&name=<sidecar-name>
```

Pairing exchange: the client sends `POST /pair` with `{ pairingToken, deviceToken?, environment?, deviceName? }`; the server replies with `{ bearerToken, sidecarId }`. Owner: T-1.3.

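Building and parsing the pairing URL could look roughly like the following. The sketch assumes `fp` is a 64-character hex SHA-256 digest, that all three query parameters are required, and that `host` is a hostname or IPv4 address (an IPv6 literal would need brackets). The interface `PairingPayload` and both function names are hypothetical.

```typescript
import { URL, URLSearchParams } from "node:url";

interface PairingPayload {
  host: string;
  port: number;
  pairingToken: string;
  fingerprint: string; // SHA-256 of the certificate, lowercase hex
  name: string;
}

// Serialise the payload into the pi-remote:// URL that the QR code carries.
function buildPairingUrl(p: PairingPayload): string {
  const query = new URLSearchParams({
    pair: p.pairingToken,
    fp: p.fingerprint,
    name: p.name,
  });
  return `pi-remote://${p.host}:${p.port}?${query.toString()}`;
}

// Parse and validate a scanned URL; throws on anything malformed.
function parsePairingUrl(raw: string): PairingPayload {
  const url = new URL(raw);
  if (url.protocol !== "pi-remote:") throw new Error("unexpected scheme");
  const port = Number(url.port);
  if (!Number.isInteger(port) || port < 1 || port > 65535) throw new Error("invalid port");
  const pairingToken = url.searchParams.get("pair");
  const fingerprint = url.searchParams.get("fp");
  const name = url.searchParams.get("name");
  if (!pairingToken || !name) throw new Error("missing pair or name");
  if (!fingerprint || !/^[0-9a-f]{64}$/i.test(fingerprint)) {
    throw new Error("fingerprint is not a SHA-256 hex digest");
  }
  return {
    host: url.hostname,
    port,
    pairingToken,
    fingerprint: fingerprint.toLowerCase(),
    name,
  };
}
```

Validating the fingerprint shape at parse time lets the client reject a damaged QR code before it attempts a TLS connection.
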
### IC-4 — Config schema (TOML)

```toml
[server]
host = "0.0.0.0"
port = 7777
state_dir = "~/.local/share/pi-remote"

[buffer]
per_session_mb = 100
global_gb = 1
free_min_gb = 1
idle_days = 30

[tmux]
default_width = 120
default_height = 40

[apns]
team_id = "..."
key_id = "..."
key_path = "..."
bundle_id = "..."

[autoname]
enabled = true
trigger_after = 3 # user messages
model = "claude-haiku-4-5"
```

Owner: T-1.7.

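Once the TOML is parsed, the `[buffer]` table has to be turned into byte and millisecond limits. The sketch below shows one way to apply the defaults listed above and validate overrides. It assumes binary units (1 MB = 1024 × 1024 bytes), which the schema does not specify, and the names `BufferConfig` and `resolveBufferLimits` are hypothetical. TOML parsing itself is out of scope here.

```typescript
interface BufferConfig {
  per_session_mb: number;
  global_gb: number;
  free_min_gb: number;
  idle_days: number;
}

// Defaults copied from the [buffer] table of the config schema.
const DEFAULT_BUFFER: BufferConfig = {
  per_session_mb: 100,
  global_gb: 1,
  free_min_gb: 1,
  idle_days: 30,
};

const MIB = 1024 * 1024; // assumption: binary units
const GIB = 1024 * MIB;
const DAY_MS = 24 * 60 * 60 * 1000;

interface BufferLimits {
  perSessionBytes: number;
  globalBytes: number;
  freeMinBytes: number;
  idleMs: number;
}

// Merge overrides onto the defaults, reject non-positive values,
// and convert everything to bytes and milliseconds.
function resolveBufferLimits(overrides: Partial<BufferConfig> = {}): BufferLimits {
  const cfg: BufferConfig = { ...DEFAULT_BUFFER, ...overrides };
  for (const key of Object.keys(cfg) as (keyof BufferConfig)[]) {
    const value = cfg[key];
    if (typeof value !== "number" || !Number.isFinite(value) || value <= 0) {
      throw new RangeError(`[buffer] ${key} must be a positive number`);
    }
  }
  return {
    perSessionBytes: cfg.per_session_mb * MIB,
    globalBytes: cfg.global_gb * GIB,
    freeMinBytes: cfg.free_min_gb * GIB,
    idleMs: cfg.idle_days * DAY_MS,
  };
}
```
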
## Branching Strategy

- Each task is a feature branch off `main`, named `feat/p1-<task-id>-<slug>`,
  e.g. `feat/p1-t1-1-tmux-manager`.
- Open a PR as soon as a task is ready for review. Squash-merge.
- T-1.0 (refactor) lands first; T-1.1..T-1.4 can then run fully in parallel.
- T-1.5..T-1.7 each consume one or more of the lower-layer modules; they
  start as soon as the dependency PR is in `main`.

## Test Strategy

- Unit: per-module pure-logic tests under `extensions/remote-control/**/__tests__/`.
- Integration smoke: T-1.8 script, runnable locally and in CI.
- Manual: each task PR lists manual-verification steps.
- No iOS testing in this phase.

## Risks

- **R1.** Per-session buffer-cap accounting can race with the global
  watchdog. Mitigation: serialise buffer writes through a single async
  queue per session and guard the global cap with a mutex.
- **R2.** ExtensionAPI event names might shift in future pi versions.
  Mitigation: pin the pi version range in `package.json` and isolate the
  adapter in `pi/events.ts`.
- **R3.** `pi -p` auto-name calls cost money. Mitigation: gate behind
  `[autoname] enabled`, debounce, and skip if the user has already named
  the session.
- **R4.** The tmux control-mode protocol is text-framed; binary pane bytes
  are octal-escaped (`\NNN`). The parser must handle high-throughput bursts
  (~50 fps during tool output). Mitigation: a streaming line parser with no
  full-buffer copies; per-line decode allocates only the escaped payload.
  Reference decode in `spike-cc.ts`.
- **R5.** tmux version requirement. Control mode is stable from tmux 2.0;
  modern features (e.g. `pane-died` event) need 2.5+. Mitigation:
  `tmux/manager.ts` checks `tmux -V` at startup and refuses to run on < 2.5
  with a clear error.

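The R5 mitigation can be sketched as a small startup check. It assumes `tmux -V` prints something like `tmux 3.3a` and treats output without a numeric version (for example a development build) as unsupported. The function names are hypothetical and only illustrate what `tmux/manager.ts` might do.

```typescript
import { execFileSync } from "node:child_process";

interface TmuxVersion {
  major: number;
  minor: number;
}

// Extract the first "<major>.<minor>" pair from `tmux -V` output.
function parseTmuxVersion(output: string): TmuxVersion | null {
  const match = /(\d+)\.(\d+)/.exec(output);
  return match ? { major: Number(match[1]), minor: Number(match[2]) } : null;
}

// True when `found` is at least `min` (default 2.5, per R5).
function isSupported(found: TmuxVersion, min: TmuxVersion = { major: 2, minor: 5 }): boolean {
  return found.major > min.major || (found.major === min.major && found.minor >= min.minor);
}

// Run `tmux -V` and throw a descriptive error when the version is too old
// or cannot be determined.
function assertTmuxSupported(): void {
  const output = execFileSync("tmux", ["-V"], { encoding: "utf8" });
  const version = parseTmuxVersion(output);
  if (version === null || !isSupported(version)) {
    throw new Error(`tmux >= 2.5 is required, found: ${output.trim()}`);
  }
}
```

Keeping the parsing separate from the `execFileSync` call makes the version logic testable without tmux installed.
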
## Exit / Handover

- All T-1.x merged.
- Smoke harness passes locally and in CI.
- Operator guide complete.
- A short `docs/reference/PHASE-1-report.md` summarises deviations from
  the plan, especially anything that affects Phase 2 contracts.
- `SYNC.md` updated to unblock Phase 2.