Phase 1 — Sidecar Production-Ready
Status: blocked on Phase 0 verdict.
Owners: parallelisable across multiple agents (see task table).
Branch base: main after Phase 0 merge. Feature branches per work stream (see SYNC.md).
Spec reference: reference/SPEC-ios-app.md §4.
Streaming primitive (decided in Phase 0.5): tmux control mode (tmux -C attach). Pane output is delivered via parsed %output events, which is robust across alternate-screen transitions, unlike pipe-pane (which Phase 0 found unreliable). See reference/PHASE-0.5-report.md and the spike code in branch feat/spike-tmux-cc.
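To make the %output parsing concrete, here is a minimal sketch of a per-line decoder. It assumes each control-mode notification arrives as one complete line of bytes (newline already stripped) in the form `%output %<pane-id> <data>`, with bytes below 0x20 and the backslash written as three-digit octal escapes. The names `OutputEvent`, `parseOutputLine`, `decodeOctalEscapes` and `asciiBytes` are illustrative and are not taken from spike-cc.ts.

```typescript
// Sketch only: decodes one control-mode line of the form
//   %output %<pane-id> <escaped-bytes>
// Assumption: the caller has already split the stream into lines.

interface OutputEvent {
  paneId: string;   // e.g. "%1"
  data: Uint8Array; // decoded pane bytes
}

const BACKSLASH = 0x5c;
const SPACE = 0x20;
const OUTPUT_PREFIX = "%output ";

// Helper for examples and tests: ASCII string -> bytes.
function asciiBytes(s: string): Uint8Array {
  const out = new Uint8Array(s.length);
  for (let i = 0; i < s.length; i++) out[i] = s.charCodeAt(i) & 0xff;
  return out;
}

function isOctalDigit(b: number): boolean {
  return b >= 0x30 && b <= 0x37;
}

// Replace every \NNN (three octal digits) with the byte it encodes;
// all other bytes pass through unchanged.
function decodeOctalEscapes(raw: Uint8Array): Uint8Array {
  const out = new Uint8Array(raw.length); // decoded form is never longer
  let w = 0;
  for (let r = 0; r < raw.length; r++) {
    const b = raw[r];
    if (
      b === BACKSLASH &&
      r + 3 < raw.length &&
      isOctalDigit(raw[r + 1]) &&
      isOctalDigit(raw[r + 2]) &&
      isOctalDigit(raw[r + 3])
    ) {
      out[w++] =
        (((raw[r + 1] - 0x30) << 6) |
          ((raw[r + 2] - 0x30) << 3) |
          (raw[r + 3] - 0x30)) &
        0xff;
      r += 3; // skip the three digits just consumed
    } else {
      out[w++] = b;
    }
  }
  return out.subarray(0, w); // view, not a copy
}

// Returns null for any line that is not an %output notification.
function parseOutputLine(line: Uint8Array): OutputEvent | null {
  for (let i = 0; i < OUTPUT_PREFIX.length; i++) {
    if (line[i] !== OUTPUT_PREFIX.charCodeAt(i)) return null;
  }
  const idStart = OUTPUT_PREFIX.length;
  const idEnd = line.indexOf(SPACE, idStart);
  if (idEnd === -1) return null;
  let paneId = "";
  for (let i = idStart; i < idEnd; i++) paneId += String.fromCharCode(line[i]);
  return { paneId, data: decodeOctalEscapes(line.subarray(idEnd + 1)) };
}
```

Working on bytes rather than strings keeps any multi-byte UTF-8 sequences in the pane output intact, since only the escaped bytes are rewritten.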
Goal
The pi-remote-control extension is extended into a full sidecar that can
serve the iOS app. End state: a single Node process, started alongside pi
(or as a system service), that exposes a WebSocket API for:
- Stream attach/detach with reconnect.
- Send-keys input.
- Multi-session lifecycle (spawn, list, rename, kill).
- Snapshot, disk-buffered replay.
- State, slash-command-registry side-channel.
- QR-based pairing, bearer-token auth, self-signed TLS with pinning.
- Health endpoint.
After Phase 1 we can drive everything from wscat or a small Web UI.
The iOS app is not required to validate Phase 1.
Acceptance Criteria
For each S-feature listed below: implemented, manually exercised, basic test (smoke test minimum). Plus:
- pi-remote pair prints a working QR.
- Two parallel sessions can be spawned, switched between, and one can be killed without disturbing the other.
- WebSocket-level integration smoke test: a script that opens a stream, sends keys, receives output, drops the connection, reconnects with lastSeq, observes a clean delta.
- wss:// works against the self-signed cert; the fingerprint matches the QR contents.
- Sidecar survives restart and reattaches to all existing tmux sessions without losing state.
Architecture Sketch
extensions/remote-control/
├── index.ts — extension entry point (existing, extended)
├── server/ — NEW: HTTP/WS server, split into route modules
│ ├── server.ts — bootstrap, TLS, middleware
│ ├── routes/
│ │ ├── stream.ts — S-02 binary stream + S-04 sequence + S-05 snapshot
│ │ ├── input.ts — S-03 send-keys
│ │ ├── sessions.ts — S-09 multi-session CRUD
│ │ ├── commands.ts — S-08 slash-command registry
│ │ ├── side.ts — S-07 state side-channel
│ │ └── health.ts — S-12 health
│ └── upgrade.ts — WS upgrade routing per session/topic
├── tmux/ — NEW: tmux wrapper (control-mode client)
│ ├── manager.ts — spawn/list/kill sessions, metadata via @options
│ ├── control.ts — `tmux -C` control-mode client, %output parser, byte streaming
│ ├── input.ts — send-keys translation (key names → tmux send-keys)
│ └── snapshot.ts — capture-pane wrapper
├── buffer/ — NEW: disk ringbuffer per session
│ ├── writer.ts — append, cap enforcement, watchdog
│ └── reader.ts — range read for snapshot fallback
├── sequence.ts — NEW: monotonic chunk numbering shared by stream + buffer
├── auth/ — auth/pairing module
│ ├── tokens.ts — bearer-token CRUD (extends existing auth.ts)
│ ├── pairing.ts — pi-remote pair, QR rendering, exchange
│ └── tls.ts — self-signed cert generation + fingerprint
├── pi/ — adapter to pi ExtensionAPI
│ ├── events.ts — subscribe agent_start/end, tool_*, session_*
│ ├── commands.ts — pi.getCommands() wrapper
│ └── autoname.ts — S-09a, spawn pi -p subprocess
└── cli/ — CLI entrypoints (pi-remote attach/pair/auth/health)
└── index.ts
html.ts, messages.ts, the existing server.ts and config.ts remain
for the legacy HTML client during the transition; they are tagged as
legacy in code comments. They will be retired after Phase 2 ships.
Task Breakdown
Tasks are numbered T-1.<n>. The "Parallel With" column shows which other
tasks can be in flight simultaneously without merge pain. The "Touches"
column lists the files an agent may modify.
| ID | Task | Touches | Depends on | Parallel With |
|---|---|---|---|---|
| T-1.0 | Server refactor scaffold. Carve server.ts into the server/ and route modules above; existing HTML behaviour must still work; CI green. | extensions/remote-control/server/**, minimal edit of index.ts | none | none (must land first) |
| T-1.1 | tmux/manager + tmux/control + tmux/snapshot. Spawn, list, kill, metadata via @description. Control-mode client (tmux -C attach), %output parser with octal-escape decoder, broadcast bytes to subscribers. Snapshot via capture-pane. Reference: feat/spike-tmux-cc branch (spike-cc.ts). | tmux/** | T-1.0 | T-1.2, T-1.3, T-1.4, T-1.5, T-1.6 |
| T-1.2 | Sequence module + buffer/writer + buffer/reader. Monotone chunk IDs, disk ringbuffer with caps (100MB/session, 1GB global, free-space watchdog), idle-cleanup. | sequence.ts, buffer/** | T-1.0 | T-1.1, T-1.3, T-1.4, T-1.5, T-1.6 |
| T-1.3 | Auth: tokens + pairing + TLS. Self-signed cert generation, fingerprint, bearer-token CRUD, pi-remote pair CLI + QR rendering, pi-remote auth list/revoke/name. | auth/**, cli/index.ts (subcommands only) | T-1.0 | T-1.1, T-1.2, T-1.4, T-1.5, T-1.6 |
| T-1.4 | pi adapter. Subscribe ExtensionAPI events, expose getCommands, implement autoname.ts spawning pi -p. | pi/**, edits in index.ts to wire subscriptions | T-1.0 | T-1.1, T-1.2, T-1.3, T-1.5, T-1.6 |
| T-1.5 | Stream + input + snapshot routes (S-02/S-03/S-04/S-05). WS upgrade routing, binary stream, sequence cursor resume, send-keys with bracketed-paste. | server/routes/stream.ts, server/routes/input.ts, server/upgrade.ts | T-1.0, T-1.1, T-1.2 | T-1.6, T-1.7 |
| T-1.6 | Side-channel + commands + sessions routes (S-07/S-08/S-09). | server/routes/side.ts, server/routes/commands.ts, server/routes/sessions.ts | T-1.0, T-1.1, T-1.4 | T-1.5, T-1.7 |
| T-1.7 | Health endpoint + config + watchdog (S-12). Disk watchdog ties buffer caps to global state. | server/routes/health.ts, new config.toml schema in config.ts | T-1.0, T-1.2 | T-1.5, T-1.6 |
| T-1.8 | Integration smoke harness. Node script under scripts/smoke/ that spawns a sidecar, opens a stream, sends keys, drops + reconnects, verifies delta. | scripts/smoke/** | T-1.5, T-1.6 | none |
| T-1.9 | Docs: operator guide. README section "Running pi-remote as a sidecar", config sample, troubleshooting. | README.md, optionally docs/reference/OPERATOR.md | T-1.5, T-1.6, T-1.7 | parallel with T-1.8 |
| T-1.10 | APNs scaffold (deferred but cheap). apns/ module: config schema, JWT generation, push primitive. Stub the device-token registry; flesh out in Phase 2 when iOS app provides tokens. | apns/**, edits in auth/tokens.ts to store device-tokens | T-1.3 | T-1.5..T-1.7 |
Interface Contracts (lock early to enable parallelism)
These are the contracts that downstream tasks depend on. They must be
agreed and frozen at the start of Phase 1 — see SYNC.md for the freeze
protocol.
IC-1 — WebSocket frames
// binary frame : raw ANSI stream bytes (output direction only).
// text frame : JSON, type-discriminated.
type ClientToServer =
| { type: "resume"; lastSeq: number | null }
| { type: "key"; name: string } // "escape" | "tab" | "up" | "down" | "left" | "right" | "enter" | "shift-enter"
| { type: "keys"; data: string } // literal text, sent via send-keys -l
| { type: "paste"; data: string } // wrapped in bracketed-paste
| { type: "snapshot-request" };
type ServerToClient =
| { type: "state"; value: "thinking" | "tool" | "idle" | "awaiting-input"; tool?: string; ts: number }
// tree event dropped — out of iOS scope. Revisit if a dashboard wants it.
// resize ClientToServer deferred — fixed 120×40 for v1.
| { type: "snapshot"; seq: number; data: string } // base64 ANSI snapshot
| { type: "session-meta"; name: string; description?: string; createdAt: string }
| { type: "error"; code: string; message: string };
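As an illustration of how the key, keys and paste messages above might be turned into tmux input, the sketch below builds `send-keys` argument vectors. Everything here is an assumption layered on top of the contract: the `TMUX_KEY_NAMES` table, the helper names, and the choice to leave "shift-enter" unmapped (how it should be encoded is not stated in this plan). The tmux key names used (Escape, Tab, Up, Down, Left, Right, Enter) and the `-l` literal flag are standard tmux; the bracketed-paste markers are the usual `ESC[200~` and `ESC[201~`.

```typescript
// Hypothetical translation table: IC-1 key names -> tmux key names.
// "shift-enter" is intentionally absent; its encoding is an open question here.
const TMUX_KEY_NAMES = new Map<string, string>([
  ["escape", "Escape"],
  ["tab", "Tab"],
  ["up", "Up"],
  ["down", "Down"],
  ["left", "Left"],
  ["right", "Right"],
  ["enter", "Enter"],
]);

const PASTE_START = "\x1b[200~";
const PASTE_END = "\x1b[201~";

// argv for a { type: "key" } message, or null when the name is not mapped.
function keyArgs(target: string, name: string): string[] | null {
  const tmuxName = TMUX_KEY_NAMES.get(name);
  return tmuxName === undefined ? null : ["send-keys", "-t", target, tmuxName];
}

// argv for a { type: "keys" } message. -l sends the text literally;
// "--" keeps text that starts with "-" from being read as a flag.
function literalArgs(target: string, text: string): string[] {
  return ["send-keys", "-t", target, "-l", "--", text];
}

// Payload for a { type: "paste" } message. Any embedded end marker is
// removed first so pasted text cannot terminate the paste early.
function wrapBracketedPaste(text: string): string {
  let safe = text;
  while (safe.includes(PASTE_END)) {
    safe = safe.split(PASTE_END).join("");
  }
  return PASTE_START + safe + PASTE_END;
}
```

Passing an argument vector to the tmux process, rather than composing a shell string, avoids quoting problems with user-supplied text. How the wrapped paste payload is delivered to the pane is left open in this sketch.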
Binary frames carry an out-of-band seq via a leading 8-byte
big-endian header. Owner: T-1.5.
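A minimal sketch of the binary framing, assuming the header is an unsigned 64-bit big-endian sequence number followed directly by the raw stream bytes. Because the JSON messages carry seq as a JavaScript number, the sketch restricts values to the safe-integer range and writes the header as two 32-bit words. The function names are illustrative.

```typescript
const SEQ_HEADER_BYTES = 8;
const TWO_POW_32 = 0x100000000;

// Prefix a payload with its sequence number (8 bytes, big-endian).
function encodeStreamFrame(seq: number, payload: Uint8Array): Uint8Array {
  if (!Number.isSafeInteger(seq) || seq < 0) {
    throw new RangeError("seq must be a non-negative safe integer");
  }
  const frame = new Uint8Array(SEQ_HEADER_BYTES + payload.length);
  const view = new DataView(frame.buffer);
  view.setUint32(0, Math.floor(seq / TWO_POW_32), false); // high word
  view.setUint32(4, seq % TWO_POW_32, false);             // low word
  frame.set(payload, SEQ_HEADER_BYTES);
  return frame;
}

// Split a received binary frame into sequence number and payload.
function decodeStreamFrame(frame: Uint8Array): { seq: number; payload: Uint8Array } {
  if (frame.length < SEQ_HEADER_BYTES) {
    throw new RangeError("frame shorter than the 8-byte header");
  }
  const view = new DataView(frame.buffer, frame.byteOffset, frame.byteLength);
  const high = view.getUint32(0, false);
  const low = view.getUint32(4, false);
  if (high > 0x1fffff) {
    throw new RangeError("seq exceeds the safe-integer range");
  }
  return { seq: high * TWO_POW_32 + low, payload: frame.subarray(SEQ_HEADER_BYTES) };
}
```

A client would remember the seq of the last frame it decoded and send it back as lastSeq in the resume message after reconnecting.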
IC-2 — HTTP REST shape
GET /health → { ok, sessions, bufferBytes, ... }
POST /sessions → { id, name }
GET /sessions → [{ id, name, description, state, lastOutputAt }, …]
PATCH /sessions/:id → updates @description
DELETE /sessions/:id → kills tmux session, optionally clears buffer
GET /sessions/:id/commands → [{ name, description, args }]
GET /sessions/:id/thumbnail → text/plain capture-pane (40×12)
All endpoints behind bearer token, all responses application/json unless
noted. Owner: T-1.5..T-1.7.
IC-3 — Pairing payload
QR encodes a pi-remote:// URL:
pi-remote://<host>:<port>?pair=<pairing-token>&fp=<sha256-hex>&name=<sidecar-name>
Pairing exchange: client POST /pair with { pairingToken, deviceToken?, environment?, deviceName? } → server replies { bearerToken, sidecarId }.
deviceToken and environment are optional pre-Phase-2, mandatory from Phase 2 onward. Owner: T-1.3.
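The pairing URL can be built and parsed with the standard `URL` and `URLSearchParams` classes. The sketch below assumes the fingerprint is 64 hex characters (a SHA-256 digest), that the host is a hostname or IPv4 address, and that a missing name falls back to an empty string. The `PairingPayload` type and both function names are hypothetical.

```typescript
interface PairingPayload {
  host: string;
  port: number;
  pairingToken: string;
  fingerprint: string; // SHA-256 of the sidecar certificate, hex-encoded
  name: string;
}

// Build the pi-remote:// URL that the QR code encodes.
function buildPairingUrl(p: PairingPayload): string {
  const query = new URLSearchParams({
    pair: p.pairingToken,
    fp: p.fingerprint,
    name: p.name,
  });
  return `pi-remote://${p.host}:${p.port}?${query.toString()}`;
}

// Parse and validate a scanned URL; throws on anything malformed.
function parsePairingUrl(raw: string): PairingPayload {
  const url = new URL(raw);
  if (url.protocol !== "pi-remote:") {
    throw new Error("not a pi-remote URL");
  }
  const port = Number(url.port);
  if (!Number.isInteger(port) || port < 1 || port > 65535) {
    throw new Error("missing or invalid port");
  }
  const pairingToken = url.searchParams.get("pair");
  const fingerprint = url.searchParams.get("fp");
  if (!pairingToken) throw new Error("missing pair parameter");
  if (!fingerprint || !/^[0-9a-f]{64}$/i.test(fingerprint)) {
    throw new Error("fp must be 64 hex characters");
  }
  return {
    host: url.hostname,
    port,
    pairingToken,
    fingerprint,
    name: url.searchParams.get("name") ?? "",
  };
}
```

The client would compare the parsed fingerprint against the certificate presented on the wss:// connection before posting to /pair.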
IC-4 — Config schema (TOML)
[server]
host = "0.0.0.0"
port = 7777
state_dir = "~/.local/share/pi-remote"
[buffer]
per_session_mb = 100
global_gb = 1
free_min_gb = 1
idle_days = 30
[tmux]
default_width = 120
default_height = 40
[apns]
team_id = "..."
key_id = "..."
key_path = "..."
bundle_id = "..."
[autoname]
enabled = true
trigger_after = 3 # user messages
model = "claude-haiku-4-5"
Owner: T-1.7.
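To show how the [buffer] values could drive the watchdog, here is a sketch of a pure cap check. It assumes the _mb and _gb settings mean binary units (MiB, GiB) and that the order of checks is free space first, then the per-session cap, then the global cap; neither is fixed by this plan. The type and function names are hypothetical.

```typescript
interface BufferConfig {
  per_session_mb: number;
  global_gb: number;
  free_min_gb: number;
  idle_days: number;
}

// Defaults copied from the sample config.
const DEFAULT_BUFFER_CONFIG: BufferConfig = {
  per_session_mb: 100,
  global_gb: 1,
  free_min_gb: 1,
  idle_days: 30,
};

const MIB = 1024 * 1024;
const GIB = 1024 * MIB;

interface BufferUsage {
  sessionBytes: number;  // bytes currently buffered for this session
  globalBytes: number;   // bytes buffered across all sessions
  freeDiskBytes: number; // free space on the state_dir volume
}

type WriteVerdict = "ok" | "evict-session" | "evict-global" | "pause";

// Decide what must happen before `incomingBytes` can be appended.
function checkWrite(cfg: BufferConfig, usage: BufferUsage, incomingBytes: number): WriteVerdict {
  if (usage.freeDiskBytes - incomingBytes < cfg.free_min_gb * GIB) {
    return "pause"; // disk too full: stop buffering rather than evict
  }
  if (usage.sessionBytes + incomingBytes > cfg.per_session_mb * MIB) {
    return "evict-session"; // drop this session's oldest chunks first
  }
  if (usage.globalBytes + incomingBytes > cfg.global_gb * GIB) {
    return "evict-global"; // drop the oldest chunks across sessions
  }
  return "ok";
}
```

Keeping the decision a pure function of config and usage makes it straightforward to unit-test independently of the disk writer.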
Branching Strategy
- Each task is a feature branch off main, named feat/p1-<task-id>-<slug>, e.g. feat/p1-t1-1-tmux-manager.
- Open a PR as soon as a task is ready for review. Squash-merge.
- T-1.0 (refactor) lands first, then T-1.1..T-1.4 can run truly parallel.
- T-1.5..T-1.7 each consume one or more of the lower-layer modules; they start as soon as the dependency PR is in main.
Test Strategy
- Unit: per-module pure-logic tests under extensions/remote-control/**/__tests__/.
- Integration smoke: T-1.8 script, runnable locally and in CI.
- Manual: each task PR lists manual-verification steps.
- No iOS testing in this phase.
Risks
- R1. Disk-buffer cap math races vs. global watchdog. Mitigation: serialise buffer writes through a single async queue per session, lock the global cap behind a mutex.
- R2. ExtensionAPI event names might shift in future pi versions. Mitigation: pin pi version range in package.json, isolate adapter in pi/events.ts.
- R3. pi -p auto-name calls cost money. Mitigation: gate behind [autoname] enabled, debounce, skip if user already named the session.
- R4. tmux control-mode protocol is text-framed; binary pane bytes are octal-escaped (\NNN). Parser must handle high-throughput bursts (~50fps during tool output). Mitigation: streaming line-parser with no full-buffer copies; per-line decode allocates only the escaped payload. Reference decode in spike-cc.ts.
- R5. tmux version requirement. Control mode is stable from tmux 2.0; modern features (e.g. pane-died event) need 2.5+. Mitigation: tmux/manager.ts checks tmux -V at startup, refuses to run on < 2.5 with a clear error.
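The version gate in R5 could be sketched as a small parser over the output of tmux -V. This assumes the output looks like "tmux 3.3a" or "tmux next-3.4"; builds that report something else (for example "tmux master") parse to null, and how to treat those is left to the caller. Running the tmux binary itself is omitted, and the function names are hypothetical.

```typescript
interface TmuxVersion {
  major: number;
  minor: number;
}

// Parse the text printed by `tmux -V`; returns null when no
// major.minor pair can be found.
function parseTmuxVersion(output: string): TmuxVersion | null {
  const match = /^tmux\s+(?:next-)?(\d+)\.(\d+)/.exec(output.trim());
  if (match === null) return null;
  return { major: Number(match[1]), minor: Number(match[2]) };
}

// True when `v` is at least `min` (2.5 by default, per R5).
function meetsMinimum(v: TmuxVersion, min: TmuxVersion = { major: 2, minor: 5 }): boolean {
  return v.major > min.major || (v.major === min.major && v.minor >= min.minor);
}

// Human-readable startup error, or null when the version is acceptable.
function versionError(output: string): string | null {
  const v = parseTmuxVersion(output);
  if (v === null) {
    return `could not parse tmux version from "${output.trim()}"`;
  }
  if (!meetsMinimum(v)) {
    return `tmux ${v.major}.${v.minor} is too old; 2.5 or newer is required`;
  }
  return null;
}
```

Letter suffixes such as the "a" in 3.3a are ignored here, since only the major and minor numbers matter for the 2.5 threshold.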
Exit / Handover
- All T-1.x merged.
- Smoke harness passes locally and in CI.
- Operator guide complete.
- A short docs/reference/PHASE-1-report.md summarising deviations from the plan, especially anything that affects Phase 2 contracts.
- Update SYNC.md to unblock Phase 2.