From f6cbf1707841b66301de863e3d08274cb1833c4c Mon Sep 17 00:00:00 2001 From: jay Date: Fri, 15 May 2026 03:33:51 +0200 Subject: [PATCH] =?UTF-8?q?docs:=20reorganise=20=E2=80=94=20implementation?= =?UTF-8?q?=20plans=20+=20sync,=20archive=20spec=20to=20reference/?= MIME-Version: 1.0 Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: 8bit - docs/ now holds only implementation drivers: - PHASE-0-spike-stream.md (single agent, ~1 day PoC) - PHASE-1-sidecar.md (multi-agent, sidecar production-ready) - PHASE-2-ios-mvp.md (multi-agent, iOS app MVP) - PHASE-3-ios-augmentation.md (multi-agent, iOS polish) - SYNC.md (live coordination: claims, ownership, CCRs) - README.md (folder guide) - docs/reference/ holds background: - SPEC-ios-app.md (final v3 spec) - EXTENSION-API-AUDIT.md (audit result) - SPEC-ios-app-review-v1.md (archived review thread) - ARCHITECTURE.md (original extension architecture) - README.md (folder guide) Each phase plan defines: goal, acceptance criteria, file layout, explicit task table with parallelisable tasks, interface contracts, branching strategy, risks, exit criteria. SYNC.md provides the multi-agent coordination protocol: active claims table, file ownership map, frozen contracts, and contract change requests (CCR) workflow. 
--- docs/PHASE-0-spike-stream.md | 128 ++++++++++ docs/PHASE-1-sidecar.md | 223 ++++++++++++++++++ docs/PHASE-2-ios-mvp.md | 185 +++++++++++++++ docs/PHASE-3-ios-augmentation.md | 81 +++++++ docs/README.md | 22 ++ docs/SYNC.md | 158 +++++++++++++ docs/{ => reference}/ARCHITECTURE.md | 0 docs/{ => reference}/EXTENSION-API-AUDIT.md | 0 docs/reference/README.md | 11 + .../{ => reference}/SPEC-ios-app-review-v1.md | 0 docs/{ => reference}/SPEC-ios-app.md | 0 11 files changed, 808 insertions(+) create mode 100644 docs/PHASE-0-spike-stream.md create mode 100644 docs/PHASE-1-sidecar.md create mode 100644 docs/PHASE-2-ios-mvp.md create mode 100644 docs/PHASE-3-ios-augmentation.md create mode 100644 docs/README.md create mode 100644 docs/SYNC.md rename docs/{ => reference}/ARCHITECTURE.md (100%) rename docs/{ => reference}/EXTENSION-API-AUDIT.md (100%) create mode 100644 docs/reference/README.md rename docs/{ => reference}/SPEC-ios-app-review-v1.md (100%) rename docs/{ => reference}/SPEC-ios-app.md (100%) diff --git a/docs/PHASE-0-spike-stream.md b/docs/PHASE-0-spike-stream.md new file mode 100644 index 0000000..dffef05 --- /dev/null +++ b/docs/PHASE-0-spike-stream.md @@ -0,0 +1,128 @@ +# Phase 0 — Spike: tmux Stream PoC + +> **Status:** ready to start. +> **Owner:** single agent, end-to-end (too small to parallelise). +> **Branch:** `feat/spike-stream`. +> **Estimated effort:** ~1 day. + +## Goal + +Verify the foundational assumption of the entire spec: that we can run `pi` +inside `tmux`, tee the pane output via `pipe-pane`, push it as a binary +WebSocket stream, and consume it from a client without rendering artefacts +or unacceptable latency. + +Output is a decision: green light for Phase 1, or list of blockers that need +spec revision. + +## Acceptance Criteria + +- A new branch `feat/spike-stream` in `pi-remote-control`. +- A CLI invocation (e.g. `pi-remote spike`) that: + - Spawns a tmux session `pi-spike` running `pi`. 
+ - Pipes the pane via `pipe-pane` to a WS endpoint on `ws://127.0.0.1:7799/spike`. + - Attaches the local terminal to the same tmux session. +- A test client (raw `wscat` script or a tiny throwaway HTML page) that + connects, dumps incoming binary frames to stdout (or a hex viewer) and + optionally re-renders them via `xterm.js`. +- A written PoC report `docs/reference/PHASE-0-report.md` answering: + - **R-1.** Does pi run cleanly inside tmux? (Ink redraws OK, no escape + sequence loss, no crashes during 10min uptime.) + - **R-2.** Does alternate-screen-buffer (`\e[?1049h`) work? Is the stream + parseable on the other side? + - **R-3.** Is per-chunk latency acceptable (< 50ms localhost, + < 200ms WAN)? + - **R-4.** Does the SSH session attached to the same tmux pane stay in + sync with the WS stream byte-for-byte? + - **R-5.** Edge cases observed (mouse mode, title sequences, very wide + output, etc.). + +## Out of Scope for Spike-0 + +- No authentication, no TLS — bind to 127.0.0.1 only. +- No reconnect, sequence numbers, snapshot or buffer. +- No `send-keys` direction (read-only stream is enough to verify rendering). +- No multi-session — one fixed `pi-spike` session. +- No iOS code. + +## Task Breakdown + +### T-0.1 — Branch + skeleton +Create `feat/spike-stream`. Add a new file +`extensions/remote-control/spike.ts` and a CLI entry (a new flag +`--spike` on the existing extension or a separate npm-script — +whichever is faster). + +### T-0.2 — tmux helper +Spawn tmux session, attach pipe-pane to a Unix FIFO or a pseudo-stream we +can read from Node. Reference command: + +```bash +tmux new-session -d -s pi-spike 'pi' +mkfifo /tmp/pi-spike.fifo +tmux pipe-pane -t pi-spike -o "cat > /tmp/pi-spike.fifo" +``` + +Node opens the FIFO read-side (`fs.createReadStream`) and exposes the byte +stream. + +### T-0.3 — WS server +Stand up a minimal `ws` server on port 7799, route `/spike`, send the FIFO +bytes as binary frames. 
No backpressure handling, no permessage-deflate yet. + +### T-0.4 — Test client +Two options, pick whichever is faster: +- **(a)** `wscat -c ws://127.0.0.1:7799/spike` and pipe through `od -c` for + raw inspection. +- **(b)** A 50-line HTML page with `xterm.js`, plain WebSocket, no styling. + +### T-0.5 — Attach + dual-render test +Open a second terminal, run `tmux attach -t pi-spike`. Type into pi. Verify +that what you see in the SSH attach is identical to what arrives on the WS +client. + +### T-0.6 — Stress / edge cases +Briefly try: +- Resize the SSH terminal — see how tmux/pi react. +- Run a slash command that opens a full-screen menu (alternate screen). +- Paste a multi-line block. +- Let pi do a long tool call. + +### T-0.7 — Report +Write `docs/reference/PHASE-0-report.md`. One paragraph per R-question, +plus a "go / no-go for Phase 1" verdict. + +## File Plan + +- New: `extensions/remote-control/spike.ts` +- New: `docs/reference/PHASE-0-report.md` +- Modified: `extensions/remote-control/index.ts` (add `--spike` flag or + separate entry). +- No changes to existing server.ts / html.ts / messages.ts. + +## Dependencies + +- `tmux` installed on the dev host. macOS: already present or + `brew install tmux`. +- `ws` library: already in `package.json`. +- `mkfifo` shell command (POSIX): already present on macOS/Linux. + +## Risks + +- **R-A.** Ink may refuse to run inside tmux due to TTY detection. If so, + set `FORCE_COLOR=1` and `TERM=xterm-256color`, and force TTY allocation + with `ssh -tt` when launching over SSH. Fall back + to spawning pi via `unbuffer` if necessary. +- **R-B.** FIFO can have buffering issues. If line buffering causes + visible lag, switch to a Unix domain socket and have Node read directly + from the socket. +- **R-C.** tmux's `pipe-pane` reproduces ANSI but may drop sequences during + bursts. If lossy, the alternative is to run pi inside our own `node-pty` + (a much larger change, but a fallback option). 
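
One way to check R-C (and report question R-4) empirically is to save two captures of the same run, for example the bytes read from the FIFO and the bytes a WS client received, and find the first offset where they differ. The following is a minimal sketch under that assumption; `firstDivergence` is a hypothetical helper name, not part of the existing extension.

```typescript
// Hypothetical helper for the spike report: compares two byte captures
// and returns the index of the first differing byte, or -1 if they are
// identical. If one capture is a strict prefix of the other, the length
// of the shorter capture is returned.
function firstDivergence(a: Uint8Array, b: Uint8Array): number {
  const n = Math.min(a.length, b.length);
  for (let i = 0; i < n; i++) {
    if (a[i] !== b[i]) return i;
  }
  return a.length === b.length ? -1 : n;
}

// Example: ESC [ H versus ESC [ J differ at offset 2.
const offset = firstDivergence(
  new Uint8Array([27, 91, 72]),
  new Uint8Array([27, 91, 74]),
);
console.log(offset);
```

A non-negative result during a burst would support the "lossy" branch of R-C; the offset gives a starting point for inspecting the surrounding bytes with `od -c`.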
+ +## Exit / Handover + +When Phase 0 closes: +- Merge `feat/spike-stream` into `main` only if PoC code is reusable for + Phase 1; otherwise close the branch and keep the report. +- Update `SYNC.md` with the verdict and any spec revisions needed. +- Trigger Phase 1. diff --git a/docs/PHASE-1-sidecar.md b/docs/PHASE-1-sidecar.md new file mode 100644 index 0000000..98fcaf6 --- /dev/null +++ b/docs/PHASE-1-sidecar.md @@ -0,0 +1,223 @@ +# Phase 1 — Sidecar Production-Ready + +> **Status:** blocked on Phase 0 verdict. +> **Owners:** parallelisable across multiple agents — see task table. +> **Branch base:** `main` after Phase 0 merge. Feature branches per work +> stream (see `SYNC.md`). +> **Spec reference:** [`reference/SPEC-ios-app.md`](./reference/SPEC-ios-app.md) §4. + +## Goal + +The `pi-remote-control` extension is extended into a full sidecar that can +serve the iOS app. End state: a single Node process, started alongside pi +(or as a system service), that exposes a WebSocket API for: + +- Stream attach/detach with reconnect. +- Send-keys input. +- Multi-session lifecycle (spawn, list, rename, kill). +- Snapshot, disk-buffered replay. +- State, slash-command-registry side-channel. +- QR-based pairing, bearer-token auth, self-signed TLS with pinning. +- Health endpoint. + +After Phase 1 we can drive everything from `wscat` or a small Web UI. +The iOS app is **not** required to validate Phase 1. + +## Acceptance Criteria + +For each S-feature listed below: implemented, manually exercised, basic +test (smoke test minimum). Plus: + +- `pi-remote pair` prints a working QR. +- Two parallel sessions can be spawned, switched between, and one can be + killed without disturbing the other. +- WebSocket-level integration smoke test: a script that opens a stream, + sends keys, receives output, drops the connection, reconnects with + `lastSeq`, observes a clean delta. +- `wss://` works against the self-signed cert; the fingerprint matches the + QR contents. 
+- Sidecar survives restart and reattaches to all existing tmux sessions + without losing state. + +## Architecture Sketch + +``` +extensions/remote-control/ +├── index.ts — extension entry point (existing, extended) +├── server/ — NEW: HTTP/WS server, split into route modules +│ ├── server.ts — bootstrap, TLS, middleware +│ ├── routes/ +│ │ ├── stream.ts — S-02 binary stream + S-04 sequence + S-05 snapshot +│ │ ├── input.ts — S-03 send-keys +│ │ ├── sessions.ts — S-09 multi-session CRUD +│ │ ├── commands.ts — S-08 slash-command registry +│ │ ├── side.ts — S-07 state side-channel +│ │ └── health.ts — S-12 health +│ └── upgrade.ts — WS upgrade routing per session/topic +├── tmux/ — NEW: tmux wrapper +│ ├── manager.ts — spawn/list/kill, metadata via @options +│ ├── pipe.ts — pipe-pane, FIFO read, byte streaming +│ ├── input.ts — send-keys translation +│ └── snapshot.ts — capture-pane wrapper +├── buffer/ — NEW: disk ringbuffer per session +│ ├── writer.ts — append, cap enforcement, watchdog +│ └── reader.ts — range read for snapshot fallback +├── sequence.ts — NEW: monotonic chunk numbering shared by stream + buffer +├── auth/ — auth/pairing module +│ ├── tokens.ts — bearer-token CRUD (extends existing auth.ts) +│ ├── pairing.ts — pi-remote pair, QR rendering, exchange +│ └── tls.ts — self-signed cert generation + fingerprint +├── pi/ — adapter to pi ExtensionAPI +│ ├── events.ts — subscribe agent_start/end, tool_*, session_* +│ ├── commands.ts — pi.getCommands() wrapper +│ └── autoname.ts — S-09a, spawn pi -p subprocess +└── cli/ — CLI entrypoints (pi-remote attach/pair/auth/health) + └── index.ts +``` + +`html.ts`, `messages.ts`, the existing `server.ts` and `config.ts` remain +for the legacy HTML client during the transition; they are tagged as +*legacy* in code comments. They will be retired after Phase 2 ships. + +## Task Breakdown + +Tasks are numbered `T-1.`. 
The "Parallel With" column shows which other +tasks can be in flight simultaneously without merge pain. The "Touches" +column lists the files an agent may modify. + +| ID | Task | Touches | Depends on | Parallel With | +|---|---|---|---|---| +| T-1.0 | **Server refactor scaffold.** Carve `server.ts` into the `server/` and route modules above; existing HTML behaviour must still work; CI green. | `extensions/remote-control/server/**`, minimal edit of `index.ts` | — | none — must land first | +| T-1.1 | **tmux/manager + tmux/pipe + tmux/snapshot.** Spawn, list, kill, metadata via `@description`. Pipe-pane FIFO reader. Snapshot via `capture-pane`. | `tmux/**` | T-1.0 | T-1.2, T-1.3, T-1.4, T-1.5, T-1.6 | +| T-1.2 | **Sequence module + buffer/writer + buffer/reader.** Monotone chunk IDs, disk ringbuffer with caps (100MB/session, 1GB global, free-space watchdog), idle-cleanup. | `sequence.ts`, `buffer/**` | T-1.0 | T-1.1, T-1.3, T-1.4, T-1.5, T-1.6 | +| T-1.3 | **Auth: tokens + pairing + TLS.** Self-signed cert generation, fingerprint, bearer-token CRUD, `pi-remote pair` CLI + QR rendering, `pi-remote auth list/revoke/name`. | `auth/**`, `cli/index.ts` (subcommands only) | T-1.0 | T-1.1, T-1.2, T-1.4, T-1.5, T-1.6 | +| T-1.4 | **pi adapter.** Subscribe ExtensionAPI events, expose `getCommands`, implement `autoname.ts` spawning `pi -p`. | `pi/**`, edits in `index.ts` to wire subscriptions | T-1.0 | T-1.1, T-1.2, T-1.3, T-1.5, T-1.6 | +| T-1.5 | **Stream + input + snapshot routes (S-02/S-03/S-04/S-05).** WS upgrade routing, binary stream, sequence cursor resume, send-keys with bracketed-paste. 
| `server/routes/stream.ts`, `server/routes/input.ts`, `server/upgrade.ts` | T-1.0, T-1.1, T-1.2 | T-1.6, T-1.7 | +| T-1.6 | **Side-channel + commands + sessions routes (S-07/S-08/S-09).** | `server/routes/side.ts`, `server/routes/commands.ts`, `server/routes/sessions.ts` | T-1.0, T-1.1, T-1.4 | T-1.5, T-1.7 | +| T-1.7 | **Health endpoint + config + watchdog (S-12).** Disk watchdog ties buffer caps to global state. | `server/routes/health.ts`, new `config.toml` schema in `config.ts` | T-1.0, T-1.2 | T-1.5, T-1.6 | +| T-1.8 | **Integration smoke harness.** Node script under `scripts/smoke/` that spawns a sidecar, opens a stream, sends keys, drops + reconnects, verifies delta. | `scripts/smoke/**` | T-1.5, T-1.6 | none | +| T-1.9 | **Docs: operator guide.** README section "Running pi-remote as a sidecar", config sample, troubleshooting. | `README.md`, optionally `docs/reference/OPERATOR.md` | T-1.5, T-1.6, T-1.7 | parallel with T-1.8 | +| T-1.10 | **APNs scaffold (deferred but cheap).** `apns/` module: config schema, JWT generation, push primitive. Stub the device-token registry — flesh out in Phase 2 when iOS app provides tokens. | `apns/**`, edits in `auth/tokens.ts` to store device-tokens | T-1.3 | T-1.5..T-1.7 | + +## Interface Contracts (lock early to enable parallelism) + +These are the contracts that downstream tasks depend on. They must be +agreed and frozen at the start of Phase 1 — see `SYNC.md` for the freeze +protocol. + +### IC-1 — WebSocket frames + +```ts +// binary frame : raw ANSI stream bytes (output direction only). +// text frame : JSON, type-discriminated. 
+ +type ClientToServer = + | { type: "resume"; lastSeq: number | null } + | { type: "key"; name: string } // "escape" | "tab" | "up" | "down" | "left" | "right" | "enter" | "shift-enter" + | { type: "keys"; data: string } // literal text, sent via send-keys -l + | { type: "paste"; data: string } // wrapped in bracketed-paste + | { type: "snapshot-request" }; + +type ServerToClient = + | { type: "state"; value: "thinking" | "tool" | "idle" | "awaiting-input"; tool?: string; ts: number } + | { type: "tree"; nodes: TreeNode[]; current: string } // optional, read-only + | { type: "snapshot"; seq: number; data: string } // base64 ANSI snapshot + | { type: "session-meta"; name: string; description?: string; createdAt: string } + | { type: "error"; code: string; message: string }; +``` + +Binary frames carry an out-of-band `seq` via a leading 8-byte +big-endian header. Owner: T-1.5. + +### IC-2 — HTTP REST shape + +``` +GET /health → { ok, sessions, bufferBytes, ... } +POST /sessions → { id, name } +GET /sessions → [{ id, name, description, state, lastOutputAt }, …] +PATCH /sessions/:id → updates @description +DELETE /sessions/:id → kills tmux session, optionally clears buffer +GET /sessions/:id/commands → [{ name, description, args }] +GET /sessions/:id/thumbnail → text/plain capture-pane (40×12) +``` + +All endpoints behind bearer token, all responses `application/json` unless +noted. Owner: T-1.5..T-1.7. + +### IC-3 — Pairing payload + +QR encodes a `pi-remote://` URL: + +``` +pi-remote://:?pair=&fp=&name= +``` + +Pairing exchange: client `POST /pair` with `{ pairingToken, deviceToken?, environment?, deviceName? }` → server replies `{ bearerToken, sidecarId }`. Owner: T-1.3. + +### IC-4 — Config schema (TOML) + +```toml +[server] +host = "0.0.0.0" +port = 7777 +state_dir = "~/.local/share/pi-remote" + +[buffer] +per_session_mb = 100 +global_gb = 1 +free_min_gb = 1 +idle_days = 30 + +[tmux] +default_width = 120 +default_height = 40 + +[apns] +team_id = "..." 
+key_id = "..." +key_path = "..." +bundle_id = "..." + +[autoname] +enabled = true +trigger_after = 3 # user messages +model = "claude-haiku-4-5" +``` + +Owner: T-1.7. + +## Branching Strategy + +- Each task is a feature branch off `main`, named `feat/p1--`, + e.g. `feat/p1-t1-1-tmux-manager`. +- Open a PR as soon as a task is ready for review. Squash-merge. +- T-1.0 (refactor) lands first, then T-1.1..T-1.4 can run truly parallel. +- T-1.5..T-1.7 each consume one or more of the lower-layer modules; they + start as soon as the dependency PR is in `main`. + +## Test Strategy + +- Unit: per-module pure-logic tests under `extensions/remote-control/**/__tests__/`. +- Integration smoke: T-1.8 script, runnable locally and in CI. +- Manual: each task PR lists manual-verification steps. +- No iOS testing in this phase. + +## Risks + +- **R1.** Disk-buffer cap math races vs. global watchdog. Mitigation: + serialise buffer writes through a single async queue per session, lock + the global cap behind a mutex. +- **R2.** ExtensionAPI event names might shift in future pi versions. + Mitigation: pin pi version range in `package.json`, isolate adapter in + `pi/events.ts`. +- **R3.** `pi -p` auto-name calls cost money. Mitigation: gate behind + `[autoname] enabled`, debounce, skip if user already named the session. + +## Exit / Handover + +- All T-1.x merged. +- Smoke harness passes locally and in CI. +- Operator guide complete. +- A short `docs/reference/PHASE-1-report.md` summarising deviations from + the plan, especially anything that affects Phase 2 contracts. +- Update `SYNC.md` to unblock Phase 2. diff --git a/docs/PHASE-2-ios-mvp.md b/docs/PHASE-2-ios-mvp.md new file mode 100644 index 0000000..983a451 --- /dev/null +++ b/docs/PHASE-2-ios-mvp.md @@ -0,0 +1,185 @@ +# Phase 2 — iOS App MVP + +> **Status:** blocked on Phase 1 (sidecar must be reachable). +> **Owners:** parallelisable; see task table. 
+> **Repo:** new repository `pi-remote-ios` adjacent to `pi-remote-control`, +> at `git.vpsj.de/jay/pi-remote-ios`. Reason: Swift project, separate +> tooling, separate release cadence. +> **Spec reference:** [`reference/SPEC-ios-app.md`](./reference/SPEC-ios-app.md) §5 +> Groups A, B (sans hardware-keyboard), C-01, C-02, D-01 + a/b, E, F. + +## Goal + +A SwiftUI iOS app that: + +- Pairs with a sidecar via QR scan. +- Renders a single pi session 1:1 via SwiftTerm. +- Sends keystrokes back via the IC-1 protocol. +- Survives backgrounding and reconnect within the < 1s P-3 target. +- Switches between multiple sessions with pre-connect cache. +- Receives push notifications when pi reaches `awaiting-input`. + +After Phase 2 the app is usable in the user's daily workflow, replacing +the legacy HTML client. Augmentations (slash palette, voice, themes, +search, …) come in Phase 3. + +## Acceptance Criteria + +- Apple Developer enrolment complete, App ID with Push capability + APNs + Auth Key (`.p8`) generated. +- App builds and runs on the user's iPhone via Xcode (sandbox APNs). +- App pairs via QR, persists bearer token + cert pinning across launches. +- Foreground rendering: SwiftTerm shows pi 1:1, input round-trips. +- Background → foreground: < 1s to live stream, no visible empty screen. +- Three named sessions, switcher works, pre-connect makes switching + feel instant. +- Push notification fires when pi state transitions to `awaiting-input` + while app is backgrounded. +- TestFlight build distributable (production APNs route exercised). +- Face-ID gate available as opt-in setting. + +## Project Layout + +New repo `pi-remote-ios`: + +``` +pi-remote-ios/ +├── README.md +├── Package.swift — SwiftPM (deps: SwiftTerm, Starscream) +├── Apps/ +│ └── piRemote/ — main app target +│ ├── piRemoteApp.swift — @main entry +│ ├── Resources/ +│ │ ├── Themes/ — bundled .json theme files +│ │ ├── Fonts/ — JetBrains Mono, Hack, etc. 
+│ │ └── Assets.xcassets +│ └── Info.plist +├── Sources/ +│ ├── Core/ — networking, state, persistence +│ │ ├── Network/ +│ │ │ ├── WebSocketClient.swift — Starscream wrapper, permessage-deflate +│ │ │ ├── FrameCodec.swift — IC-1 encode/decode +│ │ │ ├── ResumeCursor.swift — lastSeq tracking per session +│ │ │ └── PinnedTrust.swift — TLS pinning from QR fingerprint +│ │ ├── Auth/ +│ │ │ ├── Keychain.swift +│ │ │ └── Pairing.swift — QR parse, exchange +│ │ ├── Sessions/ +│ │ │ ├── SessionRegistry.swift — list, spawn, kill (talks to /sessions) +│ │ │ ├── SessionConnection.swift — one WS per session +│ │ │ └── PreConnectPool.swift — D-01a strategy +│ │ ├── Push/ +│ │ │ ├── NotificationDelegate.swift +│ │ │ └── DeviceTokenRegistrar.swift — sends token + env to sidecar +│ │ └── Persistence/ +│ │ ├── ScrollbackCache.swift — rolling 5MB per session, on disk +│ │ └── Preferences.swift +│ ├── UI/ +│ │ ├── Terminal/ +│ │ │ ├── TerminalView.swift — UIViewRepresentable wrapping SwiftTerm +│ │ │ ├── ThemeStore.swift — bundled themes, currently selected +│ │ │ └── FontStore.swift +│ │ ├── Input/ +│ │ │ ├── ModifierBar.swift — [Ctrl][Esc][Tab][←↑↓→][⇧↵][🎙][📋] +│ │ │ ├── ModifierState.swift — sticky Ctrl + repeat handling +│ │ │ └── PasteSheet.swift — confirm-before-paste +│ │ ├── Status/ +│ │ │ └── StatusBar.swift — connection + pi state +│ │ ├── Sessions/ +│ │ │ ├── SessionSwitcher.swift — list, spawn, switch +│ │ │ └── SessionRow.swift — name + state badge (no thumbnail in MVP) +│ │ ├── Pairing/ +│ │ │ ├── QRScannerView.swift +│ │ │ └── PairingFlowView.swift +│ │ └── Settings/ +│ │ └── SettingsView.swift — Face-ID toggle, sidecar info +│ └── Voice/ — empty placeholder, populated in Phase 3 +├── Tests/ +│ └── CoreTests/ +│ ├── FrameCodecTests.swift +│ └── ResumeCursorTests.swift +└── docs/ + ├── BUILD.md + └── DISTRIBUTION.md — TestFlight steps +``` + +## Task Breakdown + +| ID | Task | Touches | Depends on | Parallel With | +|---|---|---|---|---| +| T-2.0 | **Repo + Xcode 
project scaffold + Apple Developer setup.** Create repo on git.vpsj.de, generate App ID + APNs Auth Key, commit `.p8` instructions (key itself stays out of git). Empty SwiftUI shell that boots and shows "Hello pi". | repo root, `Apps/piRemote/` | Phase 1 sidecar reachable | none — must land first | +| T-2.1 | **WebSocketClient + FrameCodec.** Starscream, permessage-deflate enabled, encode/decode IC-1 frames, basic ping/pong keepalive. Unit-tested. | `Sources/Core/Network/` | T-2.0 | T-2.2, T-2.3, T-2.4 | +| T-2.2 | **Pairing flow + Keychain + TLS pinning.** QR scanner (AVFoundation), parse `pi-remote://`, exchange via Pairing.swift, store bearer + fingerprint in Keychain, install PinnedTrust into URLSession + Starscream. | `Sources/Core/Auth/`, `Sources/Core/Network/PinnedTrust.swift`, `Sources/UI/Pairing/` | T-2.0 | T-2.1, T-2.3 | +| T-2.3 | **TerminalView + Theme/Font store.** Wrap SwiftTerm as UIViewRepresentable, render incoming binary chunks, expose Pinch-Zoom gesture (iOS-B-05), Selection/Copy (iOS-B-04). Bundle JetBrains Mono + Hack + default themes. | `Sources/UI/Terminal/`, `Apps/piRemote/Resources/` | T-2.0 | T-2.1, T-2.2, T-2.4 | +| T-2.4 | **ModifierBar + Input pipeline.** Layout `[Ctrl][Esc][Tab][←↑↓→][⇧↵][🎙][📋]`, sticky Ctrl, long-press repeat, paste sheet stub (full Smart-Paste in Phase 3). Wires keys into IC-1. | `Sources/UI/Input/` | T-2.1 | T-2.2, T-2.3 | +| T-2.5 | **SessionConnection + ResumeCursor + ScrollbackCache.** One WS per session, persist lastSeq, write incoming bytes into a rolling on-disk file per session. Snapshot fallback on gap. | `Sources/Core/Sessions/SessionConnection.swift`, `Sources/Core/Network/ResumeCursor.swift`, `Sources/Core/Persistence/ScrollbackCache.swift` | T-2.1 | T-2.6, T-2.7 | +| T-2.6 | **SessionRegistry + SessionSwitcher UI.** Talks to `/sessions`, list/spawn/rename/kill, switcher UI, basic SessionRow. No thumbnails or pre-connect yet. 
| `Sources/Core/Sessions/SessionRegistry.swift`, `Sources/UI/Sessions/` | T-2.1, T-2.5 | T-2.7 | +| T-2.7 | **PreConnectPool + Optimistic Switch + Stale-Frame.** All known sessions hold a hot WS + last frame; switching shows the cached frame instantly with a "syncing…" pill. | `Sources/Core/Sessions/PreConnectPool.swift`, `Sources/UI/Terminal/TerminalView.swift` (cache hooks) | T-2.5, T-2.6 | T-2.8 | +| T-2.8 | **StatusBar + side-channel consumption.** Subscribe to `state` frames, render `● thinking` / `▶ awaiting` / `⏸ idle`, session-name display. | `Sources/UI/Status/`, `Sources/Core/Sessions/SessionConnection.swift` (event surface) | T-2.5 | T-2.7 | +| T-2.9 | **Push: NotificationDelegate + DeviceTokenRegistrar.** Request user permission, register for remote notifications, ship `{ deviceToken, environment }` to sidecar at pair-time and on every launch. Foreground-handler suppresses banners when relevant session is visible. | `Sources/Core/Push/`, edits in pairing/Settings flow | T-2.2, Phase 1 T-1.10 | T-2.8 | +| T-2.10 | **Background lifecycle.** App-foreground triggers reconnect + delta pull, stale-frame freezes during sync, keep-alive ping in foreground only. | `Sources/Core/Sessions/SessionConnection.swift`, app delegate | T-2.5, T-2.7 | parallel with T-2.9 | +| T-2.11 | **Face-ID gate + Settings.** Opt-in toggle, gate appears on cold launch and on resume after > N seconds backgrounded. | `Sources/UI/Settings/`, `Sources/Core/Auth/Keychain.swift` | T-2.0 | parallel with most | +| T-2.12 | **TestFlight pipeline.** Build script, archive, upload, internal testers list. Verify production APNs path. | `docs/DISTRIBUTION.md`, Fastlane or shell scripts | T-2.0, T-2.9 | parallel with everything once T-2.0 is in | +| T-2.13 | **MVP smoke test.** Manual checklist run on the user's iPhone: pair → render → input → backgrounded → push → reopen < 1s → session-switch round-trip. Document any deviations. 
| `docs/PHASE-2-report.md` | all above | none | + +## Interface Contracts + +iOS consumes the IC-1..IC-4 contracts defined in Phase 1. Any deviation +discovered while building is fixed in the sidecar, not the app, and must +be communicated via `SYNC.md` (lock change protocol). + +Additional iOS-internal contract: + +### IC-2.1 — SessionConnection surface + +```swift +protocol SessionConnection { + var id: String { get } + var state: AnyPublisher { get } + var stream: AnyPublisher { get } // ANSI bytes, in order + func send(_ frame: ClientToServer) async throws + func resume(from lastSeq: UInt64?) async throws + func suspend() async // tear down WS but keep state +} +``` + +Owners of `SessionConnection`: T-2.5. Consumers: T-2.6, T-2.7, T-2.8, T-2.10. + +## Branching Strategy + +- All work on `pi-remote-ios`, off `main`. +- One branch per T-2.x task, `feat/p2--`. +- T-2.0 must land first. +- T-2.1, T-2.2, T-2.3, T-2.4, T-2.11, T-2.12 can start in parallel right + after T-2.0. +- T-2.5..T-2.10 form a dependency cluster but most can interleave. +- T-2.13 last. + +## Test Strategy + +- Unit tests for FrameCodec, ResumeCursor, theme parsing. +- UI snapshot tests for ModifierBar, SessionRow, StatusBar. +- Manual on-device testing via T-2.13 checklist. +- No XCUITest in MVP — too brittle for the time invested. + +## Risks + +- **R1.** Apple Developer enrolment delays. Workaround: dev with personal + team + free sideloading for the first 1-2 weeks; switch to paid account + before T-2.12. +- **R2.** Starscream's permessage-deflate compat with our `ws` library + needs verification with a smoke test early — block T-2.1 PR until + proven. +- **R3.** SwiftTerm's alternate-screen handling vs. our scrollback cache. + Cache must skip bytes while alternate-screen is active. Spec calls for + this; implementation needs care. +- **R4.** Push notification permission UX. If user declines, iOS-C-02 + degrades to silent. Provide a Settings deep-link to re-enable. 
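
The skip rule in R3 can be sketched as a small stateful filter. The sketch below is written in TypeScript to match the sidecar contracts; the app itself would port the idea to Swift. It assumes that only the `\e[?1049h` / `\e[?1049l` pair toggles the alternate screen, that chunks are decoded one character per byte, and that the markers themselves are not cached. `AltScreenFilter` and `partialSuffix` are hypothetical names.

```typescript
const ENTER_ALT = "\x1b[?1049h";
const LEAVE_ALT = "\x1b[?1049l";

// Length of the longest proper prefix of `marker` that `data` ends with.
function partialSuffix(data: string, marker: string): number {
  const max = Math.min(data.length, marker.length - 1);
  for (let n = max; n > 0; n--) {
    if (data.slice(-n) === marker.slice(0, n)) return n;
  }
  return 0;
}

// Hypothetical filter: returns only the bytes that belong in the
// scrollback cache, dropping everything seen on the alternate screen.
class AltScreenFilter {
  private inAlt = false;
  private carry = ""; // possible start of a marker split across chunks

  push(chunk: string): string {
    let data = this.carry + chunk;
    this.carry = "";
    let out = "";
    while (data.length > 0) {
      const marker = this.inAlt ? LEAVE_ALT : ENTER_ALT;
      const idx = data.indexOf(marker);
      if (idx >= 0) {
        if (!this.inAlt) out += data.slice(0, idx);
        data = data.slice(idx + marker.length);
        this.inAlt = !this.inAlt;
        continue;
      }
      // No complete marker left: hold back a trailing partial marker.
      const keep = partialSuffix(data, marker);
      if (!this.inAlt) out += data.slice(0, data.length - keep);
      this.carry = data.slice(data.length - keep);
      break;
    }
    return out;
  }
}

const filter = new AltScreenFilter();
console.log(JSON.stringify(filter.push("ab\x1b[?10")));
console.log(JSON.stringify(filter.push("49hMENU\x1b[?1049lcd")));
```

Holding back a partial marker matters because a WebSocket frame boundary can fall in the middle of an escape sequence; without the carry, the first half would leak into the cache and the toggle would be missed.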
+ +## Exit / Handover + +- All T-2.x merged. +- T-2.13 report green. +- App on user's iPhone in daily use. +- `docs/PHASE-2-report.md` in this repo, summary mirrored into `SYNC.md`. +- Trigger Phase 3. diff --git a/docs/PHASE-3-ios-augmentation.md b/docs/PHASE-3-ios-augmentation.md new file mode 100644 index 0000000..e19bbe5 --- /dev/null +++ b/docs/PHASE-3-ios-augmentation.md @@ -0,0 +1,81 @@ +# Phase 3 — iOS Augmentation + +> **Status:** blocked on Phase 2 MVP shipping. +> **Owners:** highly parallelisable — features are largely independent. +> **Repo:** `pi-remote-ios`. +> **Spec reference:** [`reference/SPEC-ios-app.md`](./reference/SPEC-ios-app.md) §5 +> Groups B-06, B-07, B-08, B-09, C-03, C-04, C-05, D-01c, D-02, A-05 extensions. + +## Goal + +Make the iOS app distinctly nicer to use than a generic terminal client. +Each Phase 3 feature is independently shippable; no global blocker. +Features can land in any order, driven by daily use feedback. + +## Acceptance Criteria + +Per-feature checklist (each feature ships when its row passes): + +| Feature | Acceptance | +|---|---| +| Slash-Command Palette | Long-press modifier bar opens palette, fuzzy search works, command injects correctly, argument forms render for commands with args. | +| Voice-to-Prompt | Mic button → preview → send works offline (iOS Speech). | +| Predictive Thumbnails | Switcher list shows live 40×12 capture-pane previews refreshed on open. | +| Scrollback Search | Cmd-F (HW kb) or pull-down gesture opens search; jump-to-match highlights and centres. | +| Hardware Keyboard Shortcuts | Cmd-K, Cmd-T, Cmd-1..9, Cmd-F, Cmd-Shift-P, Cmd-, route correctly. | +| Reachability | iPhone landscape: modifier bar mirrored for one-handed use. | +| Smart Paste (full) | Clipboard preview chip, multi-line preview sheet, bracketed-paste correctness. | +| Haptic Feedback | Subtle haptic on thinking→idle and thinking→awaiting transitions. 
| +| Theme + Font Picker UI | Settings UI exposes all bundled themes and fonts; iCloud-sync for custom. | + +## Task Breakdown + +| ID | Task | Touches | Depends on | Parallel With | +|---|---|---|---|---| +| T-3.1 | **Slash-Command Palette** (iOS-C-04). Long-press recogniser on ModifierBar, palette sheet, fuzzy-search engine, argument form generator from JSON schema returned by sidecar. | `Sources/UI/Input/SlashPalette/`, sidecar `/sessions/:id/commands` already exists (S-08) | Phase 2 | all others | +| T-3.2 | **Voice-to-Prompt** (iOS-C-05). `Sources/Voice/`, SFSpeechRecognizer, microphone permission, preview-edit-send flow. | `Sources/Voice/`, `Sources/UI/Input/ModifierBar.swift` (🎙 wiring) | Phase 2 | all others | +| T-3.3 | **Predictive Thumbnails** (iOS-D-01c). Add `GET /sessions/:id/thumbnail` on sidecar if not already (Phase 1 IC-2 includes it); poll on switcher open; render small SwiftTerm in `SessionRow`. | `Sources/UI/Sessions/SessionRow.swift`, `Sources/Core/Sessions/SessionRegistry.swift` | Phase 2 | all others | +| T-3.4 | **Scrollback Search** (iOS-D-02). Search bar over `ScrollbackCache`, in-memory index (linear search is fine at 5MB), highlight + jump in TerminalView. | `Sources/UI/Terminal/Search/`, `Sources/Core/Persistence/ScrollbackCache.swift` (read API) | Phase 2 | all others | +| T-3.5 | **Hardware Keyboard Shortcuts** (iOS-B-06). Register `UIKeyCommand` set in piRemoteApp + scene; route to app actions. Caps→Esc opt-in. | `Apps/piRemote/piRemoteApp.swift`, individual view controllers via scene delegate | Phase 2 | T-3.1 (Cmd-Shift-P depends on slash palette existing) | +| T-3.6 | **Reachability / One-Hand-Mode** (iOS-B-07). Landscape layout in `ModifierBar` mirrored; settings toggle. | `Sources/UI/Input/ModifierBar.swift` | Phase 2 | all others | +| T-3.7 | **Smart Paste full** (iOS-B-08 + iOS-B-09 bracketed-paste). 
Extend stub PasteSheet from Phase 2 with multi-line preview, char/line counter; track `\e[?2004h/l` from stream, switch paste-frame type accordingly. | `Sources/UI/Input/PasteSheet.swift`, `Sources/Core/Sessions/SessionConnection.swift` (state tracker) | Phase 2 | all others | +| T-3.8 | **Haptic Feedback** (iOS-C-03). `UIImpactFeedbackGenerator` hook in StatusBar state change. Setting to disable. | `Sources/UI/Status/StatusBar.swift`, Settings | Phase 2 | all others | +| T-3.9 | **Theme + Font Picker** (iOS-A-05 UI). Settings panes for theme/font selection, custom-theme editor (JSON or color pickers), iCloud KVS sync for custom. | `Sources/UI/Settings/`, `Sources/UI/Terminal/ThemeStore.swift` extensions | Phase 2 | all others | + +## Inter-Task Conflicts + +Most Phase 3 tasks touch unrelated files. Watch zones: + +- **ModifierBar.swift** — T-3.1 (long-press), T-3.2 (mic), T-3.6 (mirror), + T-3.7 (paste). Coordinate via SYNC.md if more than one of these is in + flight simultaneously. Recommended order: T-3.1 → T-3.7 → T-3.6 → T-3.2. +- **SessionConnection.swift** — T-3.7 (bracketed-paste state) and any + follow-up to Phase 2 IC-2.1. Coordinate. +- **Settings UI** — T-3.6, T-3.8, T-3.9 all extend the same settings + surface. Land in series or merge carefully. + +## Test Strategy + +Per feature: a manual checklist row in `docs/PHASE-3-checklist.md`. No +heavy automation — these are visual / experiential features. + +Critical regressions to watch for: +- Slash palette must not break input flow when dismissed (focus return). +- Voice must not steal focus from the WebSocket stream. +- Thumbnails must not block switcher rendering on slow links. + +## Risks + +- **R1.** SFSpeechRecognizer offline accuracy varies. Mitigation: allow + on-device-only mode (slower, more private) vs. server-assisted toggle. +- **R2.** UIKeyCommand routing is finicky across scenes. Mitigation: + centralise key handling in a single `KeyCommandRouter` actor. 
+- **R3.** Custom theme JSON schema drift between iCloud devices. Mitigation: + versioned schema, migrate on read. + +## Exit + +Phase 3 has no hard exit — features land continuously. A "Phase 3 closed" +event is when every task above is shipped or explicitly deferred. At that +point write `docs/PHASE-3-report.md` summarising what made it, what didn't, +and ideas that came out of daily use for a future Phase 4. diff --git a/docs/README.md b/docs/README.md new file mode 100644 index 0000000..fc7ae08 --- /dev/null +++ b/docs/README.md @@ -0,0 +1,22 @@ +# Implementation Docs + +This folder drives the implementation work for the pi-remote iOS app and +its sidecar. Background / spec / audit material lives in +[`reference/`](./reference/). + +| File | Purpose | +|---|---| +| [`PHASE-0-spike-stream.md`](./PHASE-0-spike-stream.md) | Stream PoC — verify tmux + pipe-pane + WebSocket. ~1 day, single agent. | +| [`PHASE-1-sidecar.md`](./PHASE-1-sidecar.md) | Sidecar production-ready: all S-features, multi-agent parallel work. | +| [`PHASE-2-ios-mvp.md`](./PHASE-2-ios-mvp.md) | iOS app MVP — Groups A, B, C-01/02, D, E, F. Multi-agent parallel. | +| [`PHASE-3-ios-augmentation.md`](./PHASE-3-ios-augmentation.md) | iOS feature polish — slash palette, voice, thumbnails, search, etc. | +| [`SYNC.md`](./SYNC.md) | Live multi-agent coordination — claims, file ownership, contract changes. | + +## Order of work + +1. Phase 0 first, single agent. +2. Phase 1 starts after Phase 0 green-lights; multi-agent parallel. +3. Phase 2 starts after Phase 1 is production-ready; multi-agent parallel. +4. Phase 3 is continuous after Phase 2 MVP ships. + +See `SYNC.md` for the current state. diff --git a/docs/SYNC.md b/docs/SYNC.md new file mode 100644 index 0000000..3377af8 --- /dev/null +++ b/docs/SYNC.md @@ -0,0 +1,158 @@ +# SYNC — Multi-Agent Coordination + +> **Purpose:** allow several agents (human or AI) to work concurrently on +> this codebase without stepping on each other. 
+> +> **Scope:** all phases. This document is the live coordination surface; +> the phase plans (`PHASE-0..PHASE-3`) are immutable plans, while this file +> tracks who is doing what *right now*. + +--- + +## How this works + +1. Every concrete work item lives in a phase plan as `T-<phase>.<task>`. +2. Before starting work on a task, an agent: + - Pulls latest `main`. + - Edits the **Active Claims** table below to add a row claiming the + task with its branch name, owner handle, and timestamp. + - Commits that edit on `main` directly (small, low-conflict). + - Then opens the feature branch and works. +3. When the task is done (PR merged) the agent removes its claim row and + appends a one-line entry to **History**. +4. If a task needs to **change a frozen interface contract** (IC-1..IC-4 + from Phase 1, IC-2.1 from Phase 2), the agent must: + - Open a section under **Contract Change Requests** below. + - Wait for at least one other active agent (or the orchestrator) to + acknowledge by editing the row to `acked: @handle`. + - Only then implement the change. + +The point: no central scheduler is required. A short structured edit on +`main` is the lock. + +--- + +## Phase Gate + +| Phase | Status | Notes | +|---|---|---| +| Phase 0 — Spike Stream | not started | First task. See `PHASE-0-spike-stream.md`. | +| Phase 1 — Sidecar | blocked on Phase 0 | Can begin only after Phase 0 verdict is green. | +| Phase 2 — iOS MVP | blocked on Phase 1 | Sidecar must be reachable and stable. | +| Phase 3 — iOS Augmentation | blocked on Phase 2 | Continuous after MVP ships. | + +Update the **Status** column when a phase transitions. Allowed states: +`not started`, `in progress`, `blocked on …`, `done`. 
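As a rough illustration of the gating rule in the table above, the sketch below models the four phases and checks whether one may start. The `PhaseGate` type, the `gates` array, and `canStart` are hypothetical names, not part of the repository, and treating `done` as the only status that unblocks the next phase is an assumption.

```typescript
// Hypothetical model of the Phase Gate table; all names are illustrative.
type PhaseStatus =
  | "not started"
  | "in progress"
  | "done"
  | `blocked on ${string}`;

interface PhaseGate {
  phase: string;
  status: PhaseStatus;
}

// Phases in the order they appear in the table, with their initial states.
const gates: PhaseGate[] = [
  { phase: "Phase 0", status: "not started" },
  { phase: "Phase 1", status: "blocked on Phase 0" },
  { phase: "Phase 2", status: "blocked on Phase 1" },
  { phase: "Phase 3", status: "blocked on Phase 2" },
];

// Assumption: a phase may start only when every earlier phase is "done".
function canStart(all: PhaseGate[], phase: string): boolean {
  const idx = all.findIndex((g) => g.phase === phase);
  if (idx < 0) return false; // unknown phase names are never startable
  return all.slice(0, idx).every((g) => g.status === "done");
}
```

With the initial table, `canStart(gates, "Phase 0")` holds and every later phase stays gated until its predecessors are marked `done`.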
+ +--- + +## Active Claims + +| Task | Branch | Owner | Claimed at | ETA | Notes | +|---|---|---|---|---|---| +| _(none)_ | | | | | | + +Example of a filled row: +``` +| T-1.1 | feat/p1-t1-1-tmux-manager | @jay | 2026-05-20 14:00 | +2d | starting with manager.ts | +``` + +Rules: +- **One row per task.** A task can have only one active owner. +- **Owner** = the agent's handle (`@jay`, `@worker-1`, `@scout`, etc.). +- **ETA** is a rough estimate; missing it is OK, but if a row is stale > 2× ETA, anybody may reclaim after pinging. +- **Branch** must exist on the remote within 24h of the claim, otherwise + the row is considered abandoned and may be removed. + +--- + +## File Ownership Map + +For each high-traffic file, the table below lists the tasks that may +legitimately modify it. If you need to touch a file outside this list, +add a row or open a Contract Change Request. + +| File | Authorised Tasks | +|---|---| +| `extensions/remote-control/index.ts` | T-1.0, T-1.4 (events wiring only) | +| `extensions/remote-control/server.ts` (legacy) | nobody after T-1.0; legacy frozen | +| `extensions/remote-control/server/**` | T-1.0 (refactor), T-1.5, T-1.6, T-1.7 | +| `extensions/remote-control/tmux/**` | T-1.1 | +| `extensions/remote-control/buffer/**` | T-1.2 | +| `extensions/remote-control/sequence.ts` | T-1.2 | +| `extensions/remote-control/auth/**` | T-1.3, T-1.10 (device tokens only) | +| `extensions/remote-control/pi/**` | T-1.4 | +| `extensions/remote-control/apns/**` | T-1.10, Phase-2 T-2.9 (when iOS supplies tokens) | +| `extensions/remote-control/cli/**` | T-1.3, T-1.7 | +| `extensions/remote-control/config.ts` | T-1.7 | +| `docs/SYNC.md` | all (this file) | +| `docs/PHASE-*.md` | nobody once a phase has started (frozen plan) — open a CCR to amend | +| `docs/reference/**` | nobody during implementation — archival | + +For the iOS repo `pi-remote-ios`, an analogous map will be added when +Phase 2 kicks off. 
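The ownership map above could be checked mechanically, for example in a pre-merge script. The following is a minimal sketch under stated assumptions: only four rows of the table are copied in, `/**` is treated as a plain directory-prefix match, and the `ownership` list, `matches`, and `checkOwnership` are hypothetical names rather than existing tooling.

```typescript
// Hypothetical subset of the File Ownership Map; not an existing script.
interface OwnershipRow {
  pattern: string; // exact path, or a directory followed by "/**"
  tasks: string[]; // task IDs allowed to modify matching files
}

const ownership: OwnershipRow[] = [
  { pattern: "extensions/remote-control/tmux/**", tasks: ["T-1.1"] },
  { pattern: "extensions/remote-control/buffer/**", tasks: ["T-1.2"] },
  { pattern: "extensions/remote-control/sequence.ts", tasks: ["T-1.2"] },
  { pattern: "extensions/remote-control/config.ts", tasks: ["T-1.7"] },
];

// Simplified matching: "dir/**" matches anything under "dir/".
function matches(pattern: string, file: string): boolean {
  if (pattern.endsWith("/**")) {
    return file.startsWith(pattern.slice(0, -2)); // keeps the trailing "/"
  }
  return file === pattern;
}

type OwnershipResult = "allowed" | "denied" | "unmapped";

// "unmapped" means the file has no row yet; per the text above, the
// agent would then add a row or open a Contract Change Request.
function checkOwnership(task: string, file: string): OwnershipResult {
  const row = ownership.find((r) => matches(r.pattern, file));
  if (row === undefined) return "unmapped";
  return row.tasks.includes(task) ? "allowed" : "denied";
}
```

Returning a separate `unmapped` result, rather than folding it into `denied`, keeps the two follow-up actions distinct: a denied edit belongs to another task, while an unmapped file only needs a new row.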
+ +--- + +## Frozen Interface Contracts + +| ID | Defined in | Owner of changes | +|---|---|---| +| IC-1 — WebSocket frame protocol | `PHASE-1-sidecar.md` §Interface Contracts | T-1.5 lead, with sign-off from any active T-2.x owner | +| IC-2 — HTTP REST shape | `PHASE-1-sidecar.md` §Interface Contracts | T-1.5..T-1.7 leads | +| IC-3 — Pairing payload | `PHASE-1-sidecar.md` §Interface Contracts | T-1.3 lead | +| IC-4 — Config TOML schema | `PHASE-1-sidecar.md` §Interface Contracts | T-1.7 lead | +| IC-2.1 — `SessionConnection` Swift surface | `PHASE-2-ios-mvp.md` §Interface Contracts | T-2.5 lead | + +Once a contract is *frozen* (i.e. at least one consumer task has started +work that depends on it), changes require a CCR. + +--- + +## Contract Change Requests (CCR) + +Format: + +``` +### CCR-YYYY-MM-DD- +- **Contract**: IC-1 / IC-2 / … +- **Proposer**: @handle +- **Motivation**: 1-3 sentences. +- **Proposed change**: diff or prose. +- **Affected tasks**: list. +- **Status**: open | acked by @… | merged | rejected +``` + +_(none open)_ + +--- + +## Cross-Phase Notes + +Threads that don't belong in a single phase plan. + +- **Spec deviations.** If implementation reveals that a spec item is + wrong or unbuildable, write a short note here under **History**, plus + open a CCR if it changes a frozen contract. The spec itself stays + immutable until a v4 review round. +- **Risks materialised.** When a `Risks` row from a phase plan actually + hits, log it here with the workaround used. +- **Tools / shared scripts.** Anything added under `scripts/` that's + reused across tasks gets a one-liner here. + +--- + +## History + +Append-only log of completed work and notable events. One line each. 
+ +``` +yyyy-mm-dd @handle T-x.y what was done +``` + +Example: +``` +2026-05-15 @jay init docs reorganised; phase plans + SYNC created +``` + +(populated as work happens) diff --git a/docs/ARCHITECTURE.md b/docs/reference/ARCHITECTURE.md similarity index 100% rename from docs/ARCHITECTURE.md rename to docs/reference/ARCHITECTURE.md diff --git a/docs/EXTENSION-API-AUDIT.md b/docs/reference/EXTENSION-API-AUDIT.md similarity index 100% rename from docs/EXTENSION-API-AUDIT.md rename to docs/reference/EXTENSION-API-AUDIT.md diff --git a/docs/reference/README.md b/docs/reference/README.md new file mode 100644 index 0000000..f90547b --- /dev/null +++ b/docs/reference/README.md @@ -0,0 +1,11 @@ +# Reference Documents + +Background and design artefacts. The implementation work is driven by the +phase plans in the parent `docs/` directory. + +| File | Purpose | +|---|---| +| [`SPEC-ios-app.md`](./SPEC-ios-app.md) | Final spec v3 for the iOS app and sidecar. The source of truth for what is being built. | +| [`EXTENSION-API-AUDIT.md`](./EXTENSION-API-AUDIT.md) | Audit of pi's ExtensionAPI — what's exposed, what's not, workarounds. Drives the realisability of S-07, S-08 and similar. | +| [`SPEC-ios-app-review-v1.md`](./SPEC-ios-app-review-v1.md) | Archived review thread of spec v1 → v2 with inline discussions. Historical, do not edit. | +| [`ARCHITECTURE.md`](./ARCHITECTURE.md) | Original architecture document for the existing `pi-remote-control` extension (HTML/WebSocket client). | diff --git a/docs/SPEC-ios-app-review-v1.md b/docs/reference/SPEC-ios-app-review-v1.md similarity index 100% rename from docs/SPEC-ios-app-review-v1.md rename to docs/reference/SPEC-ios-app-review-v1.md diff --git a/docs/SPEC-ios-app.md b/docs/reference/SPEC-ios-app.md similarity index 100% rename from docs/SPEC-ios-app.md rename to docs/reference/SPEC-ios-app.md