
# pi-fanout

Non-blocking async agent fanout for [pi](https://pi.earendil.dev).

## Problem

The built-in `subagent` tool is powerful, but its `execute()` blocks until **all**
dispatched agents finish. Although parallel tasks run concurrently inside the tool, the main
pi session stays frozen until the final result arrives.

## Solution

`pi-fanout` turns subagent dispatch into a **true async job queue**:

- `fanout_dispatch` — returns immediately with job IDs; agents run as detached processes
- `fanout_status`  — poll running/done/failed counts at any time
- `fanout_collect` — retrieve final output from completed jobs
- `fanout_abort`   — kill running jobs on demand

The main pi session stays unblocked. You can do other work, dispatch more jobs,
and collect results whenever they’re ready. When a background job finishes, the
extension sends a `followUp` message into the session so you know it’s time to
collect.
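
The "returns immediately, agents run as detached processes" behavior can be illustrated with Node's `child_process.spawn`. This is a minimal sketch, not the extension's actual code: `spawnDetached` and the log path are hypothetical names, and the real command line used to launch a pi child is not shown here.

```typescript
import { spawn } from "node:child_process";
import { closeSync, openSync } from "node:fs";
import { tmpdir } from "node:os";
import { join } from "node:path";

// Hypothetical helper: start `command` so that it keeps running after the
// parent exits, with stdout and stderr appended to `logPath`.
function spawnDetached(command: string, args: string[], logPath: string): number {
  const fd = openSync(logPath, "a");
  try {
    const child = spawn(command, args, {
      detached: true,            // child gets its own process group/session
      stdio: ["ignore", fd, fd], // no pipes tying the child to the parent
    });
    child.unref();               // parent's event loop does not wait for the child
    if (child.pid === undefined) {
      throw new Error(`failed to start ${command}`);
    }
    return child.pid;
  } finally {
    closeSync(fd);               // the child keeps its own copy of the descriptor
  }
}

// Example: run a short Node script in the background and return at once.
const demoPid = spawnDetached(
  process.execPath,
  ["-e", "console.log('hello from a detached child')"],
  join(tmpdir(), "fanout-demo.log"),
);
console.log(`started detached child with pid ${demoPid}`);
```

Because nothing awaits the child, the caller gets a PID back right away and can record it for later polling.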

## Architecture

```
pi main process
├─ pi-fanout extension
│  ├─ JobManager (in-memory + disk state in ~/.pi/fanout/jobs/)
│  └─ Poller (every 2s: check PIDs, notify on completion)
│
├─ detached pi child (Agent A) ── writes output to job dir
├─ detached pi child (Agent B) ── writes output to job dir
└─ detached pi child (Agent C) ── writes output to job dir
```

Jobs survive pi restarts because:
1. Child processes are **detached** from the parent
2. State is persisted as `meta.json` + `output.jsonl` per job
3. On startup the extension rehydrates old jobs and checks whether their PIDs are still alive
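
A startup liveness check along these lines could be sketched as follows. The `PersistedJob` shape is a hypothetical subset of `meta.json`, and `partitionOnStartup` is an illustrative name. The only real API used is `process.kill(pid, 0)`, which tests for a process without sending a signal.

```typescript
// Hypothetical subset of what meta.json might hold.
interface PersistedJob {
  id: string;
  pid: number;
  status: string;
}

// Signal 0 performs the existence/permission check only.
function isPidAlive(pid: number): boolean {
  try {
    process.kill(pid, 0);
    return true;
  } catch (err) {
    // EPERM means the process exists but we may not signal it.
    return (err as { code?: string }).code === "EPERM";
  }
}

// Split jobs recorded as "running" into those whose process is still
// there and those that exited while pi was not watching.
function partitionOnStartup(
  jobs: PersistedJob[],
  alive: (pid: number) => boolean = isPidAlive,
): { stillRunning: PersistedJob[]; needsFinalizing: PersistedJob[] } {
  const stillRunning: PersistedJob[] = [];
  const needsFinalizing: PersistedJob[] = [];
  for (const job of jobs) {
    if (job.status !== "running") continue;
    (alive(job.pid) ? stillRunning : needsFinalizing).push(job);
  }
  return { stillRunning, needsFinalizing };
}
```

A bare PID check can be fooled if the operating system has reused the PID for an unrelated process, so a sturdier version would compare something extra, such as a recorded start time.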

## Install

```bash
pi use git:git.vpsj.de/jay/pi-fanout
```

Or clone into your extensions directory and add it to `~/.pi/extensions.json`.

## Usage

```
> fanout_dispatch tasks=[{agent:"worker", task:"Refactor auth.ts"}, {agent:"reviewer", task:"Review auth.ts"}]
Dispatched 2 job(s). IDs:
ltv123-abc
ltv124-def

> ... do other work in the main session ...

> fanout_status
Jobs: 2 total — 0 running, 0 queued, 2 done, 0 failed/aborted

> fanout_collect jobIds=["ltv123-abc","ltv124-def"]
[ltv123-abc] done (exit 0) [claude-sonnet-4]
Refactored auth.ts to use bearer tokens...

---

[ltv124-def] done (exit 0) [claude-sonnet-4]
The refactored auth.ts looks solid. One suggestion: ...
```

## Tools

### `fanout_dispatch`

Parameters:
- `tasks`: array of `{ agent, task, cwd?, model?, tools? }`
- `agentScope`: `"user" | "project" | "both"` (default `"user"`)

Returns: `{ dispatched: string[] }` — job IDs.
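
For reference, the parameter shape above could be written as TypeScript types roughly like this. The field types (plain strings, `tools` as a string array) and the `withDispatchDefaults` helper are assumptions for illustration; only the field names and the `"user"` default come from this README.

```typescript
type FanoutAgentScope = "user" | "project" | "both";

interface FanoutTaskSpec {
  agent: string;
  task: string;
  cwd?: string;
  model?: string;
  tools?: string[]; // element type is an assumption
}

interface FanoutDispatchParams {
  tasks: FanoutTaskSpec[];
  agentScope?: FanoutAgentScope;
}

// Hypothetical helper: apply the documented default and reject tasks that
// lack the two required fields (the validation itself is an assumption).
function withDispatchDefaults(
  params: FanoutDispatchParams,
): Required<FanoutDispatchParams> {
  for (const t of params.tasks) {
    if (!t.agent || !t.task) {
      throw new Error("each task needs a non-empty agent and task");
    }
  }
  return { tasks: params.tasks, agentScope: params.agentScope ?? "user" };
}
```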

### `fanout_status`

Parameters:
- `jobIds?`: filter to specific IDs (omit for all)
- `includeDone?`: include finished jobs in listing (default `true`)

Returns per-job: `id`, `status`, `agent`, `task`, `exitCode`, `pid`, `turns`, `cost`.
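
The counts shown in the usage transcript could be derived from the per-job rows as in the sketch below. The status names are taken from that transcript. The `summarizeJobs` helper and the reading of `includeDone` as "hide everything that is no longer queued or running" are assumptions.

```typescript
type FanoutJobStatus = "queued" | "running" | "done" | "failed" | "aborted";

interface FanoutStatusRow {
  id: string;
  status: FanoutJobStatus;
}

// Hypothetical helper: tally jobs by status and optionally drop finished
// jobs from the listing.
function summarizeJobs(rows: FanoutStatusRow[], includeDone = true) {
  const counts: Record<FanoutJobStatus, number> = {
    queued: 0,
    running: 0,
    done: 0,
    failed: 0,
    aborted: 0,
  };
  for (const row of rows) {
    counts[row.status] += 1;
  }
  const listed = includeDone
    ? rows
    : rows.filter((r) => r.status === "queued" || r.status === "running");
  return { total: rows.length, counts, listed };
}
```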

### `fanout_collect`

Parameters:
- `jobIds`: array of IDs to collect

Returns per-job: `id`, `status`, `output` (final assistant text), `exitCode`, `usage`, `modelUsed`, `errorMessage`.

### `fanout_abort`

Parameters:
- `jobIds`: array of IDs to kill

Sends `SIGTERM`, then `SIGKILL` after 5s if still running.
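
The escalation can be sketched as below. `abortJob` and its injectable `kill` and `schedule` parameters are illustrative names chosen so the logic can be exercised without real processes. The defaults use Node's `process.kill` and `setTimeout`.

```typescript
type KillSignal = "SIGTERM" | "SIGKILL" | 0;
type KillFn = (pid: number, signal: KillSignal) => void;
type ScheduleFn = (fn: () => void, ms: number) => void;

// Hypothetical helper: ask politely first, then force-kill after the grace
// period if the process is still there.
function abortJob(
  pid: number,
  graceMs = 5000,
  kill: KillFn = (p, s) => { process.kill(p, s); },
  schedule: ScheduleFn = (fn, ms) => { setTimeout(fn, ms); },
): void {
  try {
    kill(pid, "SIGTERM");
  } catch {
    return; // process already gone, nothing to escalate
  }
  schedule(() => {
    try {
      kill(pid, 0);         // throws if the process has exited
      kill(pid, "SIGKILL"); // still running: force it
    } catch {
      // exited during the grace period
    }
  }, graceMs);
}
```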

## How it works with the agent loop

1. The LLM calls `fanout_dispatch`.
2. The tool returns **in <50ms** with job IDs.
3. The LLM is free to continue the conversation, run other tools, or ask the user.
4. Every 2 seconds the extension polls job PIDs.
5. When a job transitions to `done`/`failed`, the extension calls:
   ```ts
   pi.sendUserMessage(`Fanout job X completed ...`, { deliverAs: "followUp" })
   ```
   This injects a user-style message that triggers a follow-up turn when the agent is idle.
6. The LLM sees the notification and calls `fanout_collect` to retrieve outputs.
7. The LLM acts on the collected results (e.g. synthesize a final answer, dispatch follow-up jobs).
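
Steps 4 and 5 can be sketched as a single poll tick plus a timer. This is an illustration under stated assumptions: `pollOnce`, `startPoller`, and the `PollDeps` callbacks are hypothetical, `notify` stands in for the `pi.sendUserMessage(..., { deliverAs: "followUp" })` call, and deciding `done` versus `failed` from a persisted exit code is a guess at how the extension classifies results.

```typescript
type PolledStatus = "running" | "done" | "failed";

interface PolledJob {
  id: string;
  pid: number;
  status: PolledStatus;
}

interface PollDeps {
  isAlive: (pid: number) => boolean;             // e.g. a process.kill(pid, 0) probe
  readExitCode: (jobId: string) => number | null; // assumed to come from the job dir
  notify: (message: string) => void;              // stand-in for pi.sendUserMessage
}

// One tick: every job that was running and whose process has disappeared
// is moved to a final state and announced exactly once.
function pollOnce(jobs: PolledJob[], deps: PollDeps): void {
  for (const job of jobs) {
    if (job.status !== "running" || deps.isAlive(job.pid)) continue;
    job.status = deps.readExitCode(job.id) === 0 ? "done" : "failed";
    deps.notify(`Fanout job ${job.id} completed (${job.status})`);
  }
}

// Run a tick every `intervalMs` (2s by default); returns a stop function.
function startPoller(jobs: PolledJob[], deps: PollDeps, intervalMs = 2000): () => void {
  const timer = setInterval(() => pollOnce(jobs, deps), intervalMs);
  return () => clearInterval(timer);
}
```

Since the status changes before `notify` runs, later ticks skip the job, so each completion produces one notification.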

## Limitations / Roadmap

- **No true server push into a running turn**: If pi is mid-stream when a job finishes, the notification is queued as `followUp` and processed when the current turn ends.
- **No job result streaming** into the main session yet. Jobs are collected atomically after completion.
- **No automatic `fanout_collect`**: The LLM must explicitly call it. Future versions could auto-inject a tool-call hint.
- **Fixed retention**: Jobs older than 24h are auto-cleaned on startup.
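
The 24h cleanup rule amounts to an age filter. The sketch below assumes each job records a creation timestamp. The `createdAtMs` field and `selectExpired` helper are hypothetical, and whether age is measured from creation or completion is not specified here.

```typescript
const RETENTION_MS = 24 * 60 * 60 * 1000; // 24 hours

interface StoredJob {
  id: string;
  createdAtMs: number; // assumed field: epoch milliseconds
}

// Return the IDs of jobs whose age exceeds the retention window.
function selectExpired(jobs: StoredJob[], nowMs: number = Date.now()): string[] {
  return jobs
    .filter((job) => nowMs - job.createdAtMs > RETENTION_MS)
    .map((job) => job.id);
}
```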

## License

MIT