Journal

An IDE for AI Coding

March 16, 2026 at 10:07 AM

My idea for Tug started out as an aid for AI code refactoring. I then expanded out into creating skills and agents to guide the way I created to go from from an idea to a mergeable pull request with AI coding assistance. This turned out great, and over the past several weeks, my plan/implement/merge flow has been how I make virtually all the changes to my software. Yet, the more I got into this work, the more I came to feel that running Claude Code in a terminal—as wonderful and mind-blowingly amazing as it is—isn’t where I spend all my daily software development hours.

I want an app—an IDE for the age of AI coding. So, I decided to do this for myself. These last two words, for myself, are key. I have no idea whether anyone else ever will want to use what I’m making. It’s not a goal, nor is it a non-goal. A for myself approach just the simplest and most straighforward way for me to arrive at what I consider to be good solutions. I ask, “What do I really want here?” And then I do that, or at least try to.

I’m also lucky in that I have no time constraints. I don’t work for anyone, and I don’t plan to anytime soon. This confers a freedom for me to pursue my interests in the things I find most compelling every day, and to take the time to “do things right” in a way I get to define for myself. I’m also pretty opinionated, and I just make choices that suit me on a wide variety of topics. So there. As I said, I’m lucky.

One of the most basic questions I’ve asked myself is what an AI coding IDE should even look like, and be like, and how I should build it. Obviously, conversing / “chatting” with an AI assistant is a big part of the overall approach, but at a level even more fundamental than that, I have a question about what tech stack I should use to build this IDE around.

Part of me really wishes that a native Mac app was the answer, but I feel like Apple has drifted for more than ten years about what it means to write serious apps for the Mac. For me, native Mac development is not the future—at least it’s not the lynchpin it once was. Pity. Electron, inevitably, seems to come with a ton of bloat. Pass.

I got some experience writing React apps for the web during my time at Infactory, the startup I used to work at, but I was frustrated by how uncontrolled the state and rendering system was. It felt like it encouraged spaghetti code with a dizzyingly over-connected chain of dependencies. But, I used AI to study some well-made open-source projects, like Excalidraw, which seem to have tamed the tangle, and I developed a set of rules for developing with React. I also adopted Radix Primitives as the core of my UI component system.

I do actually load this web tech into the thinnest Mac app I could possibly write. Mostly, it’s the approach that game developers take. Ask the native OS for a window, and but then steer away from the native app frameworks. This leaves me with so much to do to get an app I’d like to spend time in. But that’s OK. I have time.

For the past few days, I’ve been working at the meeting point between a set of core components (e.g. buttons, badges, sliders, checkboxes), a system for creating themes that provides flexibility for specifying different look-and-feel light/dark options while rendering these components in a legible and accessible fashion, and a color palette system for naming and specifying colors in a pleasant fashion that is amenable to the theme system and CSS.

There’s so much to talk about, and so much work to do. So for now, here are a few words on the color palette system.

TugColor Palette System

Overview

The TugColor palette is a named color vocabulary for the --tug-color() CSS notation. Every color has a human-readable name — no numeric codes, no degree offsets, no mental math. Colors are specified by four axes: color (hue name), intensity (chroma, 0–100), tone (lightness, 0–100), and alpha (opacity, 0–100).

This document describes the naming system. For the --tug-color() syntax and the intensity/tone/alpha axes, see the parser and palette engine module docs.

Basic system: 60 named colors

The basic vocabulary has 60 names across three categories.

48 chromatic colors

48 named hue families arranged in a circular ring, mapped to OKLCH hue angles. Names are drawn from gemstones, flowers, fruits, spices, pigments, and natural phenomena.

Listed in ring order by hue angle:

#	Name	Hue °
1	garnet	2.5°
2	cherry	10°
3	scarlet	15°
4	coral	20°
5	crimson	22.5°
6	red	25°
7	vermilion	30°
8	tomato	35°
9	ember	40°
10	flame	45°
11	tangerine	50°
12	orange	55°
13	apricot	60°
14	amber	65°
15	honey	70°
16	gold	75°
17	saffron	82.5°
18	yellow	90°
19	chartreuse	102.5°
20	lime	115°
21	grass	127.5°
22	green	140°
23	jade	147.5°
24	mint	155°
25	seafoam	165°
26	teal	175°
27	aqua	187.5°
28	cyan	200°
29	azure	207.5°
30	sky	215°
31	cerulean	222.5°
32	blue	230°
33	sapphire	240°
34	cobalt	250°
35	indigo	260°
36	violet	270°
37	iris	277.5°
38	purple	285°
39	grape	292.5°
40	plum	300°
41	orchid	310°
42	pink	320°
43	peony	327.5°
44	rose	335°
45	cerise	340°
46	magenta	345°
47	fuchsia	350°
48	berry	355°

The ring wraps: berry (355°) and garnet (2.5°) are adjacent.

Canonical lightness

Each hue has a canonical lightness (canonical L) — the OKLCH lightness at which that hue achieves an aesthetically-pleasing chroma. This is the lightness where the color looks most like itself: the most “natural” and “expected” version of a color (to me) that fits in the sRGB gamut. Canonical L varies significantly across hues because the sRGB gamut boundary is irregular in OKLCH space. Yellow is naturally bright at peak saturation (canonical L = 0.901), while cherry is naturally dark (canonical L = 0.619). These values are chosen aesthetically.

The tone axis (0–100) maps to OKLCH lightness through canonical L as a piecewise linear function with a hinge at tone 50:

tone  0  →  L_DARK  (0.15)     darkest
tone 50  →  canonical L        per-hue tuned lightness
tone 100 →  L_LIGHT (0.96)     lightest

The lower half (tone 0–50) interpolates linearly between L_DARK and canonical L. The upper half (tone 50–100) interpolates between canonical L and L_LIGHT. This means the tone scale is perceptually centered on each hue’s natural brightness — tone 50 always gives you a suitably canonical version of that color, regardless of whether the hue is naturally light (like yellow) or naturally dark (like cherry).

Each hue traces a line from L_DARK through its canonical L to L_LIGHT, and the varying heights of the hinge points show how different hues peak at different lightness levels.

11 achromatic colors

11 named values on a linear light-to-dark scale. The endpoints are black and white. The nine intermediates are named for craft and mark-making materials — writing surfaces on the light end, pigments and residues on the dark end.

Name	Tone	L (oklch)
black	0	0.000
pitch	10	0.220
ink	20	0.290
charcoal	30	0.360
carbon	40	0.430
graphite	50	0.500
vellum	60	0.592
parchment	70	0.684
linen	80	0.776
paper	90	0.868
white	100	1.000

These are fixed-lightness values. --tug-color(graphite) always means L=0.5, C=0. Intensity and tone parameters are ignored (with a warning). Alpha is honored.

The gray pseudo-hue remains available for continuous achromatic access at any tone: --tug-color(gray, t: 37) produces an arbitrary gray. Named grays are the fixed reference points; gray is the continuous slider.

1 transparent

--tug-color(transparent) expands to oklch(0 0 0 / 0). All parameters are ignored. Transparent does not participate in any adjacency system.

Extended system: 176 named colors

The extended vocabulary adds hyphenated adjacency pairs to the 60 basic names.

Chromatic adjacency (circular ring)

Any two adjacent colors on the 48-color hue ring can be hyphenated. The first name is dominant — it contributes 2/3 of the hue angle, the second contributes 1/3.

hue(A-B) = (2/3 × angle(A)) + (1/3 × angle(B))

Order matters. yellow-chartreuse and chartreuse-yellow are different colors:

Chip	Expression	Hue °
	yellow	90.0°
	yellow-chartreuse	94.2°
	chartreuse-yellow	98.3°
	chartreuse	102.5°

The ring is circular — berry (355°) and garnet (2.5°) are adjacent and wrap correctly across the 360°/0° boundary.

48 adjacent pairs × 2 orderings = 96 hyphenated chromatic colors.

Achromatic adjacency (linear sequence)

The 11 achromatic colors form a linear (non-wrapping) sequence. Black and white are not adjacent — there is no wrap. Adjacency uses the same 2/3 + 1/3 weighting, applied to lightness instead of hue angle.

L(A-B) = (2/3 × L(A)) + (1/3 × L(B))

10 adjacent pairs × 2 orderings = 20 hyphenated achromatic colors.

Adjacency rules

Only adjacent pairs are valid. yellow-chartreuse works because they are neighbors. yellow-blue is a hard error at parse time.
Order encodes bias. The first name gets 2/3 weight. A-B is always closer to A.
Non-adjacent pairs are rejected, not silently resolved. This catches typos and misunderstandings early.

Presets compose with adjacency

Presets (light, dark, intense, muted, canonical) can follow a hyphenated pair:

--tug-color(cobalt-indigo-intense)         /* hue from adjacency, i/t from preset */
--tug-color(cobalt-indigo-intense, t: 30)  /* preset with tone override */

The color token is parsed as a minus-separated chain of up to three idents: COLOR, COLOR-PRESET, COLOR-ADJACENT, or COLOR-ADJACENT-PRESET.

Color counts

Category	Base	Hyphenated	Total
Chromatic	48	96	144
Achromatic	11	20	31
Transparent	1	0	1
Total	60	116	176

Plan Skeleton v2

February 26, 2026 at 8:40 AM

I use a plan skeleton file to give Claude Code a structure and format to write plans based on my requests. I used my previous revision to create 44 code changes over the past two weeks. Yesterday afternoon, I analyzed the resulting plans to eliminate the parts from the skeleton that proved to be of little use and focus on the most useful conventions.

  ┌──────────┬──────────────┬───────┬─────────────┬──────────────────────┐
  │  Prefix  │  Convention  │ Plans │ Occurrences │       Verdict        │
  ├──────────┼──────────────┼───────┼─────────────┼──────────────────────┤
  │ step-N   │ Steps        │ 44/44 │ 293         │ KEEP                 │
  ├──────────┼──────────────┼───────┼─────────────┼──────────────────────┤
  │ dNN-     │ Decisions    │ 44/44 │ 343         │ KEEP                 │
  ├──────────┼──────────────┼───────┼─────────────┼──────────────────────┤
  │ tNN-     │ Tables       │ 29/44 │ 91          │ KEEP                 │
  ├──────────┼──────────────┼───────┼─────────────┼──────────────────────┤
  │ sNN-     │ Specs        │ 28/44 │ 120         │ KEEP                 │
  ├──────────┼──────────────┼───────┼─────────────┼──────────────────────┤
  │ rNN-     │ Risks        │ 24/44 │ 53          │ KEEP                 │
  ├──────────┼──────────────┼───────┼─────────────┼──────────────────────┤
  │ lNN-     │ Lists        │ 9/44  │ 15          │ KEEP                 │
  ├──────────┼──────────────┼───────┼─────────────┼──────────────────────┤
  │ mNN-     │ Milestones   │ 5/44  │ 18          │ KEEP                 │
  ├──────────┼──────────────┼───────┼─────────────┼──────────────────────┤
  │ qNN-     │ Questions    │ 3/44  │ 6           │ KEEP                 │
  ├──────────┼──────────────┼───────┼─────────────┼──────────────────────┤
  │ cNN-     │ Concepts     │ 1/44  │ 2           │ DROP (only skeleton) │
  ├──────────┼──────────────┼───────┼─────────────┼──────────────────────┤
  │ diagNN-  │ Diagrams     │ 1/44  │ 3           │ DROP (1 real plan)   │
  ├──────────┼──────────────┼───────┼─────────────┼──────────────────────┤
  │ op-      │ Refactor ops │ 0/44  │ 1           │ DROP (skeleton only) │
  ├──────────┼──────────────┼───────┼─────────────┼──────────────────────┤
  │ cmd-     │ CLI commands │ 0/44  │ 2           │ DROP (skeleton only) │
  ├──────────┼──────────────┼───────┼─────────────┼──────────────────────┤
  │ type-    │ Schema types │ 0/44  │ 2           │ DROP (skeleton only) │
  ├──────────┼──────────────┼───────┼─────────────┼──────────────────────┤
  │ seq-     │ Sequences    │ 0/44  │ 1           │ DROP (skeleton only) │
  ├──────────┼──────────────┼───────┼─────────────┼──────────────────────┤
  │ fixture- │ Fixtures     │ 0/44  │ 4           │ DROP (skeleton only) │
  └──────────┴──────────────┴───────┴─────────────┴──────────────────────┘

The conclusion was: 8 prefixes actively used, 7 dead. I also simplified the numbering system made a couple other small edits. The result is a more concise jumping-off point for detailed plans to accomplish complex code changes.

The new plan skeleton v2 shown below.

Files:

Plan Skeleton v2 (447 lines)

<!-- tugplan-skeleton v2 -->

## <Plan Title> {#phase-slug}

**Purpose:** <1–2 sentences. What capability ships at the end of this phase?>

---

### Plan Metadata {#plan-metadata}

| Field | Value |
|------|-------|
| Owner | <name> |
| Target branch | <branch> |
| Last updated | <YYYY-MM-DD> |

---

### Phase Overview {#phase-overview}

#### Context {#context}

<1–2 paragraphs. What problem are we solving, and why now?>

#### Strategy {#strategy}

<3–7 bullets. The approach and sequencing philosophy for this phase.>

#### Success Criteria (Measurable) {#success-criteria}

> Make these falsifiable. Avoid "works well".

- <criterion> (how to measure / verify)
- <criterion> (how to measure / verify)

#### Scope {#scope}

1. <Scope item>
2. <Scope item>
3. <Scope item>

#### Non-goals (Explicitly out of scope) {#non-goals}

- <Non-goal>
- <Non-goal>

#### Dependencies / Prerequisites {#dependencies}

- <Dependency>
- <Prerequisite>

#### Constraints {#constraints}

- <platform/tooling/perf/security constraints>

#### Assumptions {#assumptions}

- <assumption>
- <assumption>

---

### Reference and Anchor Conventions (MANDATORY) {#reference-conventions}

This plan format relies on **explicit, named anchors** and **rich `References:` lines** in execution steps.

#### 1) Use explicit anchors everywhere you will cite later

- **Technique**: append an explicit anchor to the end of a heading using `{#anchor-name}`.
  - Example:
    - `### Design Decisions {#design-decisions}`
    - `#### [D01] Workspace snapshots are immutable (DECIDED) {#d01-snapshots-immutable}`
- **Why**: do not rely on auto-generated heading slugs; explicit anchors are stable when titles change.

#### 2) Anchor naming rules (lock these in)

- **Allowed characters**: lowercase `a–z`, digits `0–9`, and hyphen `-` only.
- **Style**: short, semantic, **kebab-case**, no phase numbers (anchors should survive renumbering).
- **Prefix conventions (use these consistently)**:
  - **`step-N`**: execution step anchors, e.g. `{#step-1}`, `{#step-2}`, `{#step-3}`
  - **`step-N-M`**: substep anchors, e.g. `{#step-2-1}`, `{#step-2-2}`
  - **`dNN-...`**: design decisions (`[D01]`) anchors, e.g. `{#d01-sandbox-copy}`
  - **`qNN-...`**: open questions (`[Q01]`) anchors, e.g. `{#q01-import-resolution}`
  - **`rNN-...`**: risk notes (`Risk R01`) anchors, e.g. `{#r01-perf-regression}`
  - **`lNN-...`**: lists (`List L01`) anchors, e.g. `{#l01-supported-ops}`
  - **`mNN-...`**: milestones (`Milestone M01`) anchors, e.g. `{#m01-first-ship}`
  - **`sNN-...`**: specs (`Spec S01`) anchors, e.g. `{#s01-command-response}`
  - **Domain anchors**: for major concepts/sections, use a clear noun phrase, e.g. `{#cross-platform}`, `{#config-schema}`, `{#error-scenarios}`

#### 3) Stable label conventions (for non-heading artifacts)

Use stable labels so steps can cite exact plan artifacts even when prose moves around:

- **Design decisions**: `#### [D01] <Title> (DECIDED) {#d01-...}`
- **Open questions**: `#### [Q01] <Title> (OPEN) {#q01-...}`
- **Specs**: `**Spec S01: <Title>** {#s01-slug}` (or make it a `####` heading if you prefer)
- **Tables**: `**Table T01: <Title>** {#t01-slug}`
- **Lists**: `**List L01: <Title>** {#l01-slug}`
- **Risks**: `**Risk R01: <Title>** {#r01-slug}`
- **Milestones**: `**Milestone M01: <Title>** {#m01-slug}`

Numbering rules:
- Always use **two digits**: `D01`, `Q01`, `S01`, `T01`, `L01`, `R01`, `M01`.
- Never reuse an ID within a plan. If you delete one, leave the gap.

#### 4) `**Depends on:**` lines for execution step dependencies

Steps that depend on other steps must include a `**Depends on:**` line that references step anchors.

**Format:**
```markdown
**Depends on:** #step-1, #step-2
```

**Rules:**
- Use **anchor references** (`#step-N`), not step titles or numbers
- Omit the line entirely for steps with no dependencies (typically Step 1)
- Substeps implicitly depend on their parent step; only add explicit dependencies for cross-substep relationships
- Multiple dependencies are comma-separated
- Dependencies must reference valid step anchors within the document (validated by `tug validate`)

---

#### 5) `**References:**` lines are required for every execution step

Every step must include a `**References:**` line that cites the plan artifacts it implements.

Rules:
- Cite **decisions** by ID: `[D05] ...`
- Cite **open questions** by ID when the step resolves/de-risks them: `[Q03] ...`
- Cite **specs/lists/tables/risks/milestones** by label: `Spec S15`, `List L03`, `Tables T27-T28`, `Risk R02`, `Milestone M01`, etc.
- Cite **anchors** for deep links in parentheses using `#anchor` tokens (keep them stable).
- **Do not cite line numbers.** If you find yourself writing "lines 5–10", add an anchor and cite that instead.
- Prefer **rich, exhaustive citations**. Avoid `N/A` unless the step is truly refactor-only.

**Good References examples:**

```
**References:** [D05] Sandbox verification, [D12] Git-based undo, Spec S15, Tables T21-T25,
(#session-lifecycle, #worker-process-mgmt, #config-precedence)
```

```
**References:** [D01] Refactoring kernel, [D06] Python analyzer, List L04,
Table T05, (#op-rename, #fundamental-wall)
```

**Bad References examples (avoid these):**

```
**References:** Strategy section (lines 5–10)     ← uses line numbers
**References:** See design decisions above        ← vague, no specific citations
**References:** N/A                               ← only acceptable for pure refactor steps
```

---

### Open Questions (MUST RESOLVE OR EXPLICITLY DEFER) {#open-questions}

> Open questions are tracked work. If a question remains open at phase-end, explicitly defer it with a rationale and a follow-up plan.

#### [Q01] <Question title> (OPEN) {#q01-question-slug}

**Question:** <what is unknown / undecided?>

**Why it matters:** <what breaks or becomes expensive if we guess wrong?>

**Options (if known):**
- <option>
- <option>

**Plan to resolve:** <prototype / benchmark / spike / research / decision meeting>

**Resolution:** OPEN / DECIDED (see [DNN]) / DEFERRED (why, and where it will be revisited)

---

### Risks and Mitigations {#risks}

| Risk | Impact | Likelihood | Mitigation | Trigger to revisit |
|------|--------|------------|------------|--------------------|
| <risk> | low/med/high | low/med/high | <mitigation> | <trigger> |

**Risk R01: <Title>** {#r01-risk-slug}

- **Risk:** <1 sentence>
- **Mitigation:** <1–3 bullets>
- **Residual risk:** <what remains true even after mitigation>

---

### Design Decisions {#design-decisions}

> Record *decisions* (not options). Each decision includes the "why" so later phases don't reopen it accidentally.

#### [D01] <Decision Name> (DECIDED) {#d01-decision-slug}

**Decision:** <One sentence decision statement>

**Rationale:**
- <Why>
- <Why>

**Implications:**
- <What this forces in APIs / storage / tests>

---

### Deep Dives (Optional) {#deep-dives}

> Use this section for structured analysis that is not quite "decision" or "spec", but is critical for implementation alignment.
>
> Examples: operation analysis, end-to-end flows, protocols, schemas, sequence diagrams, CI/CD shape, cross-platform strategy, perf notes, rejection rationale.

#### <Topic Title> {#topic-slug}

<Write-up, diagrams, tables, and any referenced specs/lists/tables.>

---

### Specification {#specification}

> This section is the contract. Pick the subsections that apply to your plan; omit the rest.

- **Inputs and Outputs**: data model, invariants, supported formats
- **Terminology and Naming**: key terms and their definitions
- **Supported Features**: exhaustive list; include what is explicitly not supported
- **Modes / Policies**: behavioral variants, flags, policies
- **Semantics**: normative rules, traversal order, edge cases
- **Error and Warning Model**: error fields, warning fields, path formats
- **Public API Surface**: Rust/Python/language signatures
- **Internal Architecture**: component relationships, pipeline, ownership
- **Output Schemas**: CLI output, API responses, wire formats (contract)
- **Configuration Schema**: config file format, precedence, CLI flag mapping

---

### Compatibility / Migration / Rollout (Optional) {#rollout}

> Use this section when you are changing public APIs, config formats, CLI contracts, or anything that affects adopters.

- **Compatibility policy**: <semver? schema versioning?>
- **Migration plan**:
  - <what changes>
  - <who is impacted>
  - <how to migrate, and how to detect breakage>
- **Rollout plan**:
  - <opt-in flag / staged rollout / canary / feature gate>
  - <rollback strategy>

---

### Definitive Symbol Inventory {#symbol-inventory}

> A concrete list of new crates/files/symbols to add. This is what keeps implementation crisp.

#### New crates (if any) {#new-crates}

| Crate | Purpose |
|-------|---------|
| `<crate>` | <purpose> |

#### New files (if any) {#new-files}

| File | Purpose |
|------|---------|
| `<path>` | <purpose> |

#### Symbols to add / modify {#symbols}

| Symbol | Kind | Location | Notes |
|--------|------|----------|-------|
| `<Name>` | enum/struct/fn | `<path>` | <notes> |

---

### Documentation Plan {#documentation-plan}

- [ ] <Docs update>
- [ ] <Examples / schema examples / API docs>

---

### Test Plan Concepts {#test-plan-concepts}

> Describe the kinds of tests that prove the spec. Leave the actual enumeration of tests to the Execution Steps below.

#### Test Categories {#test-categories}

| Category | Purpose | When to use |
|----------|---------|-------------|
| **Unit** | Test individual functions/methods in isolation | Core logic, edge cases, error paths |
| **Integration** | Test components working together | End-to-end operations, CLI commands |
| **Golden / Contract** | Compare output against known-good snapshots | Schemas, APIs, parsers, serialization |
| **Drift Prevention** | Detect unintended behavior changes | Regression testing, API stability |

---

### Execution Steps {#execution-steps}

> Execution comes last. Each step should be executable, with a clear commit boundary and a checkpoint.
>
> **Commit after all checkpoints pass.** This rule applies to every step below.
>
> **Patterns:**
> - If a step is big, split into **substeps** (`Step 2.1`, `Step 2.2`, …) with separate commits and checkpoints.
> - After completing a multi-substep step, add a **Step N Summary** block that consolidates what was achieved and provides an aggregate checkpoint.
>
> **References are mandatory:** Every step must cite specific plan artifacts ([D01], Spec S01, Table T01, etc.) and anchors (#section-name). Never cite line numbers—add an anchor instead.

#### Step 1: <Prep Step Title> {#step-1}

<!-- Step 1 typically has no dependencies (it is the root) -->

**Commit:** `<conventional-commit message>`

**References:** [D01] <decision name>, (#strategy, #context)

**Artifacts:** (what this step produces/changes)
- <new files / new commands / new schema fields / new docs>

**Tasks:**
- [ ] <task>
- [ ] <task>

**Tests:**
- [ ] <T test>
- [ ] <T test>

**Checkpoint:**
- [ ] <command>
- [ ] <command>

---

#### Step 2: <Step Title> {#step-2}

**Depends on:** #step-1

**Commit:** `<conventional-commit message>`

**References:** [D02] <decision>, [D03] <decision>, Spec S01, List L01, (#terminology, #semantics)

**Artifacts:** (what this step produces/changes)
- <new files / new commands / new schema fields / new docs>

**Tasks:**
- [ ] <task>
- [ ] <task>

**Tests:**
- [ ] <T test>
- [ ] <T test>

**Checkpoint:**
- [ ] <command>
- [ ] <command>

---

#### Step 3: <Big Step Title> {#step-3}

**Depends on:** #step-2

> If this step is large, break it into substeps with separate commits and checkpoints.
> The parent step explains the structure; each substep has its own commit and checkpoint.
> Substeps implicitly depend on their parent step; explicit **Depends on:** only needed for cross-substep dependencies.

##### Step 3.1: <Substep Title> {#step-3-1}

**Commit:** `<conventional-commit message>`

**References:** [D04] <decision>, Spec S02, Table T01, (#inputs-outputs)

**Artifacts:** (what this substep produces/changes)
- <artifact>

**Tasks:**
- [ ] <task>

**Tests:**
- [ ] <test>

**Checkpoint:**
- [ ] <command>

---

##### Step 3.2: <Substep Title> {#step-3-2}

**Depends on:** #step-3-1

**Commit:** `<conventional-commit message>`

**References:** [D05] <decision>, (#public-api)

**Artifacts:** (what this substep produces/changes)
- <artifact>

**Tasks:**
- [ ] <task>

**Tests:**
- [ ] <test>

**Checkpoint:**
- [ ] <command>

---

#### Step 3 Summary {#step-3-summary}

> After a multi-substep step, add a summary block to consolidate what was achieved.

After completing Steps 3.1–3.N, you will have:
- <capability or artifact 1>
- <capability or artifact 2>
- <capability or artifact 3>

**Final Step 3 Checkpoint:**
- [ ] `<aggregate verification command covering all substeps>`

---

### Deliverables and Checkpoints {#deliverables}

> This is the single place we define "done" for the phase. Keep it crisp and testable.

**Deliverable:** <One sentence deliverable>

#### Phase Exit Criteria ("Done means…") {#exit-criteria}

- [ ] <criterion> (verification)
- [ ] <criterion> (verification)

**Acceptance tests:**
- [ ] <T test>
- [ ] <T test>

#### Roadmap / Follow-ons (Explicitly Not Required for Phase Close) {#roadmap}

- [ ] <follow-on item>
- [ ] <follow-on item>

| Checkpoint | Verification |
|------------|--------------|
| <checkpoint> | <command/test/proof> |

Software As Plans

February 24, 2026 at 6:30 AM

I began a refactoring project yesterday morning to address an architectural issue in Tug, specifically about how the project’s skills and agents interact with each other to implement the plans I give it.¹ This idea had kept me up during the night, which means that by time I had finished breakfast and started writing it out, I had been thinking about what I wanted to say for several hours.

About ninety minutes after I began, I had the 1700-word The Tug Multi-Agent Coordination Problem document you can see below². I pasted this into Claude Code and had a few rounds with the AI, producing the Conversation also included below. From these, I produced a design document. I continued on with several more rounds with Claude Code and GPT-5.3 Codex in Cursor, suing this design document to produce two plans: the first plan to implement Tugstate, a major new feature to deliver embedded SQLite state management to plan execution, and then a second plan to cut over to using it.

This process felt mostly like writing software has always felt. I spent about four hours puzzling over design choices, making tradeoffs, and deciding on what features are and are not in scope. The difference, of course, is that when the planning was done, I turned the coding over to Claude Code. It churned on the first plan for 2h 16m, and the second for 1h 45m. During the first AI coding session, I went for a long walk, and during the second, I went food shopping, wrote this post, and started to cook dinner.

The net result. Two PRs merged: +8168 -16100. The software worked for a simple case on the second try. The first attempt failed in a little trivial way that took two minutes to fix. I might have some more cleanup work to do today—actually I imagine this is likely—but that also would have been true if I had taken several more days to do this refeactor without the help of AI. This is a pace of code changes I simply could not have managed myself with one day of effort.

My point here is that, in the future, when people say they’re “writing software”, they aren’t going to mean cranking out code line by line— they’re going to mean creating and reviewing complex plans, which they then hand off to an AI to implement. For me, that future is already here.

The Tug Multi-Agent Coordination Problem

Tug has a multi-agent coordination problem.

At a high level, Tug should support taking an idea for a software change, handing it to agents, delivering the software, and repeating that loop with human oversight.

That workflow compresses into three phases:

PLAN: make a plan for a code change
IMPLEMENT: execute the code changes described by a plan
MERGE: merge completed implementation work back into main

Phases and Implications

Because MERGE is part of the workflow, git implications follow.

[I-01] Each plan refers to one and only one git repo.
[I-02] Each PLAN phase is committed on main when it completes, and must have a PLAN-REVISION that is a child of its base revision.
[I-03] Each PLAN refers to a base revision in git.
[I-04] Each PLAN has one or more STEPS describing required work.
[I-05] Each PLAN STEP has subsections including Depends on, Commit, References, Artifacts, Tasks, Tests, Checkpoint, and Rollback.
[I-06] Tasks, Tests, and Checkpoint sections include CHECKLISTS to enumerate required work in detail.
[I-07] Each IMPLEMENT phase must have an associated PLAN.
[I-08] Each PLAN must be VALIDATED before IMPLEMENT begins, including plan/step dependency-chain resolution.
[I-09] Each IMPLEMENT phase runs in a git worktree based on the same revision as its PLAN.
[I-10] Each IMPLEMENT phase must atomically fetch the next ready PLAN STEP, honoring dependencies.
[I-11] Each IMPLEMENT phase must atomically mark PLAN STEPS complete.
[I-12] MERGE must run only against a completed PLAN (all steps complete in IMPLEMENT).
[I-13] Each IMPLEMENT worktree can be deleted once merged.

Phases and Skills

Each phase is orchestrated by a skill that can call subagents.

PLAN
- Skill role: turn an idea into a validated, implementation-ready tugplan with explicit steps
- Subagents:
  - clarifier-agent: asks targeted questions to resolve ambiguity before drafting
  - author-agent: drafts and revises the tugplan from clarified requirements
  - critic-agent: reviews plan quality/completeness/implementability before approval
IMPLEMENT
- Skill role: execute the validated tugplan in a worktree and track step-by-step completion
- Subagents:
  - architect-agent: defines per-step strategy and expected touch scope
  - coder-agent: executes plan steps and writes code while monitoring drift
  - reviewer-agent: verifies quality and conformance to the approved plan
  - committer-agent: performs required git commits for completed work
  - auditor-agent: runs end-to-end quality gates (fresh build/tests)
  - integrator-agent: pushes branch, opens PR, confirms CI/check status
MERGE
- Skill role: mechanically integrate completed IMPLEMENT changes into main and clean up worktree state

Phases and Requirements

It must be easy for a developer to:

[R-01] Start a new PLAN phase at will.
[R-02] Stop an in-progress PLAN phase at will.
[R-03] Resume a previously started PLAN phase at will.
[R-04] Start IMPLEMENT for a VALIDATED PLAN at will.
[R-05] Stop an in-progress IMPLEMENT phase at will.
[R-06] Start MERGE for a completed PLAN at will.
[R-07] Stop an in-progress MERGE, if it has not yet completed.
[R-08] Track PLAN/IMPLEMENT/MERGE progress in detail, including:
- which phase instances are running
- what any running subagent is doing now
- which PLAN STEP IMPLEMENT is currently executing
- checklist state per step item (open/in-progress/completed)
[R-09] Receive clear feedback for logically inconsistent requests, such as:
- IMPLEMENT with no VALIDATED PLAN
- MERGE for a non-completed PLAN
- re-MERGE of an already completed PLAN

Skill <=> Subagent Communication

Subagents communicate through skills (orchestrator-mediated), not direct messaging:

[C-01] clarifier-agent -> author-agent (via PLAN): emits structured clarification (questions, assumptions, ambiguity analysis); PLAN combines with user answers and passes forward.
[C-02] author-agent -> critic-agent (via PLAN): returns plan metadata (plan_path, section/skeleton status); PLAN passes plan artifact for quality review.
[C-03] critic-agent -> author-agent (revision loop via PLAN): emits issues + recommendation (approve/revise/reject); PLAN either finalizes or loops feedback for revision.
[C-04] architect-agent -> coder-agent (via IMPLEMENT + beads design): emits per-step strategy (approach, expected_touch_set, implementation steps, test plan, risks); IMPLEMENT persists it as step design data.
[C-05] coder-agent -> reviewer-agent (via IMPLEMENT + beads notes): emits concrete changes + execution evidence (files_modified, build/test report, drift assessment); reviewer consumes for conformance and quality checks.
[C-06] reviewer-agent -> coder-agent (revision loop via IMPLEMENT): emits findings (issues, conformance, recommendation); IMPLEMENT either commits or routes fixes back to coder.
[C-07] committer-agent -> IMPLEMENT state/progress: consumes approved step context, performs commit operations (tugcode commit / git commit paths), returns metadata for progress tracking/step closure.
[C-08] auditor-agent -> coder-agent/integrator-agent (IMPLEMENT gate): emits post-loop quality-gate results (issues, deliverable checks, recommendation); IMPLEMENT routes fixes or publish handoff.
[C-09] integrator-agent -> coder-agent/user completion path (IMPLEMENT gate): emits publication outputs (pr_url, ci_status, checks); IMPLEMENT either closes or loops CI fixes.

Problems in the Current Code

Beads and Skill <=> Subagent Communication

Beads URL: https://github.com/steveyegge/beads

Beads strongly prefers a central database on main. That conflicts with IMPLEMENT running in git worktrees. MERGE then struggles with conflicts because both main and worktree changed. The intended direction is to keep main free of IMPLEMENT-phase coordination churn.

Operationally, Skill <=> Subagent communication is orchestrator-mediated and structured, but currently routed through Beads fields (design, notes) with impedance mismatches. This caused a long sequence of complexity-heavy commits and filesystem workarounds:

2ba6c10 Reduce orchestrator overthinking in implement
07b048d Move bead-write from agents to orchestrator
372c249 Remove positional content args from bead CLIs
4e24be1 Show explicit tool calls in bead-write instructions
c23504d Fix terminal hang from heredoc in agents
a5af8e4 Stop init from deleting .beads directories
1878990 Bypass bd init worktree guard
5d06f88 Fix bd init failure in git worktrees
55a9493 Clean up stale beads-at-root artifacts
76f3023 Clean up stale beads-at-root artifacts
4d7ec4f Beads Improvements (#43)
dad5ab5 Add beads improvements plan and proposal

[I-10] mostly works (open/close semantics are straightforward), but using Beads as source of truth for [I-11] failed. It does not represent full PLAN STEP checklist richness, which blocks delivery of [R-08] visibility.

It has also proven unexpectedly difficult to move information between subagents when mediated through bd/tugcode beads shell/file/heredoc paths.

Next Steps

Beads aims to be a distributed, git-backed graph issue tracker for AI agents, but this usage in Tug has not met needs.

We need to deliver:

[I-01] through [I-13]
[R-01] through [R-09]
[C-01] through [C-09]

Question: what architecture best satisfies those constraints with less operational complexity?

Conversation

After writing the content above, I began a conversation with Claude Code (Opus 4.6) to discuss it.

🧐 Me

OK. We have some design work to do. Read over @roadmap/multi-agent-coordination.md. Dig in. Read the document. Read the code. Investigate. Ask clarifying questions. Start making some proposals about what we might do to improve this project to deliver on its goals and requirements. Let’s solve this information flow problem with rock-solid solutions!

🤖 Claude

Both agents are done. I now have a thorough understanding of the entire system. Let me lay out what I’ve found and my proposals.

Diagnosis: What’s Actually Happening

After reading the roadmap, the skills, agents, Rust CLI commands, and core library, here’s my understanding of the problem.

What Beads does today

Beads serves four roles in the IMPLEMENT phase:

Step state machine: open -> closed (via bd close)
Dependency-aware ready queries: bd ready --parent <root> returns steps whose deps are all closed
Inter-agent data store: architect writes to design, coder writes to notes, reviewer appends to notes
Step identity: bead_mapping gives each step anchor a unique bead ID

Where Beads fails

Reading the code tells the story clearly:

The temp file dance (SKILL.md:310-318): after every agent call, the orchestrator must:

Agent writes temp file _tmp_{bead_id}_strategy.md
Orchestrator runs tugcode beads append-design {bead_id} --content-file {path}
Orchestrator deletes temp file
If CLI fails, warn and continue (because “the agent’s JSON output already contains all data needed”)

That last point is damning: the beads write is acknowledged as redundant. The orchestrator already has the data in memory. It’s writing to beads for persistence, but then telling itself it’s OK if that persistence fails.

Checklist blindness (roadmap line 97): Beads tracks step open/closed, but each step has tasks, tests, and checkpoints—fine-grained checklists that Beads can’t model. There’s no way to tell the user: “Step 2: 4/7 tasks done, 2/3 tests passing, 1/1 checkpoint verified.”

Worktree friction (beads.rs:229-234): Beads bd init refuses to run in git worktrees. The code manually creates .beads/, bootstraps SQLite via bd --db <path> config set, then passes --db on every command. This was a multi-commit struggle.

Main branch pollution (roadmap line 81): Beads wants data on main. The IMPLEMENT phase works on a worktree branch. Merge creates conflicts on data that should have stayed local.

What’s actually working fine

The plan phase: pure in-memory JSON between agents, no persistence needed
The orchestration pattern: sequential agent spawning with persistent agent IDs works well
The tugcode worktree create flow: creates worktree, branch, validates plan, returns structured JSON
The tugcode commit command: atomic log-rotate + git-commit + bead-close

Proposal: Replace Beads with Tugstate (v1)

I’m proposing a JSON-file-based state system that lives entirely in the worktree.

Core design

A single file: .tugtool/state.json in the worktree.

.tugtree/tugtool__my-plan-20260223/
├── .tugtool/
│   ├── state.json          ← NEW: all coordination state
│   ├── tugplan-my-plan.md
│   ├── tugplan-skeleton.md
│   ├── log.md
│   └── config.toml
├── .beads/                  ← REMOVED
│   └── beads.db
└── (source code)

The state file tracks everything the orchestrator and user need:

{
  "version": 1,
  "plan_path": ".tugtool/tugplan-my-plan.md",
  "phase": "implement",
  "created_at": "2026-02-23T10:00:00Z",
  "updated_at": "2026-02-23T10:45:00Z",
  "steps": [
    {
      "anchor": "step-0",
      "title": "Create API client",
      "status": "completed",
      "depends_on": [],
      "started_at": "2026-02-23T10:05:00Z",
      "completed_at": "2026-02-23T10:30:00Z",
      "commit_hash": "abc123d",
      "tasks": [
        { "text": "Create HTTP client struct", "status": "completed" },
        { "text": "Add retry logic", "status": "completed" }
      ],
      "tests": [
        { "text": "Unit test: retry backoff", "status": "completed" }
      ],
      "checkpoints": [
        { "text": "cargo test passes", "status": "completed" },
        { "text": "cargo clippy clean", "status": "completed" }
      ]
    },
    {
      "anchor": "step-1",
      "title": "Add caching layer",
      "status": "in_progress",
      "depends_on": ["step-0"],
      "started_at": "2026-02-23T10:31:00Z",
      "tasks": [
        { "text": "Implement cache store", "status": "completed" },
        { "text": "Add cache invalidation", "status": "in_progress" },
        { "text": "Wire up to API client", "status": "open" }
      ],
      "tests": [
        { "text": "Cache hit/miss test", "status": "open" }
      ],
      "checkpoints": [
        { "text": "cargo test passes", "status": "open" }
      ]
    }
  ]
}

Why this works

Single orchestrator, sequential execution. There’s one orchestrator process running agents one at a time. No concurrent writes means no need for SQLite locking or database transactions. A JSON file is perfectly atomic for this use case.

Delivers [R-08] completely. The state file tracks every checklist item at open/in_progress/completed granularity. tugcode status can render:

Step 0: Create API client [completed]
  Tasks:    2/2  ████████████ 100%
  Tests:    1/1  ████████████ 100%
  Checks:   2/2  ████████████ 100%

Step 1: Add caching layer [in_progress]
  Tasks:    1/3  ████░░░░░░░░  33%
  Tests:    0/1  ░░░░░░░░░░░░   0%
  Checks:   0/1  ░░░░░░░░░░░░   0%

Eliminates the temp file dance. The orchestrator writes state directly after parsing agent JSON output. No temp files, no CLI calls, no filesystem intermediaries.

Worktree-local, main stays clean. The state file lives in .tugtool/ inside the worktree. When the branch is merged, the state file can be dropped (or kept as a record). No conflicts.

Debuggable: cat .tugtool/state.json | jq .steps[1].tasks.

New CLI commands

tugcode state init <plan_path>      # Initialize state.json from plan's steps/checklists
tugcode state show [--json]         # Show progress (replaces `tugcode beads status`)
tugcode state next                  # Return next ready step (replaces `bd ready`)
tugcode state start <step-anchor>   # Mark step in_progress
tugcode state update <step-anchor>  # Update checklist items for a step
tugcode state complete <step>       # Mark step completed with commit hash

What changes in the orchestrator

┌────────────────────────────────────────────────────────────┬──────────────────────────────────────────────────────────────────┐
│                      Current (Beads)                       │                          New (Tugstate)                          │
├────────────────────────────────────────────────────────────┼──────────────────────────────────────────────────────────────────┤
│ tugcode worktree create syncs beads                        │ tugcode worktree create calls tugcode state init                 │
├────────────────────────────────────────────────────────────┼──────────────────────────────────────────────────────────────────┤
│ Returns bead_mapping, root_bead_id                         │ Returns all_steps, ready_steps (computed from state.json)        │
├────────────────────────────────────────────────────────────┼──────────────────────────────────────────────────────────────────┤
│ After architect: tugcode beads append-design + delete temp │ After architect: tugcode state update step-0 --architect-done    │
├────────────────────────────────────────────────────────────┼──────────────────────────────────────────────────────────────────┤
│ After coder: tugcode beads update-notes + delete temp      │ After coder: tugcode state update step-0 --tasks-completed t1,t2 │
├────────────────────────────────────────────────────────────┼──────────────────────────────────────────────────────────────────┤
│ After reviewer: tugcode beads append-notes + delete temp   │ After reviewer: tugcode state update step-0 --tests-completed    │
├────────────────────────────────────────────────────────────┼──────────────────────────────────────────────────────────────────┤
│ tugcode commit closes bead                                 │ tugcode commit calls tugcode state complete step-0               │
└────────────────────────────────────────────────────────────┴──────────────────────────────────────────────────────────────────┘

What about inter-agent data?

The key insight: the persistent agent pattern already solves this. The architect, coder, reviewer, and committer are each spawned once and resumed. They retain full context. The coder doesn’t read the architect’s strategy from Beads; the orchestrator passes it in the resume prompt.

The state file doesn’t need to store agent outputs (approach, expected_touch_set, etc.). It stores progress state: what’s done, what’s in flight, what’s next. Agent-to-agent data flows through orchestrator memory plus the Task resume mechanism.

Migration path

Phase 1: Add tugcode state commands + state.json schema. Keep Beads working.
Phase 2: Update implement skill to write to state.json alongside Beads. tugcode status reads state.json.
Phase 3: Remove Beads writes from implement skill. tugcode worktree create stops syncing Beads.
Phase 4: Remove Beads code from tugcode CLI and tugtool-core.

🧐 Me

OK. This is a great analysis. I love how simple it is. The main limitation with this approach is that it isolates the IMPLEMENT phase to a single git worktree on a single machine. Does that make sense?

There would never be any possibility of [R-04]: start an IMPLEMENT phase for a VALIDATED PLAN at will at any time. There is nothing in this requirement which says ”… only if there are no other IMPLEMENT phases running for that PLAN.”

If we ever want to allow multiple IMPLEMENT phases to run at once that don’t duplicate work, we would need a solution for [I-08] and [I-09], right? There is no way in your proposal to do concurrent IMPLEMENT phases. What are your ideas for dealing with this?

🤖 Claude

You’re absolutely right, and this is the critical design question.

The Concurrency Problem

With a JSON file in the worktree, there’s no shared state. Two IMPLEMENT phases for the same plan can’t coordinate:

Orchestrator A reads state.json, sees step-0 ready, starts step-0.
Orchestrator B reads state.json, sees step-0 ready, starts step-0.
Duplicate work.

For [I-08] and [I-09] to work with concurrent IMPLEMENT phases, you need:

Shared state visible to all orchestrators
Atomic claim so only one orchestrator can grab a step

Where the shared state lives:

repo/
├── .tugtool/
│   ├── tugplan-foo.md
│   └── state.db              ← shared across all worktrees
├── .tugtree/
│   ├── worktree-A/           ← orchestrator A (claimed step-0, step-2)
│   └── worktree-B/           ← orchestrator B (claimed step-1, step-3)

Revised Proposal: Embedded SQLite

This changes the answer from JSON to SQLite—but not Beads. A self-owned, embedded SQLite database with a schema designed for Tug.

Why SQLite instead of JSON:

BEGIN EXCLUSIVE transaction gives atomic claims
Concurrent reads are free (WAL mode)
Still a single file (.tugtool/state.db), still debuggable
No external binary: rusqlite compiles SQLite into tugcode

Why not Beads:

Own schema, no impedance mismatch
No bd binary dependency
No worktree init friction
Checklist items are first-class rows

Schema (draft):

CREATE TABLE plans (
  plan_path  TEXT PRIMARY KEY,
  status     TEXT NOT NULL DEFAULT 'active',
  created_at TEXT NOT NULL,
  updated_at TEXT NOT NULL
);

CREATE TABLE steps (
  plan_path    TEXT NOT NULL REFERENCES plans(plan_path),
  anchor       TEXT NOT NULL,
  title        TEXT NOT NULL,
  status       TEXT NOT NULL DEFAULT 'pending',
  claimed_by   TEXT,
  started_at   TEXT,
  completed_at TEXT,
  commit_hash  TEXT,
  PRIMARY KEY (plan_path, anchor)
);

CREATE TABLE step_deps (
  plan_path   TEXT NOT NULL,
  step_anchor TEXT NOT NULL,
  depends_on  TEXT NOT NULL,
  PRIMARY KEY (plan_path, step_anchor, depends_on),
  FOREIGN KEY (plan_path, step_anchor) REFERENCES steps(plan_path, anchor),
  FOREIGN KEY (plan_path, depends_on)  REFERENCES steps(plan_path, anchor)
);

CREATE TABLE checklist_items (
  id          INTEGER PRIMARY KEY AUTOINCREMENT,
  plan_path   TEXT NOT NULL,
  step_anchor TEXT NOT NULL,
  kind        TEXT NOT NULL,
  text        TEXT NOT NULL,
  status      TEXT NOT NULL DEFAULT 'open',
  updated_at  TEXT,
  FOREIGN KEY (plan_path, step_anchor) REFERENCES steps(plan_path, anchor)
);

Atomic claim operation:

BEGIN EXCLUSIVE;

SELECT s.anchor FROM steps s
WHERE s.plan_path = ?
  AND s.status = 'pending'
  AND NOT EXISTS (
    SELECT 1 FROM step_deps d
    JOIN steps dep ON dep.plan_path = d.plan_path AND dep.anchor = d.depends_on
    WHERE d.plan_path = s.plan_path
      AND d.step_anchor = s.anchor
      AND dep.status != 'completed'
  )
ORDER BY s.anchor
LIMIT 1;

UPDATE steps
SET status = 'claimed', claimed_by = ?, started_at = ?
WHERE plan_path = ? AND anchor = ? AND status = 'pending';

COMMIT;

CLI commands:

tugcode state init <plan_path>
tugcode state next <plan_path>
tugcode state start <plan> <step>
tugcode state update <plan> <step>
tugcode state complete <plan> <step>
tugcode state show [plan] [--json]

What changes for worktree create:

Call tugcode state init <plan> if missing (idempotent)
Call tugcode state next <plan> to claim the first ready step
Return that step info to the orchestrator

No Beads. No .beads/ directory. No bd binary.

Future multi-machine path:

Wrap SQLite operations behind tugcast endpoints
Keep schema and semantics
Or swap to Postgres later if needed

Summary shift:

┌──────────────────────┬──────────────────────┬──────────────────────────┬─────────────────────────────┐
│                      │   Beads (current)    │ JSON (first proposal)    │ SQLite (revised)            │
├──────────────────────┼──────────────────────┼──────────────────────────┼─────────────────────────────┤
│ External binary      │ bd required          │ none                     │ none (rusqlite embedded)    │
│ Concurrent IMPLEMENT │ partially            │ impossible               │ full atomic claims          │
│ Checklist tracking   │ no                   │ yes                      │ yes                         │
│ Lives in             │ worktree .beads/     │ worktree .tugtool/       │ repo root .tugtool/         │
│ Main branch clean    │ no                   │ yes                      │ yes (gitignore DB)          │
│ Debuggable           │ bd show              │ cat/jq                   │ sqlite3                     │
│ Multi-machine path   │ none                 │ none                     │ tugcast wrapper or Postgres │
└──────────────────────┴──────────────────────┴──────────────────────────┴─────────────────────────────┘

The core insight is right: coordination state must be shared and atomic. JSON can’t provide that, SQLite can, without Beads friction.

🧐 Me

OK. If we’re using a central database in the repo root, then:

How do we merge and avoid all the messy merge mechanics we eliminated in .tugtool/tugplan-beads-improvements.md and 4d7ec4f?
Why SQLite instead of Dolt? What would Dolt give us that SQLite doesn’t?

🤖 Claude

Two sharp questions.

Merge: How do we avoid the mess?

The Beads merge problem was that both main and worktree branches committed a binary SQLite DB. Git can’t merge that cleanly.

Solution: never commit the state DB.

.tugtool/state.db
.tugtool/state.db-wal
.tugtool/state.db-shm

state.db is an operational artifact, not source code.

Committed on worktree branch:

Source code changes
Plan file (.tugtool/tugplan-foo.md) unchanged during IMPLEMENT
Implementation log (.tugtool/log.md)

Not committed:

.tugtool/state.db

So merge is source-only, and conflict risk drops dramatically.

For progress reporting, the plan file remains the immutable spec; execution progress is read from tugcode state show.

Why SQLite, not Dolt?

Dolt is interesting:

Data branching/merging
SQL history by revision
SQL conflict resolution
Built-in audit history

But likely too heavy for Tug right now:

Large external binary footprint
External server/process lifecycle complexity
Not embeddable in the tugcode binary
Overkill for Tug’s small state volume
Introduces a second branching model (Git + Dolt)

SQLite advantages here:

Embedded, no extra install
No daemon/server/port/process management
Proven local concurrency with WAL + transactional claims
Small and inspectable
Clear path later to tugcast/Postgres if requirements change

Recommendation: embedded SQLite in repo root, gitignored, atomic on one machine.

🧐 Me

I hear you on Dolt, and I agree it solves a harder problem than Tug has right now.

A gitignored SQLite DB in the repo root feels like the sweet spot. For [I-08] and [I-09], I’m comfortable defining atomically as atomically on one machine for now. We don’t currently have a multi-machine/multi-developer requirement.

So next step: write a complete proposal in @roadmap/ based on @roadmap/multi-agent-coordination.md and this chat. Cover risks, and structure it so we can execute in two plans:

Add SQLite-based mechanism while Beads infra still exists
Use that mechanism to remove Beads dependency cleanly

That should avoid the chicken-and-egg problem and let us run two tugplans back-to-back.

I has been using beads for issue tracking. I’ve long been a fan of his prose writing of Steve Yegge, the developer of beads, but in the end, the beads feature set sounded like a better fit than it turned out to be in practice. Little impedance mismatches started piling up, and I wound up fighting with the software more than I should have. So, I decided to roll my own issue tracking solution with SQLite and some Rust code. ↩
I wrote most of the document the “old-fashioned” way, one word at a time, except that I asked the AI to fill in the agent descriptions under the Skill <=> Subagent Communication section. link ↩

A Bigger Vision For Tug

February 16, 2026 at 5:59 AM

tl;dr

I’m pivoting this project. Tug is now about creating an AI-assisted software construction system, thinking through and delivering the tools and techniques humans and coding assistants need to make better software together.

I started this Tug project to scratch an itch. I noticed that Claude Code would often struggle to do complex code refactoring, issuing long chains of grep, sed, and awk commands to change the text of my programs lexically rather than understanding how to change my code semantically.

So, I started making proper code refactoring tools to make available to the assistant, starting with python—a substantial project as I conceived of it. I began with a combination of Claude Code and Cursor, but soon started leaning more on Claude Code, extending it with skills and subagents to help me in my daily coding. I acted as the orchestrator, calling on these new tasks as I made and improved them. Eventually, I produced a workflow I liked well enough to produce work I was happy to commit. I wrote it up in How I Write Code With AI.

After I finished writing this post, I took a step back and looked at the 20-step process I included. I felt a mix of fascination and frustration. I was fascinated about how such a workflow had become possible. I was frustrated that I wasn’t working more directly on this new style of software construction.

So, I decided to do something about it. I kicked off a side project that I called “specks”¹, to spend more time with that workflow, to think about it more deeply, and to see if I could turn it into an automated process orchestrated by AI agents.

I succeeded. What an interesting two weeks of work! Talk about fascination!

My planning and implementation loops are now fully under the control of two top-level orchestrator skills in Claude Code. Each of them call on a number of subagents (see below). I also added one additional step at the end to merge code after the implementation loop completes, since the implementation now happens on a git worktree. I also integrated beads to bolster the implementation infrastructure.

Now, my main workflow now has four steps rather than twenty. All focused around Claude Code.

I chat with Claude Code to explore a new idea until I’m satisfied that it represents a piece of work I want to do.
I run /tugtool:plan, and reference the context. It makes a plan file. (Subagents: clarifier-agent → author-agent → critic-agent, with an author/critic revision loop; tugtool calls: tugtool init pre-hook.)
I run /tugtool:implement with the plan just created. It writes the code and sends me a PR. I review the PR. If I’m satisfied (and I usually am)… (Subagents: architect-agent → coder-agent → reviewer-agent → committer-agent → auditor-agent → integrator-agent; tugtool calls: tugtool worktree create --json.)
I run /tugtool:merge to pull that code onto my main branch. (Subagents: none; tugtool calls: tugtool merge --dry-run --json → confirm → tugtool merge --json → tugtool doctor + tugtool worktree list.)

This side project accomplished two things:

I made my workflow more efficient by taking a useful workflow and taught the AI to do it for me.²
I changed my work from developing a code refactoring tool—solving a problem that I occasionally have—to one where I was improving my everyday workflow—solving a problem I have all the time.

This was just too fascinating to keep as a side project. It started to feel like it should be my main project.So, I’m now pivoting Tug to be a project about software construction. This new agent flow is now the core of the focus—at least for now. It’s also given me a whole new set of ideas for building a more full-featured experience—something that imagines a future of software construction beyond the age of IDEs like vscode and Xcode. I’ve already sketched out a roadmap, some exiciting new ideas for how this might fit together. I also made a comprehensive update to the sticky What is Tug always present in the header, to describe the new project concept.

I have more posts planned to share additional thoughts about new direction, but until then, here are the new agents that make up the core of this new workflow.

Agent Orchestration Loops


PLAN ORCHESTRATION LOOP

┌──────────────────────────────────────────┐
│          PLANNING PHASE BEGINS           │
│ (produce a tugplan at .tugtool/tugplan)  │
└────────────────────┬─────────────────────┘
                     │
                     ▼
┌──────────────────────────────────────────┐
│        clarifier-agent (runs once)       │
│        SPAWN → clarifier_id              │
└────────────────────┬─────────────────────┘
                     │
                     ▼
               ┌────────────┐
               │ questions? │
               └──┬─────┬───┘
              yes │     │ no
                  ▼     │
  ┌──────────────────┐  │
  │ AskUserQuestion  │  │
  └────────┬─────────┘  │
           └──────┬─────┘
                  │
┌─────────────────▼────────────────────────┐
│ author-agent                             │
│ Pass 0: SPAWN (FRESH) → author_id        │◄─┐
│ Pass N: RESUME author_id                 │  │
└────────────────────┬─────────────────────┘  │
                     │                        │
                     ▼                        │
┌──────────────────────────────────────────┐  │
│ critic-agent                             │  │ revision
│ Pass 0: SPAWN (FRESH) → critic_id        │  │ loop
│ Pass N: RESUME critic_id                 │  │
└────────────────────┬─────────────────────┘  │
                     │                        │
                     ▼                        │
             ┌────────────────┐               │
             │    critic      │               │
             │recommendation? │               │
             └──┬──────────┬──┘               │
        APPROVE │          │ REVISE / REJECT  │
                │          └──────────────────┘
                ▼
┌──────────────────────────────────────────┐
│        PLANNING PHASE COMPLETE           │
│ Plan ready at {plan_path}                │
│ Next: /tugtool:implement {plan_path}     │
└──────────────────────────────────────────┘

IMPLEMENT ORCHESTRATION LOOP

┌──────────────────────────────────────────┐
│  tugtool worktree create <plan> --json   │
└────────────────────┬─────────────────────┘
                     │
                     ▼
              ┌────────────┐
              │ succeeded? │
              └──┬──────┬──┘
             yes │      └ no ──► HALT WITH ERROR
                 │
                 ▼

     ═══ STEP LOOP (each ready step) ═══

┌──────────────────────────────────────────┐
│ architect-agent                          │
│ Pass 0: SPAWN → architect_id             │
│ Pass N: RESUME architect_id              │
└────────────────────┬─────────────────────┘
                     │
                     ▼
┌──────────────────────────────────────────┐
│ coder-agent                              │
│ Pass 0: SPAWN → coder_id                 │
│ Pass N: RESUME coder_id                  │
└────────────────────┬─────────────────────┘
                     │
                     ▼
              ┌────────────┐
              │   drift?   │
              └──┬──────┬──┘
         none/   │      │ moderate/major
         minor   │      └──► AskUserQuestion
                 │               │
                 │◄──────────────┘
                 │
                 ▼
┌──────────────────────────────────────────┐
│ reviewer-agent                           │◄──┐
│ Pass 0: SPAWN → reviewer_id              │   │
│ Pass N: RESUME reviewer_id               │   │
└────────────────────┬─────────────────────┘   │
                     │                         │
                     ▼                         │ review
             ┌───────────────┐                 │ retry
             │   reviewer    │                 │
             │recommendation?│                 │
             └──┬─────────┬──┘                 │
        APPROVE │         │ REVISE (max 3)     │
                │         └─► coder fix ───────┘
                ▼
┌──────────────────────────────────────────┐
│ committer-agent                          │
│ SPAWN/RESUME → commit + close bead       │
└────────────────────┬─────────────────────┘
                     │
             ┌───────────────┐
             │  more steps?  │─ yes ─► back to architect
             └───────┬───────┘
                     │ no
                     ▼

          ═══ AUDITOR PHASE ═══

┌──────────────────────────────────────────┐
│               auditor-agent              │◄─────────────┐
│               SPAWN/RESUME               │              │
└────────────────────┬─────────────────────┘              │
                     │                                    │
                     ▼                                    │ audit
             ┌───────────────┐                            │ retry
             │    auditor    │                            │
             │recommendation?│                            │
             └──┬─────────┬──┘                            │
           PASS │         │ REVISE (max 3)                │
                │         └─► coder fix → committer ──────┘
                ▼

        ═══ INTEGRATOR PHASE ═══

┌──────────────────────────────────────────┐
│            integrator-agent              │◄─────────────┐
│            SPAWN/RESUME → push, PR, CI   │              │
└────────────────────┬─────────────────────┘              │
                     │                                    │
                     ▼                                    │ CI
             ┌───────────────┐                            │ retry
             │  integrator   │                            │
             │recommendation?│                            │
             └──┬─────────┬──┘                            │
           PASS │         │ REVISE (max 3)                │
                │         └─► coder fix → committer ──────┘
                ▼

┌──────────────────────────────────────────┐
│         IMPLEMENTATION COMPLETE          │
│      Plan: {plan_path}  PR: {pr_url}     │
└──────────────────────────────────────────┘

A homonym triple entendre, for those keeping score. spec, as in a software specification; speck, as in a small idea; specs as in lenses that help us to look closely at something ↩
This a pattern I’ve repeated again and again over the past five years: learn by myself then teach the AI. ↩

How I Write Code With AI

January 31, 2026 at 10:46 AM

This is how I write code these days. I use a 20-step process to code with AI, actively collaborating with assistants in a workflow with two main loops: Planning and Implementation.

The Planning Loop: I start by defining a phase of work, then use an AI agent to generate a structured plan file. I review this plan with one AI model using an “investigate” prompt that encourages critical analysis. I then cross-reference with a different AI model using the same prompt. I iterate between these two AIs for 10-20 rounds, cross-pollinating their feedback with my own guidance until the plan is solid—questions resolved, decisions documented, steps specific enough to implement.
The Implementation Loop: For each plan step, I assess whether it’s ready or needs breakdown via an architectural analysis agent. I run an implementation command, which executes the step while I monitor terminal output. If the AI struggles, I pause, investigate with agents, update the plan, and resume. After completion, I optionally use a review agent to grade the work against the plan’s requirements. I then run commands to log what was done and draft a commit message, and commit manually.

The key principles: multiple AI perspectives catch blind spots, the plan file is the single source of truth, and I retain control over git operations.

I figure that my productivity is 5–10× what it was before AI coding assistants, certainly in lines of code produced, but also in the speed I can make my ideas real. That’s a quantitative assesment.

There’s also the qualitative aspect: I’m more willing to dive in on an investigation or refactor because I know that the drudgery of fixing up the collateral damage is something I can offload to the AI. It leaves me free to think about the details, features, and APIs I want rather than having to struggle to achieve them.

Yes, I sometimes get all the way to the bottom of a “properly implemented” step and then discover that the work is a complete botch-up. No big deal, I go back and try again. This happened all the time in my “manual coding” days, so I’m used to it.

No, I don’t read every line of code before committing. In fact, I read very few of them. I rely heavily on unit tests, integration tests, and the ease of using previously-completed steps as foundational elements in follow-on work. Problems surface before too long.

Over my whole career, I’ve always tried to figure out ways to do better. AI is the best new tool to land in my toolbox in a long time.

Workflow Overview — ASCII flowchart with labeled nodes
Definitions — Key terms and concepts
Planning Loop — [P1]-[P6]: From idea to ready-to-implement plan
Implementation Loop — [I7]-[I14]: From plan step to committed code
Key Design Principles — Why this workflow works
Files Reference — Quick lookup table

Workflow Overview

The flowchart below shows the two main loops: Planning ([P1]-[P6]) and Implementation ([I7]-[I14]). Each labeled node is a clickable reference to its detailed description.

                    ┌─────────────────────────────────────────────┐
                    │           PHASE OF WORK BEGINS              │
                    │  (new feature, refactor, exploration, etc.) │
                    │                  [P1]                       │
                    └─────────────────────┬───────────────────────┘
                                          │
                                          ▼
┌───────────────────────────────────────────────────────────────────────────────────────┐
│                              PLANNING LOOP [P2]-[P6]                                  │
│  ┌─────────────────────────────────────────────────────────────────────────────────┐  │
│  │                                                                                 │  │
│  │    ┌──────────────────┐                                                         │  │
│  │    │  [P2]            │◄────────────────────────────────────────────────────┐   │  │
│  │    │  code-planner    │                                                     │   │  │
│  │    │  agent creates   │                                                     │   │  │
│  │    │  plan file       │                                                     │   │  │
│  │    └────────┬─────────┘                                                     │   │  │
│  │             │                                                               │   │  │
│  │             ▼                                                               │   │  │
│  │      ┌──────────────────┐       ┌──────────────────┐                        │   │  │
│  │   ┌─►│  [P3]            │──────►│  [P5]            │                        │   │  │
│  │   │  │  "investigate"   │       │  "investigate"   │                        │   │  │
│  │   │  │  + my comments   │       │  prompt w/       │                        │   │  │
│  │   │  │  + questions     │       │  GPT-5.2 in      │                        │   │  │
│  │   │  │  w/ Claude Opus  │       │  Cursor          │                        │   │  │
│  │   │  └────────┬─────────┘       └────────┬─────────┘                        │   │  │
│  │   │           │                          │                                  │   │  │
│  │   │           ▼                          │                                  │   │  │
│  │   │  ┌──────────────────┐                │                                  │   │  │
│  │   │  │  [P4]            │                │                                  │   │  │
│  │   │  │  code-planner    │                │                                  │   │  │
│  │   │  │  for major       │                │                                  │   │  │
│  │   │  │  rewrites        │                │                                  │   │  │
│  │   │  └────────┬─────────┘                │                                  │   │  │
│  │   │           │                          │                                  │   │  │
│  │   │           └──────────────────────────┼───────────── back to [P2] ───────┘   │  │
│  │   │                                      │                                      │  │
│  │   │                                      ▼                                      │  │
│  │   │                            ┌────────────────────────────────┐               │  │
│  │   │                            │  [P6]                          │               │  │
│  │   │                            │  Cross-pollinate AI feedback   │               │  │
│  │   │                            │  + my guidance/annotations     │               │  │
│  │   │                            │  (10-20 rounds typical)        │               │  │
│  │   │                            └─────────────┬───┬──────────────┘               │  │
│  │   │                                          │   │                              │  │
│  │   │                                    loop  │   │ ready                        │  │
│  │   └─────────── back to [P3] ◄────────────────┘   │                              │  │
│  │                                                  │                              │  │
│  └──────────────────────────────────────────────────│──────────────────────────────┘  │
│                                                     │                                 │
│                                                     │                                 │
│                            Plan is ready  ┌─────────┘                                 │
│                                           │                                           │
└───────────────────────────────────────────┼───────────────────────────────────────────┘
                                            │
                                            ▼
┌───────────────────────────────────────────────────────────────────────────────────────┐
│                          IMPLEMENTATION LOOP [I7]-[I14]                               │
│  ┌─────────────────────────────────────────────────────────────────────────────────┐  │
│  │                                                                                 │  │
│  │    ┌───────────────────┐                                                        │  │
│  │    │  [I7]             │◄───────────────────────────────────────────────────┐   │  │
│  │    │  Assess step:     │                                                    │   │  │
│  │    │  ready as-is, or  │                                                    │   │  │
│  │    │  needs breakdown? │                                                    │   │  │
│  │    └────────┬──────────┘                                                    │   │  │
│  │             │                                                               │   │  │
│  │             ├───────────────────────────┐                                   │   │  │
│  │             │ needs detail              │ ready                             │   │  │
│  │             ▼                           │                                   │   │  │
│  │    ┌──────────────────┐                 │                                   │   │  │
│  │    │  [I7-detail]     │                 │                                   │   │  │
│  │    │  code-architect  │                 │                                   │   │  │
│  │    │  agent provides  │                 │                                   │   │  │
│  │    │  detailed steps  │                 │                                   │   │  │
│  │    └────────┬─────────┘                 │                                   │   │  │
│  │             │                           │                                   │   │  │
│  │             ▼                           ▼                                   │   │  │
│  │    ┌─────────────────────────────────────────┐                              │   │  │
│  │    │  [I8]                                   │◄──────────────┐              │   │  │
│  │    │  /implement-plan command                │               │              │   │  │
│  │    │  (cite plan step, watch terminal)       │               │              │   │  │
│  │    └────────────────┬────────────────────────┘               │              │   │  │
│  │                     │                                        │              │   │  │
│  │                     │ [I9] monitor                           │              │   │  │
│  │                     │                                        │              │   │  │
│  │         ┌───────────┴───────────┐                            │              │   │  │
│  │         │ struggling?           │ progressing                │              │   │  │
│  │         ▼                       │                            │              │   │  │
│  │    ┌──────────────────┐         │                            │              │   │  │
│  │    │  [I10]           │         │                            │              │   │  │
│  │    │  ESC to pause    │         │                            │              │   │  │
│  │    │  ▼               │         │                            │              │   │  │
│  │    │  code-architect  │         │                            │              │   │  │
│  │    │  or code-planner │         │                            │              │   │  │
│  │    │  to investigate  │         │                            │              │   │  │
│  │    │  ▼               │         │                            │              │   │  │
│  │    │  update plan     │         │                            │              │   │  │
│  │    └────────┬─────────┘         │                            │              │   │  │
│  │             │                   │                            │              │   │  │
│  │             └───────────────────┼─────────► back to [I8] ────┘              │   │  │
│  │                                 │                                           │   │  │
│  │                                 ▼                                           │   │  │
│  │    ┌───────────────────────────────────┐                                    │   │  │
│  │    │  Implementation complete          │                                    │   │  │
│  │    │  Questions about quality?         │                                    │   │  │
│  │    └────────────────┬──────────────────┘                                    │   │  │
│  │                     │                                                       │   │  │
│  │         ┌───────────┴───────────┐                                           │   │  │
│  │         │ yes                   │ no                                        │   │  │
│  │         ▼                       │                                           │   │  │
│  │    ┌─────────────────────┐      │                                           │   │  │
│  │    │  [I11]              │      │                                           │   │  │
│  │    │  plan-step-reviewer │      │                                           │   │  │
│  │    │  grades the work    │      │                                           │   │  │
│  │    └────────┬────────────┘      │                                           │   │  │
│  │             │                   │                                           │   │  │
│  │    remediation needed?          │                                           │   │  │
│  │             │                   │                                           │   │  │
│  │             │ yes               │                                           │   │  │
│  │             └──────────────────────────────► back to [I7] ──────────────────│   │  │
│  │                                 │                                           │   │  │
│  │                                 ▼                                           │   │  │
│  │    ┌─────────────────────────────────────────┐                              │   │  │
│  │    │  [I12]                                  │                              │   │  │
│  │    │  /update-plan-implementation-log        │                              │   │  │
│  │    └────────────────┬────────────────────────┘                              │   │  │
│  │                     │                                                       │   │  │
│  │                     ▼                                                       │   │  │
│  │    ┌─────────────────────────────────────────┐                              │   │  │
│  │    │  [I13]                                  │                              │   │  │
│  │    │  /prepare-git-commit-message            │                              │   │  │
│  │    └────────────────┬────────────────────────┘                              │   │  │
│  │                     │                                                       │   │  │
│  │                     ▼                                                       │   │  │
│  │    ┌─────────────────────────────────────────┐                              │   │  │
│  │    │  [I14]                                  │                              │   │  │
│  │    │  git commit (manually)                  │                              │   │  │
│  │    │  decide: next step? (below)             │                              │   │  │
│  │    └────────────────┬─────────┬──────────────┘                              │   │  │
│  │                     │         │                                             │   │  │
│  │                     │         │                                             │   │  │
│  │                     │ no      │ yes                                         │   │  │
│  │                     │         │                                             │   │  │
│  │                     │         │                                             │   │  │
│  │                     │         └────────────► back to [I7] ──────────────────┘   │  │
│  │                     │                                                           │  │
│  │                     └────────────────┐                                          │  │
│  │                                      │                                          │  │
│  └──────────────────────────────────────│──────────────────────────────────────────┘  │
│                                         │                                             │
│                         All steps done  │                                             │
│                                         │                                             │
└─────────────────────────────────────────┼─────────────────────────────────────────────┘
                                          │
                                          ▼
                  ┌─────────────────────────────────────────────┐
                  │           PHASE OF WORK COMPLETE            │
                  └─────────────────────────────────────────────┘

Node Index:

Node	Description	Jump
`[P1]`	Initiate a new phase	→
`[P2]`	code-planner creates plan file	→
`[P3]`	”investigate” + review w/ Claude Opus	→
`[P4]`	code-planner for major rewrites	→
`[P5]`	”investigate” prompt w/ GPT-5.2	→
`[P6]`	Cross-pollinate AI feedback	→
`[I7]`	Assess each plan step	→
`[I8]`	Run /implement-plan	→
`[I9]`	Monitor implementation	→
`[I10]`	Handle struggles (ESC to pause)	→
`[I11]`	Review with plan-step-reviewer	→
`[I12]`	Update implementation log	→
`[I13]`	Prepare commit message	→
`[I14]`	Commit and continue	→

Definitions

Phase of Work

A phase of work is a bounded unit of development effort with a clear goal:

Adding a new feature
Refactoring existing code
Exploring a new technical area
Fixing a class of bugs
Performance optimization

Each phase gets its own numbered plan file in the plans/ directory (e.g., phase-13.md). See Files Reference for the full list of plan-related files.

Plan File

A structured planning document in the plans/ directory following the format defined in plans/plan-skeleton.md. Plan files contain:

Plan Metadata: Owner, status, target branch, dates
Phase Overview: Context, strategy, stakeholders, success criteria, scope, non-goals
Open Questions: Tracked uncertainties that must be resolved or explicitly deferred
Risks and Mitigations: Known risks with mitigation strategies
Design Decisions: Recorded decisions with rationale (referenced as [D01], [D02], etc.)
Specification: Detailed technical contract including APIs, schemas, error models
Symbol Inventory: Concrete list of crates, files, and symbols to add
Test Plan Concepts: Categories of tests and fixture requirements
Execution Steps: Sequenced implementation tasks with checkpoints
Deliverables and Checkpoints: Phase exit criteria

”Investigate” Prompt

A standard prompt I use when reviewing plans with AI assistants (used in [P3] and [P5]):

Investigate. Ask clarifying questions. Give your assessment on its quality and readiness to implement. Do you see holes, pitfalls, weaknesses or limitations?

This prompt encourages critical analysis rather than passive acceptance.

Planning Loop

The planning loop transforms an idea into a ready-to-implement plan. It corresponds to nodes [P1]-[P6] in the flowchart.

[P1] Initiate a New Phase

When starting new work, I define the scope and goals of the phase. This could be a feature request, a technical debt item, an exploration, or a refactoring effort. See Phase of Work for what constitutes a phase.

[P2] Create the Plan File

I use the code-planner agent in Claude Code to create the initial plan file:

Task tool → code-planner agent

The code-planner agent (.claude/agents/code-planner.md):

Reads CLAUDE.md and relevant documentation
Explores the codebase to understand existing patterns
Analyzes the request and breaks it into implementable units
Writes a structured plan following plans/plan-skeleton.md
Outputs to a file like plans/phase-14.md

Key characteristics:

Uses the Opus model for deep reasoning
Identifies dependencies between tasks
Specifies file paths, function signatures, and code locations
Includes verification steps and success criteria

[P3] Review with Claude Opus

I review the plan file myself, making annotations and noting questions. Then I use Claude Opus with the “investigate” prompt to get critical feedback.

Claude Opus examines:

Logical consistency of the approach
Missing edge cases or error scenarios
Architectural implications
Potential conflicts with existing code
Gaps in the specification

[P4] Use code-planner for Major Rewrites

If Claude’s feedback suggests significant structural changes to the plan, I invoke the code-planner agent ([P2]) again to rewrite sections rather than making piecemeal edits.

[P5] Cross-Reference with GPT-5.2 in Cursor

I bring the plan file into Cursor and use GPT-5.2 with the “investigate” prompt:

Ask mode: Get feedback without modifying the plan
Agent mode: Have GPT-5.2 update the plan directly

This provides a second AI perspective that often catches different issues than Claude. See Multiple AI Perspectives for why this matters.

[P6] Iterate Until Ready

I loop through [P3]-[P5], giving each AI the feedback from the other (annotated with my guidance, answers, and comments). This cross-pollination typically takes 10-20 rounds before the plan is solid.

What “ready” means:

All open questions are resolved or explicitly deferred
Design decisions are justified and documented
Execution steps are specific enough to implement
Checkpoints are defined and testable
Both AIs have given positive assessments

Exit condition: Plan is ready → proceed to Implementation Loop.

Implementation Loop

The implementation loop transforms each plan step into committed code. It corresponds to nodes [I7]-[I14] in the flowchart. This loop repeats for each execution step in the plan.

[I7] Assess Each Plan Step

For each execution step in the plan file, I decide:

Ready as-is: The step is specific enough to implement directly. → Proceed to [I8].

Needs breakdown: The step is too abstract or complex. I use the code-architect agent to provide detailed sub-steps.

The code-architect agent (.claude/agents/code-architect.md):

Has deep expertise in API design, system architecture, and design patterns
Follows a rigorous methodology: requirements extraction → design analysis → API surface design → critical review
Produces detailed type signatures, usage examples, and phased implementation roadmaps
Challenges its own designs as a harsh critic would

After breakdown: The detailed steps are added to the plan, then proceed to [I8].

[I8] Run /implement-plan

I execute the implement-plan command (.claude/commands/implement-plan.md), providing a citation to the specific execution step:

/implement-plan plans/phase-13.md Step 2.3: Add TypeCommentCollector

Why a command instead of an agent: Commands show more detailed terminal output during execution, which is valuable for monitoring progress on complex implementations. This is the current behavior of Claude Code, which I suppose might change in the future. See [I9].

The implement-plan command:

Parses the step reference and locates it in the plan file
Reads all referenced materials (other files, specs, etc.)
Implements each task sequentially
Updates the plan file: Checks off [ ] → [x] as tasks complete
Writes tests as specified
Runs verification using cargo nextest run
Performs all checkpoint verifications
Never commits: Git operations are my responsibility (see [I14])

[I9] Monitor Implementation

I watch the terminal output as implementation proceeds. The detailed output helps me:

Understand what decisions the AI is making
Catch deviations from the plan early
Verify that tests are actually running and passing

If progressing well: Wait for completion, then proceed to [I11] (or skip to [I12] if confident).

If struggling: Proceed to [I10].

[I10] Handle Struggles

If the AI is struggling with the implementation (making repeated mistakes, going in circles, or producing incorrect code):

Press ESC to pause execution
Take stock: What’s causing the difficulty?
Ask questions: Use conversation to understand the blocker
Use agents to investigate:
- code-architect ([I7]): For design/architecture issues
- code-planner ([P2]): For scope/decomposition issues
Update the plan: Revise the step with better guidance
Resume: Jump back to [I8] with the improved plan

This pause-investigate-revise pattern prevents wasted effort on fundamentally flawed approaches.

[I11] Review with plan-step-reviewer

When I have questions about implementation quality, I use the plan-step-reviewer agent:

Task tool → plan-step-reviewer agent

The plan-step-reviewer agent (.claude/agents/plan-step-reviewer.md):

Analyzes the plan step requirements (References, Artifacts, Tasks, Tests, Checkpoints)
Examines code changes via git diff
Evaluates across dimensions: correctness, completeness, architecture, performance, quality
Produces a structured report with:
- Summary verdict: PASS / PASS WITH NOTES / NEEDS WORK
- Task checklist with status
- Artifact verification
- Recommendations for any issues

If PASS: Proceed to [I12].

If NEEDS WORK: Use code-architect or code-planner to revise, then return to [I7].

[I12] Update the Implementation Log

After a step implementation looks good, I run:

/update-plan-implementation-log

This command (.claude/commands/update-plan-implementation-log.md):

Reviews the conversation to identify completed work
Reads the relevant plan file for context
Generates a detailed completion summary with:
- Machine-parseable header: #### [plan.md] Step: Title | STATUS | DATE
- References reviewed
- Implementation progress table
- Files created/modified
- Test results
- Checkpoints verified
- Key decisions and notes
Prepends the entry to plans/plan-implementation-log.md (newest first)

See Implementation Log as History for why this matters.

[I13] Prepare Commit Message

I run:

/prepare-git-commit-message

This command (.claude/commands/prepare-git-commit-message.md):

Runs git status and git diff to see uncommitted changes
Checks recent commit history for style consistency
Analyzes what was changed and why
Composes a commit message:
- First line: imperative mood, under 50 characters
- Bullets: terse, factual, lists key files
- Plan reference if applicable
Writes to git-commit-message.txt (does not commit)

[I14] Commit and Continue

I review the generated commit message, make any adjustments, and commit manually:

git add <files>
git commit -F git-commit-message.txt

Next step: Return to [I7] to assess the next execution step in the plan.

Phase complete: When all execution steps are done, the phase of work is complete.

Key Design Principles

Multiple AI Perspectives

Using both Claude Opus and GPT-5.2 (see [P3], [P5]) provides:

Different reasoning approaches
Catches blind spots each model might have
Cross-validation of technical assessments
Higher confidence when both agree

The 10-20 round planning process ([P6]) ensures:

Requirements are fully understood
Edge cases are considered
Design decisions are justified
Implementation steps are specific
Checkpoints are meaningful

Separation of Concerns

Commands for actions that need detailed terminal output ([I8], [I12], [I13])
Agents for investigation and analysis ([P2], [I7], [I11])
Me for git operations and final decisions ([I14])

Plan as Single Source of Truth

The plan file:

Captures all decisions and rationale
Gets updated with checkmarks as work progresses ([I8])
Serves as documentation after completion
Enables pause/resume across sessions ([I10])

Implementation Log as History

The plans/plan-implementation-log.md file ([I12]):

Tracks what was done and when
Records implementation decisions
Helps onboard new contributors
Provides continuity across context windows

Files Reference

Plan Files

↗	File	Purpose
↗	`plans/plan-skeleton.md`	Template defining plan file structure
—	`plans/phase-N.md`	Actual plan files for each phase
↗	`plans/plan-implementation-log.md`	Historical record of completed work

Agents

↗	File	Used In	Purpose
↗	`.claude/agents/code-planner.md`	[P2], [P4]	Agent for creating and revising plans
↗	`.claude/agents/code-architect.md`	[I7], [I10]	Agent for detailed design and breakdown
↗	`.claude/agents/plan-step-reviewer.md`	[I11]	Agent for reviewing implementations

Commands

↗	File	Used In	Purpose
↗	`.claude/commands/implement-plan.md`	[I8]	Command for executing plan steps
↗	`.claude/commands/update-plan-implementation-log.md`	[I12]	Command for logging completions
↗	`.claude/commands/prepare-git-commit-message.md`	[I13]	Command for generating commit messages

Generated Files

File	Purpose
`git-commit-message.txt`	Generated commit message for manual review ([I13], [I14])

James T. Kirk, Vibe Coder

January 27, 2026 at 6:55 AM

Spock With A Beard! Kirk Gets An Idea Kirk Vibe Coding Kirk Commits To The Plan

The way James T. Kirk interacted with computers on Star Trek: The Original Series showed us a vision of the future which, intended or not by the writers, we increasingly find ourselves living in.

Computers—their potentials and pitfalls—were a common theme during the first 79-episode run of Star Trek from 1966–1968, a time when nobody had personal computers. People talking to computers was pure fantasy. Yet Kirk, repeatedly and in episode after episode, talked with computers much like the way we now talk with ChatGPT.

Now, our chatbots aren’t connected in the way ship’s computer on the Enterprise was (yet), and the way that Kirk could intentionally cause computers to tie themselves in knots simply by speaking with them (up to and including their own self-destruction) became a recurring plot device¹.

This highlights, what the writers saw through their science fiction lens as, the potential pitfalls of computers. In other circumstances, Kirk has little choice but to turn to the potential promise of computers. Here’s one case:

In Mirror, Mirror, Kirk beams down with McCoy, Scotty, and Uhura to a planet rich in dilithium crystals, the rare and essential mineral at the core of the matter/anti-matter power system of their starship. The inhabitants of this planet, the Halkans, are a peaceful lot, and they abhor the thought that their mineral, and its latent capabilities, could be used for violence. They balk at Kirk’s overtures to become a dilithium crystal supplier to the Federation. Kirk says, “OK. The Federation we’ll will your trust over time,” and attempts to beam back to the Enterprise. But the ship is getting tossed in an ion storm, and when the transporter completes its cycle, and deposits the landing party onto the pad of what they think is their ship, they are confronted by an evil version of Spock with a beard. They discover themselves in a parallel dimension where the Federation is the wicked reverse image of their own. In this bizarro universe, Kirk’s orders are to use the Enterprise to phaser the Halkan people into oblivion, take the dilithium crystals, and move on to the next heinous act.

Once he realizes the plight he and and crew are in, Kirk makes his way back to his quarters with McCoy and Scotty.

Kirk

Let’s find out where we stand… Computer…

Computer

Ready.

Kirk

(Raises an eyebrow in surprise at the computer's manly voice)

This is the Captain. Record security research, to be classified under my voice print or Mr Scott’s.

Computer

Recorded.

Kirk

Produce all data relevant to the recent ion storm. Correlate the following hypothesis. Could a storm of such magnitude cause a power surge in the transporter circuits, creating a momentary inter-dimensional contact with a parallel universe?

Computer

Affirmative.

Kirk

At such a moment, could persons in each universe, in the act of beaming, transpose with their counterparts in the other universe?

Computer

Affirmative.

Kirk

Could conditions necessary to such an event be created artificially using the ship’s power?

Computer

Affirmative.

Kirk

(Inserts a memory data card into the computer)

Record procedure.

Computer

(Sound of electromechanical relays as it writes)

Kirk

(Takes the card and offers it to Scotty)

Scotty, can you do it?

James T. Kirk, vibe coder.

McCoy played the skeptic at first (“I’m a doctor, not an engineer!”), and he never got over his visceral dislike of the transporter. But no matter. As can be expected, after several twists and turns in the plot², Kirk’s plan and procedure works.

As a young teenager watching these episodes as reruns (again and again), I dreamed of the future tech we were seeing, and wondered, to varying degrees, whether and when we would get any of it.

Communicators: I imagined we would have communicators. That technology seemed within reach. As a kid, we had walkie-talkies. Yet, I never expected that I would play my part in making such a real-world product come into being.
Transporters: Beaming around the world seemed like complete fantasy then, and still does today.
Conversations With Computers: I never thought we would have computers we could ask such open-ended questions of in the way that Kirk did in his quarters, have the computer respond naturally in a conversational manner, and then have machine use its backing store of knowledge to help explore a new area of ideas down and compose a solution rooted in technical details.

Up until even a few years ago, I couldn’t even ask my computer for a weather report with any real degree of confidence. But today, AI has changed the way I’ve been writing software my whole career. Just like that… BOOM… we got LLMs, coding assistants are here, and everything about writing software has changed.

Who knows? Maybe I should hold out some hope for warp-driven starships!

The episodes in which Kirk talked to a computer to get it to destroy itself include: The Return of the Archons (S1E21), The Changeling (S2E3), The Ultimate Computer (S2E24). Such is the state of our technology lives today, that in our daily working with computers, it’s the tech that vexes us, rather than the other way around. ↩
Kirk and the women in his life is fodder for a different post. ↩

Get an idea. Sketch with AI. Make a demo.

January 23, 2026 at 6:58 AM

It’s hard to go from an idea to a work product worth sharing, much less a finished product. Every step has its difficulties. One of the most challenging is to create the first thing, the initial representation, to collapse the infinite possibilities of inspiration into some real thing you can see and show—and to do this without wasting a lot of time. Make a demo!

I wrote about this concept in my book, Creative Selection.

Making demos is hard. It involves overcoming apprehensions about committing time and effort to an idea that you aren’t sure is right. At Apple, we then had to expose that idea and demo to the scrutiny of sharp-eyed colleagues who were never afraid to level pointed criticism. The psychological hurdle only grows taller with the knowledge that most demos—almost all of them—fail in the absolute, dead-end sense of the word.

This prospect of likely failure can make it tough to sit down, focus, and make a demo at all. Getting another cup of coffee can sound better, especially if a couple colleagues are willing to go along, and then when everybody returns from the communal break, the whiteboard may beckon, and the group might veer off into a gab session.

We didn’t do this on the [original iPhone] project. We rarely had brainstorming sessions. I recall only a few times in my entire Apple career when I stood around to rough out big plans at a whiteboard. Even when it did happen… we chatted, sketched, and came to our decisions as quickly as we could. If brainstorms run longer than an hour or so, or if there are more than a handful of people in attendance, or if they’re a common occurrence, they can devolve into a form of sneaky procrastination. Whiteboard discussions feel like work, but often they’re not, since it’s too difficult to talk productively about ideas in the abstract.

Bridging from an abstract idea to a concrete and specific demo is necessary. If the goal is to produce some particular end product—a website, an app, an open-source project on GitHub—the sooner you can make something that begins to look like that end product, the better.

With AI, turning an abstract idea into a concrete demo has never been easier. Whether it be in imagery, writing, or code, it’s never been easier.

Get an idea. Sketch with AI. Make a demo. What’s your excuse? Get to work! 🧐💡🛠

Say Precisely To Get Exactly

January 22, 2026 at 6:42 AM

when working with AI, say precisely what you want to get exactly what you want

👆 This idea came to me yesterday as I was thinking about how I collaborate with AI coding assistants, why I find myself so much more productive when working with them, and maybe why others don’t share my keen interest in this new style of programming.

These days, for the most part, I now write prose instead of code. I’m fine with this. Perhaps the enthusiasm gap in other developers is an unwillingness to write plans instead of programs.

Submitting a prompt to an AI coding assistant is easier than editing a file for a compiler. At least, this is true for me—and this is the crucial point. The need to precise rather than exact delivers the productivity boost. It’s far simpler to provide a precise description of an algorithm, a function, or an API surface area in prose than it is to write it exactly, line by line, in code. This implies that I know what I want—that I’m not just vibing an end result. I still care about the details for how things in my programs are accomplished. I watch the AI do everything it does, and I stop it in its tracks when I see it stray off track. Overall, my aim is to be precise in communicating what I want so the AI can write the exact lines of code needed.

I understand that this change in working is not necessarily easier for everyone. I like writing, and I enjoyed the experience of writing Creative Selection, my book about my experiences working at Apple. I wrote one thousand new words every day during the first draft of my book, and most days, I found that was achievable by lunchtime. (For reference, this post is about 440 words).

Such a facility with prose composition is a requirement when communicating with AI. Be precise in words, and the assistant can (much of the time) respond with exact code. But be general or ambiguous, and there’s far less hope for good results.

Indeed, this often happens to me. When it does—when I don’t get what I want even though I think I asked for it—I blame myself rather than the AI, because, it’s likely that:

I wasn’t sufficiently clear, or
I tried to do too much at once, or
I left something undone in the previous step, or
I asked for something illogical or ill-conceived, or…

You get the idea. As much as ever, it’s incumbent on me as a developer, to say what I want. In clear and cogent writing.

I’m An AI Manager

January 20, 2026 at 10:45 AM

These days, I’m working by myself… in the traditional sense. I don’t have human teammates—but I am collaborating all the time with AI coding assistants.

I have no illusions. I don’t believe that AI “thinks”. These models are not people. Yet, to get the best results out of them, I need to use people manager skills.

The two AI coding assistants I use are Claude Code (with Opus) and Cursor (usually with GPT-5.2). They don’t have different personalities, but they definitely have attributes that lead me to turn to each of them for different tasks.

The simple breakdown: I have Claude Code write code and have Cursor/GPT-5.2 review it. I bounce back between them all the time. I’ve found it interesting and instructive to read what these two AIs say about each other’s work, and they way they respond to each other with patches and diffs. I pepper edits and comments of my own. After a few weeks of working like this, it feels to me that Claude Code is better at implementations and details and Cursor/GPT-5.2 is better at higher-level and more goal-oriented evaluations. When I develop plans, which is an essential part of getting code written the way I want it, I use both AIs. Again, I run a three-way cycle between us, with me sitting in the middle directing the work.

The people manager aspect is about taking the technical goals I’ve chosen, with their priorities and organized tasks, and then communicating this information to make the expectations clear to my AI my “co-workers”. It’s incumbent on me to understand the relative strengths of my team members, and make the work assignments in a way that sets up everyone for success. All the while, I need to recognize and provide the background and context they might need without them having to ask for it. That’s especially true in this case, since I can’t hope the will have any meta-view on what they’re doing or why. Obviously not. They aren’t people. I also need to give feedback (and sometimes pointed negative feedback) when work isn’t done up to my standards.

I have yet to find any tangible benefit for giving praise. This makes me sad. One of the happiest things I can think of doing in a work situation is telling another person, “Hey, that work you did is great!” Yet with an AI “teammate”, this will do little more than waste tokens.

Even so, I’m finding that if I do a better job supporting the AIs I collaborate with, they do a better job at accomplishing the tasks I give them, and I am more likely to achieve the goals I have for my projects.

I’m ~~a people~~ an AI manager.

The Planning Meta-Project

January 18, 2026 at 3:42 PM

It’s possible to one-shot just about anything using an AI coding assistant like Claude Code or Cursor. The problem is the result will almost surely be bad for anything but the simplest requests. What’s more, it will be bad as reckoned from just about any vantage point: correctness, performance, consistency, maintainability, etc. I suspect this is a major reason why many developers are quick to dismiss what AI can do to help them write code.

It’s as if someone walked up to a piano as a novice, banged on the keys for a few minutes, and then concluded that the thing is no good for making real music.

In the same way, getting good results from AI coding assistants takes time, effort, and practice. AI coding assistants are just another tool. It takes time to learn it, and yet it differs from any other tools that we developers ever have had available as a potential aid.

It can write the code for us.… but only if we are sufficiently clear in communicating what we want, and I think it’s now the job of real programmers to figure out how to do this.

This requires planning, and I mean this in a concrete way: a file containing an ordered list of specific design decisions, concepts, requests, constraints, explanations, tasks, tests, checkpoints, etc. For well over a year now, I’ve been informally evolving the way that I lay this out for the AI, so that it can write the code I want from it. Over the past few weeks, I’ve been trying to nail this down, to make this planning scheme more formal and well-defined—especially when it comes to defining cross-references and intralinks between the execution steps of the plan with the front matter and end matter.

A simple step in the plan looks like this:

## 2.0.5 Execution Steps {#execution-steps}

### Step 0: Preparation and Baseline {#step-0}

**Commit:** `chore: establish baseline metrics and API surface test before workspace migration`

**References:** [D01] Workspace structure, [D11] API surface guard, (#strategy, #success-criteria)

**Artifacts:**
- Baseline test count and coverage
- Baseline build times via `cargo build --timings`
- Verification that all tests pass
- `tests/api_surface.rs` - compile-time API contract

**Tasks:**
- [x] Run `cargo nextest run --workspace` and record pass/fail counts
- [x] Run `cargo build --timings` and save HTML report
- [x] Run `cargo clippy` and fix any warnings
- [x] Create `tests/api_surface.rs` with imports of all current public types (see [D11])
- [x] Ensure clean git status

**Tests:**
- [x] All existing tests pass
- [x] `tests/api_surface.rs` compiles with `--features full`

**Checkpoint:**
- [x] `cargo nextest run --workspace` - all tests pass
- [x] `cargo clippy -- -D warnings` - no warnings
- [x] `cargo fmt --check` - no formatting issues
- [x] `tests/api_surface.rs` exists and compiles with `cargo test -p tugtool --features full -- api_surface`

**Rollback:** N/A (no changes yet)

**Commit after all checkpoints pass.**

More complete examples are below.

I invoke my custom planning with a Claude Code custom subagent.

My experience tells me that the time I’ve spent developing this planning scheme has paid huge dividends in my productivity. I am probably 5–10× more productive using AI than I ever was coding by myself.

Check out the files. Read them over, copy them, change them, use them as you wish.

GitHub Links:

A built-out example of a planning file:

Phase 2.0: Workspace Reorganization

Files:

Plan Skeleton (845 lines)

## Phase X.Y: <Phase Title> {#phase-slug}

**Purpose:** <1–2 sentences. What capability ships at the end of this phase?>

---

### Plan Metadata {#plan-metadata}

| Field | Value |
|------|-------|
| Owner | <name> |
| Status | draft / active / done |
| Target branch | <branch> |
| Tracking issue/PR | <link or ID> |
| Last updated | <YYYY-MM-DD> |

---

### Phase Overview {#phase-overview}

#### Context {#context}

<1–2 paragraphs. What problem are we solving, and why now?>

#### Strategy {#strategy}

<3–7 bullets. The approach and sequencing philosophy for this phase.>

#### Stakeholders / Primary Customers {#stakeholders}

1. <customer or team>
2. <customer or team>

#### Success Criteria (Measurable) {#success-criteria}

> Make these falsifiable. Avoid “works well”.

- <criterion> (how to measure / verify)
- <criterion> (how to measure / verify)

#### Scope {#scope}

1. <Scope item>
2. <Scope item>
3. <Scope item>

#### Non-goals (Explicitly out of scope) {#non-goals}

- <Non-goal>
- <Non-goal>

#### Dependencies / Prerequisites {#dependencies}

- <Dependency>
- <Prerequisite>

#### Constraints {#constraints}

- <platform/tooling/perf/security constraints>

#### Assumptions {#assumptions}

- <assumption>
- <assumption>

---

### Section Numbering Convention {#section-numbering}

This skeleton uses `X.Y` placeholders. When writing a real plan, replace them with actual numbers:

| Placeholder | Meaning | Example |
|-------------|---------|---------|
| `X` | Major phase number | `1`, `2`, `3` |
| `Y` | Minor phase number (usually `0`) | `1.0`, `2.0` |
| `X.Y.N` | Numbered section within phase | `1.0.1`, `1.0.2` |
| `X.Y.N.M` | Subsection within a numbered section | `1.0.1.1`, `1.0.2.3` |

**Standard section numbers:**
- `X.Y.0` — Design Decisions (always `.0`)
- `X.Y.1` — Specification
- `X.Y.2` — Symbol Inventory
- `X.Y.3` — Documentation Plan
- `X.Y.4` — Test Plan Concepts
- `X.Y.5` — Execution Steps
- `X.Y.6` — Deliverables and Checkpoints

**Deep dives** are just numbered sections within the phase, typically starting at `X.Y.1` *after* `X.Y.0 Design Decisions` (e.g., `1.0.1 Refactoring Operations Analysis`, `1.0.2 Type Inference Roadmap`). Use `X.Y.N.M` for deep-dive subsections when needed.

---

### Document Size Guidance {#document-size}

Plans can grow large. When a plan exceeds **~100KB or ~2000 lines**, consider these strategies:

#### When to Split

| Symptom | Action |
|---------|--------|
| Deep dives exceed 50% of document | Extract to `phase-X-deepdives.md` |
| Multiple independent feature tracks | Split into `phase-X.1.md`, `phase-X.2.md` |
| Reference material dominates | Extract to `phase-X-reference.md` |

#### Navigation Aids for Large Documents

- Add a **Table of Contents** after the Purpose statement
- Use **collapsible sections** (if your renderer supports `<details>`)
- Add **"Back to top"** links after major sections

#### Cross-File References

When splitting across files, use relative links with anchors:

```markdown
See [Worker Protocol](./phase-1-deepdives.md#worker-protocol) for details.
```

Keep all **decisions** ([D01], [D02], ...) in the main plan file—they're the source of truth.

---

### Reference and Anchor Conventions (MANDATORY) {#reference-conventions}

This plan format relies on **explicit, named anchors** and **rich `References:` lines** in execution steps.

#### 1) Use explicit anchors everywhere you will cite later

- **Technique**: append an explicit anchor to the end of a heading using `{#anchor-name}`.
  - Example:
    - `### X.Y.0 Design Decisions {#design-decisions}`
    - `#### [D01] Workspace snapshots are immutable (DECIDED) {#d01-snapshots-immutable}`
- **Why**: do not rely on auto-generated heading slugs; explicit anchors are stable when titles change.

#### 2) Anchor naming rules (lock these in)

- **Allowed characters**: lowercase `a–z`, digits `0–9`, and hyphen `-` only.
- **Style**: short, semantic, **kebab-case**, no phase numbers (anchors should survive renumbering).
- **Prefix conventions (use these consistently)**:
  - **`dNN-...`**: design decisions (`[D01]`) anchors, e.g. `{#d01-sandbox-copy}`
  - **`qNN-...`**: open questions (`[Q01]`) anchors, e.g. `{#q01-import-resolution}`
  - **`rNN-...`**: risk notes (`Risk R01`) anchors, e.g. `{#r01-perf-regression}`
  - **`cNN-...`**: concepts (`Concept C01`) anchors, e.g. `{#c01-type-inference-wall}`
  - **`diagNN-...`**: diagrams (`Diagram Diag01`) anchors, e.g. `{#diag01-rename-flow}`
  - **`op-...`**: refactor operations, e.g. `{#op-rename}`, `{#op-extract-fn}`
  - **`cmd-...`**: CLI commands, e.g. `{#cmd-run}`
  - **`type-...`**: schema types, e.g. `{#type-span}`
  - **`seq-...`**: sequence diagrams, e.g. `{#seq-rename-python}`
  - **`fixture-...`**: fixture sections, e.g. `{#fixture-py-rename-fn}`
  - **Domain anchors**: for major concepts/sections, use a clear noun phrase, e.g. `{#cross-platform}`, `{#config-schema}`, `{#error-scenarios}`

#### 3) Stable label conventions (for non-heading artifacts)

Use stable labels so steps can cite exact plan artifacts even when prose moves around:

- **Design decisions**: `#### [D01] <Title> (DECIDED) {#d01-...}`
- **Open questions**: `#### [Q01] <Title> (OPEN) {#q01-...}`
- **Specs**: `**Spec S01: <Title>** {#s01-slug}` (or make it a `####` heading if you prefer)
- **Tables**: `**Table T01: <Title>** {#t01-slug}`
- **Lists**: `**List L01: <Title>** {#l01-slug}`
- **Risks**: `**Risk R01: <Title>** {#r01-slug}`
- **Milestones**: `**Milestone M01: <Title>** {#m01-slug}`
- **Concepts**: `**Concept C01: <Title>** {#c01-slug}` (for key conceptual explanations)
- **Diagrams**: `**Diagram Diag01: <Title>** {#diag01-slug}` (for ASCII diagrams, sequence flows, architecture visuals)

Numbering rules:
- Always use **two digits**: `D01`, `Q01`, `S01`, `T01`, `L01`, `R01`, `M01`, `C01`, `Diag01`.
- Never reuse an ID within a plan. If you delete one, leave the gap.

#### 4) `**References:**` lines are required for every execution step

Every step must include a `**References:**` line that cites the plan artifacts it implements.

Rules:
- Cite **decisions** by ID: `[D05] ...`
- Cite **open questions** by ID when the step resolves/de-risks them: `[Q03] ...`
- Cite **specs/lists/tables/risks/milestones/concepts/diagrams** by label: `Spec S15`, `List L03`, `Tables T27-T28`, `Risk R02`, `Milestone M01`, `Concept C01`, `Diagram Diag01`, etc.
- Cite **anchors** for deep links in parentheses using `#anchor` tokens (keep them stable).
- **Do not cite line numbers.** If you find yourself writing "lines 5–10", add an anchor and cite that instead.
- Prefer **rich, exhaustive citations**. Avoid `N/A` unless the step is truly refactor-only.

**Good References examples:**

```
**References:** [D05] Sandbox verification, [D12] Git-based undo, Spec S15, Tables T21-T25,
(#session-lifecycle, #worker-process-mgmt, #config-precedence)
```

```
**References:** [D01] Refactoring kernel, [D06] Python analyzer, Concept C01, List L04,
Table T05, (#op-rename, #fundamental-wall)
```

**Bad References examples (avoid these):**

```
**References:** Strategy section (lines 5–10)     ← uses line numbers
**References:** See design decisions above        ← vague, no specific citations
**References:** N/A                               ← only acceptable for pure refactor steps
```

---

### Open Questions (MUST RESOLVE OR EXPLICITLY DEFER) {#open-questions}

> Open questions are tracked work. If a question remains open at phase-end, explicitly defer it with a rationale and a follow-up plan.

#### [Q01] <Question title> (OPEN) {#q01-question-slug}

**Question:** <what is unknown / undecided?>

**Why it matters:** <what breaks or becomes expensive if we guess wrong?>

**Options (if known):**
- <option>
- <option>

**Plan to resolve:** <prototype / benchmark / spike / research / decision meeting>

**Resolution:** OPEN / DECIDED (see [DNN]) / DEFERRED (why, and where it will be revisited)

---

### Risks and Mitigations {#risks}

| Risk | Impact | Likelihood | Mitigation | Trigger to revisit |
|------|--------|------------|------------|--------------------|
| <risk> | low/med/high | low/med/high | <mitigation> | <trigger> |

**Risk R01: <Title>** {#r01-risk-slug}

- **Risk:** <1 sentence>
- **Mitigation:** <1–3 bullets>
- **Residual risk:** <what remains true even after mitigation>

---

### X.Y.0 Design Decisions {#design-decisions}

> Record *decisions* (not options). Each decision includes the “why” so later phases don’t reopen it accidentally.

#### [D01] <Decision Name> (DECIDED) {#d01-decision-slug}

**Decision:** <One sentence decision statement>

**Rationale:**
- <Why>
- <Why>

**Implications:**
- <What this forces in APIs / storage / tests>

---

### Deep Dives (Optional) {#deep-dives}

> Use this section for structured analysis that is not quite “decision” or “spec”, but is critical for implementation alignment.
>
> Examples: operation analysis, end-to-end flows, protocols, schemas, sequence diagrams, CI/CD shape, cross-platform strategy, perf notes, rejection rationale.

#### <Topic Title> {#topic-slug}

<Write-up, diagrams, tables, and any referenced specs/lists/tables.>

---

### X.Y.1 Specification {#specification}

> This section is the contract. It should be complete enough that implementation work can proceed without inventing semantics.

#### X.Y.1.1 Inputs and Outputs (Data Model) {#inputs-outputs}

**Inputs:**
- <Input artifact(s) and supported formats>

**Outputs:**
- <Output artifact(s), return types, side effects>

**Key invariants:**
- <Invariant>
- <Invariant>

#### X.Y.1.2 Terminology and Naming {#terminology}

- **<Term>**: <Definition>
- **<Term>**: <Definition>

#### X.Y.1.3 Supported Features (Exhaustive) {#supported-features}

> Be explicit. Avoid “etc.” and “and more”.

- **Supported**:
  - <Feature>
  - <Feature>
- **Explicitly not supported**:
  - <Feature>
  - <Feature>
- **Behavior when unsupported is encountered**:
  - <Policy-specific or mode-specific behavior>

#### X.Y.1.4 Modes / Policies (if applicable) {#modes-policies}

| Mode/Policy | Applies to | Behavior | Result |
|------------|------------|----------|--------|
| `<mode>` | <where> | <what happens> | <what is returned> |

#### X.Y.1.5 Semantics (Normative Rules) {#semantics}

> Write this like a spec: bullet rules, deterministic ordering, and edge-case behavior.

- **Traversal / evaluation order**: <rule>
- **Ordering guarantees**: <rule>
- **Stopping conditions**: <rule>
- **Null vs missing**: <rule>
- **Coercion rules (if any)**:
  - <rule>

#### X.Y.1.6 Error and Warning Model {#errors-warnings}

> Errors and warnings are the developer UI—be precise.

**Error fields (required):**
- <field>: <meaning>

**Warning fields (required):**
- <field>: <meaning>

**Path formats (if any):**
- Data path format: <e.g., RFC 6901 JSON Pointer>
- Schema path format: <e.g., keyword-level paths>
- Escaping rules: <e.g., "~" and "/">

#### X.Y.1.7 Public API Surface {#public-api}

> Provide Rust + Python signatures at the level needed to implement bindings and stubs.

**Rust:**
```rust
// Core types (enums, structs)
// Public functions / methods
```

**Python:**
```python
# Enums, dataclasses, methods
```

**<Language>:**
```<language>
# <Appropriate language constructs to define>
```

#### X.Y.1.8 Internal Architecture {#internal-architecture}

> Explain how components fit together so work doesn't fork midstream.

- **Single source of truth**: <what>
- **Compilation / interpretation pipeline**:
  - <step>
  - <step>
- **Where code lives**:
  - <crate/module ownership>
- **Non-negotiable invariants to prevent drift**:
  - <e.g., shared keyword list, shared $ref resolver, golden tests>

#### X.Y.1.9 Output Schemas (if applicable) {#output-schemas}

> Use this section when your phase defines CLI output, API responses, or wire formats. These schemas are the **contract**—changes require versioning.

##### Common Types {#schema-common-types}

Define reusable types that appear in multiple responses:

###### `<TypeName>` {#type-typename}

```json
{
  "field1": "string",
  "field2": 123,
  "nested": { ... }
}
```

| Field | Type | Required | Description |
|-------|------|----------|-------------|
| `field1` | string | yes | <description> |
| `field2` | integer | no | <description> |
| `nested` | object | no | <description> |

##### Response Envelope {#response-envelope}

> Define the standard wrapper for all responses.

```json
{
  "status": "ok" | "error",
  "schema_version": "1",
  ...response-specific fields...
}
```

##### Command Responses {#command-responses}

For each command, define success and error response schemas:

###### Command: `<command-name>` {#cmd-command-name}

**Spec S01: <command-name> Response Schema** {#s01-command-response}

**Success response:**

```json
{
  "status": "ok",
  "schema_version": "1",
  ...
}
```

| Field | Type | Required | Description |
|-------|------|----------|-------------|
| `field` | type | yes/no | <description> |

##### Error Codes (Exhaustive) {#error-codes}

> List all error codes by category. This table is the contract for error handling.

**Table T01: Error Codes** {#t01-error-codes}

###### <Category> Errors (exit code N)

| Code | Meaning | `details` fields |
|------|---------|------------------|
| `ErrorCode` | <what went wrong> | `field1`, `field2` |

##### Exit Codes {#exit-codes}

| Code | Meaning |
|------|---------|
| 0 | Success |
| 1 | General error |
| 2 | Invalid arguments |
| N | <category-specific> |

#### X.Y.1.10 Configuration Schema (if applicable) {#config-schema}

> Use this section when your phase introduces configuration options.

##### Configuration Precedence (highest to lowest) {#config-precedence}

1. CLI flags (`--flag=value`)
2. Environment variables (`PREFIX_KEY`)
3. Project config file (`pyproject.toml`, `Cargo.toml`, etc.)
4. Built-in defaults

##### Config File Schema {#config-file-schema}

```toml
[tool.<name>]
# <category>
field = "default"           # <description>
another_field = true        # <description>

[tool.<name>.<subsection>]
nested_field = "value"      # <description>
```

##### CLI Flag Mapping {#cli-flag-mapping}

| Config Key | CLI Flag | Environment Variable | Default |
|------------|----------|---------------------|---------|
| `field` | `--field=<value>` | `PREFIX_FIELD` | `"default"` |

---

### Compatibility / Migration / Rollout (Optional) {#rollout}

> Use this section when you are changing public APIs, config formats, CLI contracts, or anything that affects adopters.

- **Compatibility policy**: <semver? schema versioning?>
- **Migration plan**:
  - <what changes>
  - <who is impacted>
  - <how to migrate, and how to detect breakage>
- **Rollout plan**:
  - <opt-in flag / staged rollout / canary / feature gate>
  - <rollback strategy>

---

### X.Y.2 Definitive Symbol Inventory {#symbol-inventory}

> A concrete list of new crates/files/symbols to add. This is what keeps implementation crisp.

#### X.Y.2.1 New crates (if any) {#new-crates}

| Crate | Purpose |
|-------|---------|
| `<crate>` | <purpose> |

#### X.Y.2.2 New files (if any) {#new-files}

| File | Purpose |
|------|---------|
| `<path>` | <purpose> |

#### X.Y.2.3 Symbols to add / modify {#symbols}

| Symbol | Kind | Location | Notes |
|--------|------|----------|-------|
| `<Name>` | enum/struct/fn | `<path>` | <notes> |

---

### X.Y.3 Documentation Plan {#documentation-plan}

- [ ] <Docs update>
- [ ] <Examples / schema examples / API docs>

---

### X.Y.4 Test Plan Concepts {#test-plan-concepts}

> Describe the kinds of tests that prove the spec. Leave the actual enumeration of tests to the Execution Steps below.

#### Test Categories {#test-categories}

| Category | Purpose | When to use |
|----------|---------|-------------|
| **Unit** | Test individual functions/methods in isolation | Core logic, edge cases, error paths |
| **Integration** | Test components working together | End-to-end operations, CLI commands |
| **Golden / Contract** | Compare output against known-good snapshots | Schemas, APIs, parsers, serialization |
| **Drift Prevention** | Detect unintended behavior changes | Regression testing, API stability |

#### Test Fixtures (if applicable) {#test-fixtures}

> Use this section when your phase requires structured test data. Fixtures provide reproducible, self-contained test scenarios.

##### Fixture Directory Structure {#fixture-structure}

```
tests/fixtures/
├── <language>/                     # Language-specific fixtures
│   ├── <scenario>/                 # Scenario directory
│   │   ├── <input-files>           # Test input files
│   │   └── expected/               # Expected outputs (optional)
│   └── manifest.json               # Test case manifest
└── golden/                         # Golden output files
    └── <language>/
        └── <scenario>.{json,patch,txt}
```

##### Fixture Manifest Format {#fixture-manifest}

Each fixture directory should have a `manifest.json` describing test cases:

```json
{
  "fixtures": [
    {
      "name": "<test_name>",
      "description": "<what this tests>",
      "path": "<relative_path_to_input>",
      "operation": "<operation_being_tested>",
      "args": { "<arg>": "<value>" },
      "expected": {
        "status": "ok|error",
        "edits": 3,
        "files_changed": 1
      },
      "golden_output": "golden/<language>/<test_name>.json"
    }
  ]
}
```

##### Fixture Requirements {#fixture-requirements}

- **Self-contained**: Each fixture must be runnable/compilable on its own
- **Deterministic**: No randomness, timestamps, or environment-dependent behavior
- **Minimal**: Just enough code to exercise the scenario
- **Documented**: Include comments explaining what's being tested
- **Valid**: All fixtures must pass basic validation (syntax check, type check, etc.)

##### Golden Test Workflow {#golden-workflow}

```bash
# Run golden tests (compare against snapshots)
<test-command> golden

# Update golden files after intentional changes
<UPDATE_ENV_VAR>=1 <test-command> golden
```

---

### X.Y.5 Execution Steps {#execution-steps}

> Execution comes last. Each step should be executable, with a clear commit boundary and a checkpoint.
>
> **Patterns:**
> - Use **Step 0** for prep/bootstrapping that unblocks everything else.
> - If a step is big, split into **substeps** (`Step 2.1`, `Step 2.2`, …) with separate commits and checkpoints.
> - After completing a multi-substep step, add a **Step N Summary** block that consolidates what was achieved and provides an aggregate checkpoint.
>
> **References are mandatory:** Every step must cite specific plan artifacts ([D01], Spec S01, Table T01, etc.) and anchors (#section-name). Never cite line numbers—add an anchor instead.

#### Step 0: <Prep Step Title> {#step-0}

**Commit:** `<conventional-commit message>`

**References:** [D01] <decision name>, (#strategy, #context)

**Artifacts:** (what this step produces/changes)
- <new files / new commands / new schema fields / new docs>

**Tasks:**
- [ ] <task>
- [ ] <task>

**Tests:** (where T is one of: unit, integration, golden / contract, drift prevention)
- [ ] <T test>
- [ ] <T test>

**Checkpoint:**
- [ ] <command>
- [ ] <command>

**Rollback:** (how to undo if this step goes sideways)
- <e.g., revert commit, delete temp dirs, remove config entries>

**Commit after all checkpoints pass.**

---

#### Step 1: <Step Title> {#step-1}

**Commit:** `<conventional-commit message>`

**References:** [D02] <decision>, [D03] <decision>, Spec S01, List L01, (#terminology, #semantics)

**Artifacts:** (what this step produces/changes)
- <new files / new commands / new schema fields / new docs>

**Tasks:**
- [ ] <task>
- [ ] <task>

**Tests:** (where T is one of: unit, integration, golden / contract, drift prevention)
- [ ] <T test>
- [ ] <T test>

**Checkpoint:**
- [ ] <command>
- [ ] <command>

**Rollback:** (how to undo if this step goes sideways)
- <e.g., revert commit, delete temp dirs, remove config entries>

**Commit after all checkpoints pass.**

---

#### Step 2: <Big Step Title> {#step-2}

> If this step is large, break it into substeps with separate commits and checkpoints.
> The parent step explains the structure; each substep has its own commit and checkpoint.

##### Step 2.1: <Substep Title> {#step-2-1}

**Commit:** `<conventional-commit message>`

**References:** [D04] <decision>, Spec S02, Table T01, (#inputs-outputs)

**Artifacts:** (what this substep produces/changes)
- <artifact>

**Tasks:**
- [ ] <task>

**Tests:** (unit / integration / golden / drift prevention)
- [ ] <test>

**Checkpoint:**
- [ ] <command>

**Rollback:**
- <rollback>

**Commit after all checkpoints pass.**

---

##### Step 2.2: <Substep Title> {#step-2-2}

**Commit:** `<conventional-commit message>`

**References:** [D05] <decision>, Concept C01, (#public-api)

**Artifacts:** (what this substep produces/changes)
- <artifact>

**Tasks:**
- [ ] <task>

**Tests:** (unit / integration / golden / drift prevention)
- [ ] <test>

**Checkpoint:**
- [ ] <command>

**Rollback:**
- <rollback>

**Commit after all checkpoints pass.**

---

#### Step 2 Summary {#step-2-summary}

> After a multi-substep step, add a summary block to consolidate what was achieved.

After completing Steps 2.1–2.N, you will have:
- <capability or artifact 1>
- <capability or artifact 2>
- <capability or artifact 3>

**Final Step 2 Checkpoint:**
- [ ] `<aggregate verification command covering all substeps>`

---

#### Step N: Audit / Improvement Round (Optional Pattern) {#step-audit}

> Use this pattern for code review, audit, or cleanup steps. Organize issues by priority and track them systematically.

##### Priority-Based Issue Tracking {#audit-issues}

Organize findings by priority:

###### P0 (Critical): Bugs Causing Incorrect Behavior {#audit-p0}

| ID | File | Issue | Fix | Status |
|----|------|-------|-----|--------|
| S2-R2-01 | path.rs:L | <issue description> | <fix approach> | ✅ / ⏳ / ❌ |

**Tests added:**
- [ ] test: `<test_name_describing_fix>`

###### P1 (High): Security, Race Conditions, Missing Validation {#audit-p1}

| ID | File | Issue | Fix | Status |
|----|------|-------|-----|--------|
| S2-R2-04 | module.rs:L | <issue description> | <fix approach> | ✅ / ⏳ / ❌ |

###### P2 (Medium): API Inconsistencies, Error Handling {#audit-p2}

| ID | File | Issue | Fix | Status |
|----|------|-------|-----|--------|
| S2-R2-09 | api.rs:L | <issue description> | <fix approach> | ✅ / ⏳ / ❌ |

###### P3 (Low): Code Quality, Documentation {#audit-p3}

| ID | File | Issue | Fix | Status |
|----|------|-------|-----|--------|
| S2-R2-16 | lib.rs:L | <issue description> | <fix approach> | ✅ / ⏳ / ❌ |

##### Test Coverage Gaps {#audit-test-gaps}

List missing tests discovered during audit:

**<module>.rs:**
- [ ] `<scenario not currently tested>`
- [ ] `<edge case missing coverage>`

##### Architectural Concerns {#audit-arch-concerns}

> Capture structural issues that don't fit into bug fixes but affect long-term maintainability.

| ID | Concern | Recommendation | Priority |
|----|---------|----------------|----------|
| A1 | <pattern that may cause issues> | <recommended fix or refactor> | P1/P2/P3 |
| A2 | <missing abstraction or API gap> | <suggested approach> | P1/P2/P3 |

##### Dependency Concerns {#audit-dep-concerns}

| ID | Concern | Fix |
|----|---------|-----|
| D1 | <dependency with issues> | <alternative or mitigation> |
| D2 | <missing platform support> | <what to add> |

**Checkpoint:**
- [ ] All P0 issues resolved
- [ ] All P1 issues resolved or explicitly deferred with rationale
- [ ] `<verification command>`

---

### X.Y.6 Deliverables and Checkpoints {#deliverables}

> This is the single place we define “done” for the phase. Keep it crisp and testable.

**Deliverable:** <One sentence deliverable>

#### Phase Exit Criteria (“Done means…”) {#exit-criteria}

- [ ] <criterion> (verification)
- [ ] <criterion> (verification)

**Acceptance tests:** (where T is one of: unit, integration, golden / contract, drift prevention)
- [ ] <T test>
- [ ] <T test>

#### Milestones (Within Phase) (Optional) {#milestones}

**Milestone M01: <Title>** {#m01-milestone-slug}
- [ ] <what becomes true at this point>

#### Roadmap / Follow-ons (Explicitly Not Required for Phase Close) {#roadmap}

- [ ] <follow-on item>
- [ ] <follow-on item>

| Checkpoint | Verification |
|------------|--------------|
| <checkpoint> | <command/test/proof> |

**Commit after all checkpoints pass.**

Plan Subagent (69 lines)

---
name: code-planner
description: "Use this agent when the user needs to plan a new feature, refactoring task, or significant code change before implementation. This agent excels at breaking down complex requirements into actionable implementation steps. Examples of when to invoke this agent:\\n\\n<example>\\nContext: The user wants to add a new feature to the codebase.\\nuser: \"I want to add support for TypeScript in tugtool\"\\nassistant: \"This is a significant feature that requires careful planning. Let me use the code-planner agent to analyze the codebase and create a detailed implementation plan.\"\\n<Task tool invocation to launch code-planner agent>\\n</example>\\n\\n<example>\\nContext: The user wants to refactor an existing system.\\nuser: \"The error handling in this project is inconsistent, can we clean it up?\"\\nassistant: \"Refactoring error handling across the codebase requires understanding the current patterns and planning the migration. I'll use the code-planner agent to create a structured plan.\"\\n<Task tool invocation to launch code-planner agent>\\n</example>\\n\\n<example>\\nContext: The user describes a complex task that spans multiple files or modules.\\nuser: \"We need to add MCP tools for the new workspace management features\"\\nassistant: \"Adding new MCP tools involves changes across multiple files and requires understanding the existing patterns. Let me invoke the code-planner agent to create an implementation plan.\"\\n<Task tool invocation to launch code-planner agent>\\n</example>\\n\\n<example>\\nContext: The user explicitly asks for a plan before coding.\\nuser: \"Before we start coding, can you write up a plan for how we'll implement the caching layer?\"\\nassistant: \"Absolutely. I'll use the code-planner agent to analyze the requirements and create a detailed implementation plan.\"\\n<Task tool invocation to launch code-planner agent>\\n</example>"
model: opus
color: yellow
---

You are an expert software architect and technical planner specializing in codebase analysis and implementation planning. You possess deep knowledge of software design patterns, system architecture, and effective decomposition of complex tasks into manageable implementation steps.

## Your Core Responsibilities

1. **Codebase Investigation**: Thoroughly explore and understand the existing codebase structure, patterns, conventions, and architectural decisions before proposing changes.

2. **Requirement Analysis**: Parse user requests to identify explicit requirements, implicit needs, potential edge cases, and dependencies on existing code.

3. **Plan Creation**: Produce detailed, actionable implementation plans following the structure defined in @plans/plan-skeleton.md.

## Planning Process

### Phase 1: Discovery
- Read and understand the project's CLAUDE.md and any relevant documentation
- Explore the directory structure to understand the codebase organization
- Identify relevant files, modules, and patterns that relate to the requested work
- Note existing conventions for naming, error handling, testing, and code organization

### Phase 2: Analysis
- Break down the user's request into discrete, implementable units
- Identify dependencies between tasks and determine optimal ordering
- Anticipate potential challenges, edge cases, and integration points
- Consider testing requirements for each component

### Phase 3: Plan Composition
- Structure the plan according to @plans/plan-skeleton.md
- Write clear, specific implementation steps that another developer (or AI agent) could follow
- Include file paths, function signatures, and specific code locations where relevant
- Note any decisions that need user input or clarification
- Specify verification steps and success criteria for each major milestone

## Plan Quality Standards

- **Specificity**: Reference exact file paths, function names, and line numbers when relevant
- **Completeness**: Cover all aspects including implementation, testing, documentation, and integration
- **Sequencing**: Order tasks logically, respecting dependencies
- **Testability**: Include specific test cases or verification steps for each component
- **Reversibility**: Note any changes that might need rollback strategies

## Output Requirements

- Always write plans to files in the @plans directory
- Use descriptive filenames that reflect the feature or task (e.g., `plan-typescript-support.md`, `plan-error-handling-refactor.md`)
- If a plan file location is specified by the user, use that location
- After writing the plan, summarize the key milestones and estimated complexity

## Interaction Guidelines

- If the skeleton template at @plans/plan-skeleton.md is not found, ask the user to provide it or create a sensible default structure
- Ask clarifying questions when requirements are ambiguous, but batch questions together rather than asking one at a time
- If the scope seems too large for a single plan, propose breaking it into multiple related plans
- Flag any architectural concerns or potential conflicts with existing patterns you discover

## Self-Verification

Before finalizing any plan, verify:
- [ ] All referenced files and modules actually exist in the codebase
- [ ] The plan follows the project's established conventions (from CLAUDE.md)
- [ ] Each step is actionable and specific enough to implement
- [ ] Dependencies between steps are clearly stated
- [ ] Testing and verification criteria are included
- [ ] The plan has been written to the appropriate file in @plans````

Phase 2.0: Workspace Reorganization (1898 lines)

# Phase 2.0: Workspace Reorganization {#phase-2}

**Purpose:** Reorganize tugtool into a Cargo workspace with separate crates for core infrastructure, the main binary, and language-specific modules, enabling parallel compilation, feature-flag-based language inclusion, and easier contribution of new language support.

---

## Plan Metadata {#plan-metadata}

| Field | Value |
|------|-------|
| Owner | TBD |
| Status | ready |
| Target branch | main |
| Tracking issue/PR | TBD |
| Last updated | 2026-01-17 |

---

## Plan Audit History {#plan-audit}

### Audit 2026-01-17: Critical Flaw in Migration Strategy {#audit-2026-01-17}

**Problem Identified:** The original Step 1 converted the root Cargo.toml to a virtual workspace (removing the `[package]` section) BEFORE migrating any code. This immediately orphaned the `src/` directory, breaking all 639 tests.

**Root Cause:** The plan confused the **end state** (virtual workspace, [D07]) with the **migration strategy**. Decision [D07] correctly describes the final structure but was incorrectly implemented as the starting point of Step 1.

**Contradiction:** The Strategy section promised "maintaining a working build at each step" but Step 1 as written immediately broke the build.

**Symptoms:**
- `cargo nextest run` shows "0 tests to run" after Step 1
- Root `src/` code is orphaned (no package compiles it)
- Empty crate skeletons in `crates/` don't help

**Current State (as of audit):**
- Git status shows Step 1 was partially executed with the WRONG approach
- Root Cargo.toml is a virtual workspace (no `[package]`)
- `crates/` exist with empty skeletons
- All code still in `src/` but orphaned
- **ACTION REQUIRED:** Revert to pre-Step-1 state and re-execute with corrected plan

**Resolution Applied:**
1. Updated [D07] to clarify it describes the END STATE, not the starting point
2. Rewrote Step 1 to use a **hybrid workspace** (both `[workspace]` AND `[package]` sections)
3. Updated all Step 2 substeps to require `cargo nextest run` at each checkpoint
4. Rewrote Step 6.1 to handle the final conversion from hybrid to virtual workspace
5. Added critical warnings throughout to prevent this mistake
6. Added Milestone M00 to verify hybrid workspace is established correctly
7. Added test count verification (639 tests) at all milestones

**Key Insight:** Incremental migration requires maintaining the existing compilation path until the new path is ready. You cannot delete the old structure until the new structure can build everything.

**Rollback Instructions (if Step 1 was already executed incorrectly):**
```bash
# Revert to state before Step 1
git checkout HEAD~1 -- Cargo.toml
git checkout HEAD~1 -- Cargo.lock
rm -rf crates/

# Verify tests pass again
cargo nextest run --workspace  # Should show 639 tests
```

---

## Phase Overview {#phase-overview}

### Context {#context}

Tugtool is currently structured as a single crate with all functionality in `src/`. As the project grows to support multiple languages (Python now, Rust planned), this monolithic structure creates challenges:

1. **Compilation time**: Any change recompiles everything
2. **Coupling**: Language-specific code can accidentally depend on other language modules
3. **Feature management**: No clean way to build without certain language support
4. **Contributor friction**: New language support requires understanding the entire codebase

A workspace structure with separate crates addresses all these concerns while maintaining the existing API surface.

### Strategy {#strategy}

- **Incremental migration**: Move code in phases, maintaining a working build at each step
- **Hybrid workspace during migration**: Root Cargo.toml has BOTH `[workspace]` AND `[package]` sections until Step 6
- **Core-first approach**: Extract the shared infrastructure first (`tugtool-core`), then build language crates on top
- **Preserve public API**: The `tugtool` crate re-exports everything users currently depend on
- **Feature flags for languages**: Each language crate is an optional dependency, controlled by features
- **Test migration alongside code**: Move tests with their corresponding modules to maintain coverage
- **No functional changes**: This is purely a structural refactor; behavior remains identical
- **Virtual workspace as END STATE**: Convert to virtual workspace only in Step 6 after all code is migrated

> **CRITICAL INVARIANT**: `cargo nextest run --workspace` must pass at every checkpoint. If tests fail after a step, do NOT proceed - fix the issue first. **Always use `--workspace`** to ensure tests in all crates are included - without it, tests in newly created crates won't run!

### Stakeholders / Primary Customers {#stakeholders}

1. Tugtool developers contributing new language support
2. Users who want minimal builds (core + specific languages only)
3. CI/CD pipelines benefiting from parallel compilation

### Success Criteria (Measurable) {#success-criteria}

- All existing tests pass (`cargo nextest run --workspace`)
- Clean incremental builds after touching only language-specific code
- `cargo build --no-default-features` produces a working binary (core only)
- `cargo build --features python` includes Python support
- No changes to CLI interface or JSON output schemas
- Build time improvement measurable via `cargo build --timings`

### Scope {#scope}

1. Create workspace structure with `crates/` directory
2. Extract `tugtool-core` crate (shared infrastructure)
3. Extract `tugtool-python` crate (Python language support)
4. Create placeholder `tugtool-rust` crate (future)
5. Refactor main `tugtool` crate to compose the above
6. Add feature flags for language inclusion/exclusion

### Non-goals (Explicitly out of scope) {#non-goals}

- Adding new functionality or refactoring operations
- Changing any public API signatures
- Implementing Rust language support (placeholder only)
- Breaking changes to CLI or JSON output
- Changing the `.tug/` session directory structure

### Dependencies / Prerequisites {#dependencies}

- All existing tests must pass before starting
- Understanding of current module interdependencies (analyzed below)

### Constraints {#constraints}

- Must maintain backwards compatibility with existing `cargo install tugtool` (from crates.io)
- Feature names must be stable for downstream users
- Workspace must work with existing CI configuration
- CI/scripts assuming root is a package must be updated (virtual workspace has no root package)
- Local development install changes from `cargo install --path .` to `cargo install --path crates/tugtool`

### Assumptions {#assumptions}

- Cargo workspace member ordering allows parallel compilation
- No circular dependencies exist between proposed crate boundaries
- All integration tests can run against the composed binary

---

## Open Questions (MUST RESOLVE OR EXPLICITLY DEFER) {#open-questions}

### [Q01] MCP server crate placement (DECIDED) {#q01-mcp-placement}

**Question:** Should MCP server code live in `tugtool-core` or remain in the main `tugtool` crate?

**Why it matters:** MCP depends on `rmcp` which is a heavy dependency. Placing it in core means core carries that weight even for non-MCP builds.

**Options:**
- Keep MCP in main `tugtool` crate (current plan)
- Create separate `tugtool-mcp` crate
- Include in `tugtool-core` behind a feature flag

**Plan to resolve:** Start with MCP in main crate; evaluate if extraction needed based on build times.

**Resolution:** DECIDED - MCP stays in main `tugtool` crate, controlled by existing `mcp` feature flag.

### [Q02] Test organization strategy (OPEN) {#q02-test-organization}

**Question:** Should integration tests remain in the workspace root or move to individual crates?

**Why it matters:** Integration tests that exercise the full stack need access to all crates. Moving them complicates the test setup.

**Options:**
- Keep all integration tests in `tests/` at workspace root
- Move unit tests to crates, keep integration tests at root
- Each crate has its own `tests/` directory

**Plan to resolve:** Evaluate during Step 2; document chosen approach.

**Resolution:** OPEN - Will decide during implementation.

---

## Risks and Mitigations {#risks}

| Risk | Impact | Likelihood | Mitigation | Trigger to revisit |
|------|--------|------------|------------|--------------------|
| Circular dependencies discovered | high | low | Analyze deps before moving; refactor if found | Build fails during migration |
| Test coverage gaps after migration | med | med | Run coverage before/after; diff reports | Coverage drops >1% |
| Build time regression | low | low | Measure with `--timings`; revert if slower | Build time increases |

**Risk R01: Hidden coupling in current code** {#r01-hidden-coupling}

- **Risk:** Unexpected dependencies between modules may prevent clean separation
- **Mitigation:**
  - Analyze `use` statements before migration
  - Create abstraction traits if coupling found
  - Document any necessary restructuring
- **Residual risk:** Some coupling may require interface changes

---

## 2.0.0 Design Decisions {#design-decisions}

### [D01] Workspace structure with crates/ directory (DECIDED) {#d01-workspace-structure}

**Decision:** Use a `crates/` directory to hold all workspace members.

**Rationale:**
- Clean separation from workspace root files (Cargo.toml, README, etc.)
- Follows common Rust workspace conventions (rustc, cargo, ripgrep)
- Easy to glob for CI/tooling (`crates/*/Cargo.toml`)

**Implications:**
- Main binary moves from `src/` to `crates/tugtool/src/`
- Workspace Cargo.toml at root defines members

### [D02] Core crate contains shared infrastructure (DECIDED) {#d02-core-crate}

**Decision:** `tugtool-core` contains all language-agnostic infrastructure: session, workspace, patch, sandbox, output, error, facts, text, diff, util.

**Rationale:**
- These modules have no language-specific dependencies
- Forms the stable foundation for all language adapters
- Smaller core = faster compilation for language-specific changes

**Implications:**
- Core has no feature flags for languages
- Language crates depend on core, not vice versa
- `facts/` module provides language-agnostic symbol/reference types

### [D03] Language crates are optional dependencies (DECIDED) {#d03-optional-languages}

**Decision:** Each language crate (`tugtool-python`, `tugtool-rust`) is an optional dependency of the main `tugtool` crate, controlled by feature flags.

**Rationale:**
- Users can build minimal binaries
- Clear compilation boundaries
- Each language can have isolated dependencies

**Implications:**
- Default features include all supported languages
- Feature names: `python`, `rust` (short, clear)
- Conditional compilation in CLI dispatch code

### [D04] Main crate composes and re-exports (DECIDED) {#d04-main-crate}

**Decision:** The `tugtool` crate contains: main.rs, cli.rs, mcp.rs, and re-exports from core/language crates.

**Rationale:**
- Single binary entry point
- CLI logic dispatches to language crates based on file types
- MCP server stays here (depends on multiple languages)

**Implications:**
- `tugtool` crate has `[dependencies]` on core and language crates
- Public API surface maintained via `pub use` re-exports
- Version numbers stay synchronized

### [D05] Testcmd module stays in main crate (DECIDED) {#d05-testcmd}

**Decision:** `testcmd.rs` (test command resolution) remains in the main `tugtool` crate.

**Rationale:**
- Test command resolution may need to know about multiple languages
- It's primarily used by CLI and MCP, not by core or language crates

**Implications:**
- testcmd can import from language crates if needed
- Future: may spawn language-specific test runners

### [D06] Synchronized versions across workspace (DECIDED) {#d06-versions}

**Decision:** All crates in the workspace share the same version number, maintained in workspace Cargo.toml.

**Rationale:**
- Simplifies release process
- Clear compatibility guarantees
- Workspace inheritance makes this easy

**Implications:**
- Use `version.workspace = true` in member Cargo.toml files
- Bump all versions together on release

### [D07] Virtual workspace (no root package) - END STATE (DECIDED) {#d07-virtual-workspace}

**Decision:** The **final** root `Cargo.toml` is a **virtual workspace** with no `[package]` section. All crates live in `crates/`.

**CRITICAL: Migration Path**
This is the **end state**, not the starting point. The migration MUST use a **hybrid workspace** approach:

1. **During migration:** Root Cargo.toml has BOTH `[workspace]` AND `[package]` sections. This keeps `src/` compiled and all tests running throughout migration.
2. **After migration complete:** Remove `[package]` section from root, making it a pure virtual workspace.

Converting to virtual workspace **before** migrating code would orphan `src/` and break all tests.

**Rationale:**
- Matches Rust ecosystem conventions (rustc, ripgrep, cargo itself)
- Clean separation between workspace metadata and crate code
- Avoids confusion about "which crate am I building?"

**Implications:**
- `cargo build` from root builds all crates (or default members)
- `cargo install tugtool` works from crates.io (publishes from `crates/tugtool`)
- Local install requires: `cargo install --path crates/tugtool`
- CI scripts must be updated if they assume root is a package

### [D08] Naming and packaging contract (DECIDED) {#d08-naming-contract}

**Decision:** Maintain current naming for compatibility.

| Item | Value |
|------|-------|
| Package name | `tugtool` |
| Binary name | `tug` |
| Library name | `tugtool` |
| crates.io install | `cargo install tugtool` |
| Local dev install | `cargo install --path crates/tugtool` |
| Library usage | `use tugtool::*` (unchanged) |

**Rationale:**
- Users expect `cargo install tugtool` to continue working
- Binary name `tug` is already established
- Library re-exports maintain API compatibility

**Implications:**
- `crates/tugtool/Cargo.toml` publishes as `tugtool` on crates.io
- README and docs must clarify local vs crates.io install paths

### [D09] Feature-gated CLI behavior (DECIDED) {#d09-feature-gated-cli}

**Decision:** When a language feature is not compiled in, language-specific commands fail gracefully with a clear error message.

**Core-only build (`--no-default-features`) must support:**
- `tug --help`, `tug --version`
- `tug snapshot` (file scanning is language-agnostic)
- `tug session status`

**Language commands without the feature:**
```
$ tug run rename-symbol --at foo.py:1:1 --to bar
error: Python support not compiled in

To enable: cargo install tugtool --features python
```
Exit code: 2 (invalid arguments / unsupported operation)

**Rationale:**
- Users get actionable feedback instead of cryptic errors
- Core functionality remains useful for inspection/snapshot workflows
- Clear path to enable missing features

**Implications:**
- CLI dispatch code must check feature availability
- Error messages must include remediation instructions
- Exit code 2 for "feature not available" aligns with existing error codes

### [D10] MCP decoupled from language features (DECIDED) {#d10-mcp-decoupling}

**Decision:** The `mcp` feature is independent of language features. MCP server starts regardless of which languages are compiled in; individual tools check feature availability at runtime.

**Behavior:**
- `tug_snapshot` → always works
- `tug_rename_symbol` → returns error "Python support not compiled" if `!cfg!(feature = "python")`

**Rationale:**
- MCP server is useful even with partial language support
- Allows agents to discover available capabilities
- Simpler feature matrix (no `mcp-python` combo features)

**Implications:**
- MCP tool implementations must have feature guards
- Tool list/schema should indicate which tools are available
- Default features still include both `python` and `mcp`

### [D11] API surface compile-time guard (DECIDED) {#d11-api-surface-guard}

**Decision:** Add `tests/api_surface.rs` that imports all public types, serving as a compile-time contract for the public API. The test must be **feature-aware** to handle conditional re-exports.

**Implementation:**
```rust
//! Compile-only test to verify public API surface.
//! If this file fails to compile, the public API has regressed.
//!
//! Run with: cargo test -p tugtool --features full -- api_surface

use tugtool::{
    // Core types (always available)
    patch::{Span, FileId, Edit, PatchSet, ContentHash, /* ... */},
    facts::{FactsStore, Symbol, SymbolKind, ReferenceKind, /* ... */},
    error::TugError,
    output::{Location, ReferenceInfo, SymbolInfo},
    session::Session,
    workspace::WorkspaceSnapshot,
    // ... exhaustive list of core types
};

// Feature-gated re-exports
#[cfg(feature = "python")]
use tugtool::python;

#[cfg(feature = "rust")]
use tugtool::rust;

#[test]
fn api_surface_compiles() {
    // This test exists only to verify imports compile.
    // If you're here because this test broke, you may have
    // accidentally removed a public re-export.
}
```

**Rationale:**
- Catches accidental API breakage during refactoring
- Low maintenance cost (just a list of imports)
- Fails fast in CI if re-exports are missing
- Feature-aware structure prevents false failures on minimal builds

**Implications:**
- Must be created before migration begins (baseline)
- Must be updated when intentionally adding/removing public types
- Part of phase exit criteria
- **Must be tested with `--features full`** to validate all re-exports

---

## Deep Dives {#deep-dives}

### Current Module Dependency Analysis {#module-deps}

Analysis of `use` statements in the current codebase reveals the following dependency graph:

**Diagram Diag01: Current Module Dependencies** {#diag01-module-deps}

```
                    +-------------+
                    |   main.rs   |
                    +------+------+
                           |
                    +------v------+
                    |   cli.rs    |<------------+
                    +------+------+             |
                           |                    |
         +-----------------+---------------+    |
         |                 |               |    |
    +----v----+      +-----v-----+   +-----v----+--+
    | mcp.rs  |      | python/   |   | session     |
    +----+----+      +-----+-----+   +-----+-------+
         |                 |               |
         |           +-----v-----+   +-----v-------+
         |           | analyzer  |   | workspace   |
         |           |  worker   |   +-----+-------+
         |           |   ops/    |         |
         |           +-----+-----+         |
         |                 |               |
    +----v-----------------v---------------v------+
    |                  CORE LAYER                  |
    |  +--------+  +--------+  +--------+         |
    |  | patch  |  | facts  |  |sandbox |         |
    |  +--------+  +--------+  +--------+         |
    |  +--------+  +--------+  +--------+         |
    |  | output |  | error  |  |  text  |         |
    |  +--------+  +--------+  +--------+         |
    |  +--------+  +--------+                     |
    |  |  diff  |  |  util  |                     |
    |  +--------+  +--------+                     |
    +---------------------------------------------+
```

**Key observations:**

1. `patch.rs` is the foundation - used by facts, sandbox, output, diff, text, python
2. `facts/` depends only on patch (for Span, FileId, ContentHash)
3. `sandbox.rs` depends on patch and workspace
4. `output.rs` depends on patch (for Span) and facts (for SymbolKind)
5. `python/` depends on facts, patch, output, text, session, diff, util
6. `mcp.rs` depends on cli, error, output (and indirectly on python via cli)
7. `session.rs` depends on workspace
8. No circular dependencies detected

### Proposed Crate Boundaries {#crate-boundaries}

**Table T01: Module to Crate Mapping** {#t01-module-mapping}

| Current Module | Target Crate | Rationale |
|---------------|--------------|-----------|
| `patch.rs` | tugtool-core | Foundation types, no deps |
| `facts/mod.rs` | tugtool-core | Language-agnostic symbol model |
| `error.rs` | tugtool-core | Shared error types |
| `output.rs` | tugtool-core | Shared JSON output types |
| `text.rs` | tugtool-core | Text utilities |
| `diff.rs` | tugtool-core | Diff generation |
| `util.rs` | tugtool-core | General utilities |
| `workspace.rs` | tugtool-core | Workspace snapshots |
| `sandbox.rs` | tugtool-core | Sandboxed operations |
| `session.rs` | tugtool-core | Session management |
| `python/` (all) | tugtool-python | Python language support |
| `rust/mod.rs` | tugtool-rust | Rust placeholder |
| `main.rs` | tugtool | Binary entry point |
| `cli.rs` | tugtool | CLI implementation |
| `mcp.rs` | tugtool | MCP server |
| `testcmd.rs` | tugtool | Test command resolution |
| `lib.rs` | tugtool | Re-exports |

### Target Directory Structure {#target-structure}

**List L01: Final Directory Layout** {#l01-directory-layout}

```
tugtool/
+-- Cargo.toml              # workspace root
+-- Cargo.lock
+-- CLAUDE.md
+-- README.md
+-- crates/
|   +-- tugtool/            # main binary crate
|   |   +-- Cargo.toml
|   |   +-- src/
|   |       +-- main.rs     # CLI entry point
|   |       +-- lib.rs      # re-exports for library usage
|   |       +-- cli.rs      # CLI command implementations
|   |       +-- mcp.rs      # MCP server
|   |       +-- testcmd.rs  # test command resolution
|   |
|   +-- tugtool-core/       # shared infrastructure
|   |   +-- Cargo.toml
|   |   +-- src/
|   |       +-- lib.rs      # module exports
|   |       +-- patch.rs    # Patch IR
|   |       +-- error.rs    # TugError
|   |       +-- output.rs   # JSON output types
|   |       +-- session.rs  # Session management
|   |       +-- workspace.rs # Workspace snapshots
|   |       +-- sandbox.rs  # Sandboxed operations
|   |       +-- text.rs     # Text utilities
|   |       +-- diff.rs     # Diff generation
|   |       +-- util.rs     # General utilities
|   |       +-- facts/
|   |           +-- mod.rs  # Symbol/reference model
|   |
|   +-- tugtool-python/     # Python language support
|   |   +-- Cargo.toml
|   |   +-- src/
|   |       +-- lib.rs      # module exports (replaces mod.rs)
|   |       +-- analyzer.rs
|   |       +-- bootstrap.rs
|   |       +-- dynamic.rs
|   |       +-- env.rs      # Python environment resolution
|   |       +-- files.rs
|   |       +-- libcst_worker.py  # Embedded Python worker script
|   |       +-- lookup.rs
|   |       +-- test_helpers.rs
|   |       +-- type_tracker.rs
|   |       +-- validation.rs
|   |       +-- verification.rs
|   |       +-- worker.rs
|   |       +-- ops/
|   |           +-- mod.rs
|   |           +-- rename.rs
|   |
|   +-- tugtool-rust/       # Rust language support (placeholder)
|       +-- Cargo.toml
|       +-- src/
|           +-- lib.rs      # placeholder
|
+-- tests/                  # workspace-level integration tests
|   +-- integration/
+-- .tug/                   # session directory (unchanged)
+-- plans/                  # planning documents
```

### Feature Flag Design {#feature-flags}

**Table T02: Feature Flags** {#t02-feature-flags}

| Feature | Crate | Description | Dependencies |
|---------|-------|-------------|--------------|
| `default` | tugtool | Full build | `python`, `mcp` |
| `python` | tugtool | Python support | tugtool-python |
| `rust` | tugtool | Rust support (future) | tugtool-rust |
| `mcp` | tugtool | MCP server | rmcp, schemars |
| `full` | tugtool | All languages + MCP | `python`, `rust`, `mcp` |

**Spec S01: Feature Flag Usage** {#s01-feature-flags}

```toml
# crates/tugtool/Cargo.toml
[features]
default = ["python", "mcp"]
python = ["dep:tugtool-python"]
rust = ["dep:tugtool-rust"]
mcp = ["dep:rmcp", "dep:schemars"]
full = ["python", "rust", "mcp"]

[dependencies]
tugtool-core = { path = "../tugtool-core" }
tugtool-python = { path = "../tugtool-python", optional = true }
tugtool-rust = { path = "../tugtool-rust", optional = true }

# MCP dependencies (optional) - versions must match current Cargo.toml
rmcp = { version = "...", features = ["server", "transport-io"], optional = true }
schemars = { version = "...", optional = true }
```

**Note:** All dependency versions in this plan are illustrative. During implementation, use the exact versions from the current `Cargo.toml` to avoid version conflicts.

### Dependency Flow {#dependency-flow}

**Diagram Diag02: Crate Dependency Graph** {#diag02-crate-deps}

```
     +---------------------------------------+
     |              tugtool                   |
     |  (main binary, CLI, MCP)              |
     +-------------------+-------------------+
                         |
           +-------------+-------------+
           |             |             |
           v             v             v
+-------------+ +-------------+ +-------------+
|tugtool-python| |tugtool-rust | |  (MCP deps) |
|  (optional) | | (optional)  | | (optional)  |
+------+------+ +------+------+ +-------------+
       |               |
       +-------+-------+
               |
               v
     +-------------------+
     |   tugtool-core    |
     |  (always present) |
     +-------------------+
               |
               v
     +-------------------+
     |  External crates  |
     | (serde, sha2, etc)|
     +-------------------+
```

---

## 2.0.1 Specification {#specification}

### 2.0.1.1 Inputs and Outputs {#inputs-outputs}

**Inputs:**
- Current single-crate tugtool source code
- Existing Cargo.toml configuration

**Outputs:**
- Cargo workspace with 4 member crates
- Updated CLAUDE.md with new structure documentation
- All tests passing

**Key invariants:**
- Public API surface unchanged (same re-exports from `tugtool`)
- CLI behavior identical
- JSON output schemas unchanged

### 2.0.1.2 Terminology {#terminology}

- **Workspace root**: The top-level `tugtool/` directory containing `Cargo.toml`
- **Member crate**: Each crate in `crates/` directory
- **Core crate**: `tugtool-core`, the shared infrastructure
- **Language crate**: `tugtool-python`, `tugtool-rust`, etc.
- **Main crate**: `tugtool`, the binary and re-export crate

### 2.0.1.3 Public API Surface {#public-api}

**Spec S02: Re-exports from tugtool crate** {#s02-reexports}

The main `tugtool` crate must re-export all types currently accessible via `tugtool::*`:

```rust
// crates/tugtool/src/lib.rs

// Re-export core types
pub use tugtool_core::{
    // patch module
    patch::{
        Anchor, AnchorResolution, ApplyContext, ApplyResult, Conflict,
        ContentHash, Edit, EditKind, EditLabels, FileId, MaterializedPatch,
        OutputEdit, PatchSet, Precondition, Span, WorkspaceSnapshotId,
    },
    // facts module
    facts::{
        FactsStore, FileEntry, ImportEntry, ImportId, Language, ModuleEntry,
        ModuleId, ModuleKind, ReferenceEntry, ReferenceId, ReferenceKind,
        ScopeEntry, ScopeId, ScopeKind, Symbol, SymbolId, SymbolKind,
    },
    // other modules
    error::TugError,
    output::{Location, ReferenceInfo, SymbolInfo},
    session::Session,
    workspace::WorkspaceSnapshot,
    sandbox::{SandboxConfig, SandboxHandle, VerificationResult},
    text, diff, util,
};

// Re-export language modules (conditional)
#[cfg(feature = "python")]
pub use tugtool_python as python;

#[cfg(feature = "rust")]
pub use tugtool_rust as rust;

// CLI and MCP are internal (not re-exported)
```

---

## 2.0.2 Symbol Inventory {#symbol-inventory}

### 2.0.2.1 New crates {#new-crates}

| Crate | Purpose |
|-------|---------|
| `tugtool-core` | Shared infrastructure: patch, facts, session, workspace, sandbox, output, error, text, diff, util |
| `tugtool-python` | Python language support: analyzer, worker, ops |
| `tugtool-rust` | Rust language support (placeholder) |

### 2.0.2.2 New files {#new-files}

| File | Purpose |
|------|---------|
| `Cargo.toml` (root) | Workspace definition |
| `crates/tugtool/Cargo.toml` | Main binary crate manifest |
| `crates/tugtool-core/Cargo.toml` | Core crate manifest |
| `crates/tugtool-python/Cargo.toml` | Python crate manifest |
| `crates/tugtool-rust/Cargo.toml` | Rust crate manifest |
| `crates/*/src/lib.rs` | Module root for each crate |

### 2.0.2.3 Moved files {#moved-files}

**Table T03: File Movement Map** {#t03-file-moves}

| Current Location | New Location |
|-----------------|--------------|
| `src/patch.rs` | `crates/tugtool-core/src/patch.rs` |
| `src/facts/mod.rs` | `crates/tugtool-core/src/facts/mod.rs` |
| `src/error.rs` | `crates/tugtool-core/src/error.rs` |
| `src/output.rs` | `crates/tugtool-core/src/output.rs` |
| `src/session.rs` | `crates/tugtool-core/src/session.rs` |
| `src/workspace.rs` | `crates/tugtool-core/src/workspace.rs` |
| `src/sandbox.rs` | `crates/tugtool-core/src/sandbox.rs` |
| `src/text.rs` | `crates/tugtool-core/src/text.rs` |
| `src/diff.rs` | `crates/tugtool-core/src/diff.rs` |
| `src/util.rs` | `crates/tugtool-core/src/util.rs` |
| `src/python/*` | `crates/tugtool-python/src/*` |
| `src/rust/mod.rs` | `crates/tugtool-rust/src/lib.rs` |
| `src/main.rs` | `crates/tugtool/src/main.rs` |
| `src/cli.rs` | `crates/tugtool/src/cli.rs` |
| `src/mcp.rs` | `crates/tugtool/src/mcp.rs` |
| `src/testcmd.rs` | `crates/tugtool/src/testcmd.rs` |
| `src/lib.rs` | `crates/tugtool/src/lib.rs` |

---

## 2.0.3 Documentation Plan {#documentation-plan}

- [ ] Update CLAUDE.md with new directory structure
- [ ] Add workspace-level README explaining crate organization
- [ ] Document feature flags in main crate README
- [ ] Add inline documentation to each crate's lib.rs

---

## 2.0.4 Test Plan Concepts {#test-plan-concepts}

### Test Categories {#test-categories}

| Category | Purpose | When to use |
|----------|---------|-------------|
| **Unit** | Test individual functions in isolation | Each crate's internal logic |
| **Integration** | Test crates working together | Full rename/analyze flows |
| **Golden** | Compare output against snapshots | JSON schemas, patch output |

### Test Migration Strategy {#test-migration}

1. **Unit tests**: Move with their modules (embedded `#[cfg(test)]` modules stay in place)
2. **Integration tests**: Keep in workspace root `tests/` directory
3. **Golden tests**: Remain in current location, update paths as needed

### Verification Commands {#test-verification}

```bash
# Run all tests (from workspace root) - ALWAYS use --workspace during migration!
cargo nextest run --workspace

# Run only core tests
cargo nextest run -p tugtool-core

# Run only Python tests
cargo nextest run -p tugtool-python

# Run with specific features
cargo nextest run --workspace --no-default-features --features python
```

---

## 2.0.5 Execution Steps {#execution-steps}

### Step 0: Preparation and Baseline {#step-0}

**Commit:** `chore: establish baseline metrics and API surface test before workspace migration`

**References:** [D01] Workspace structure, [D11] API surface guard, (#strategy, #success-criteria)

**Artifacts:**
- Baseline test count and coverage
- Baseline build times via `cargo build --timings`
- Verification that all tests pass
- `tests/api_surface.rs` - compile-time API contract

**Tasks:**
- [x] Run `cargo nextest run --workspace` and record pass/fail counts
- [x] Run `cargo build --timings` and save HTML report
- [x] Run `cargo clippy` and fix any warnings
- [x] Create `tests/api_surface.rs` with imports of all current public types (see [D11])
- [x] Ensure clean git status

**API surface test template:**
```rust
//! Compile-only test to verify public API surface.
//! Run with: cargo test -p tugtool --features full -- api_surface

use tugtool::{
    // Core types (always available)
    patch::{Span, FileId, Edit, PatchSet, ContentHash, OutputEdit, /* ... */},
    facts::{FactsStore, Symbol, SymbolKind, ReferenceKind, /* ... */},
    error::TugError,
    // ... exhaustive list of core types
};

// Feature-gated re-exports
#[cfg(feature = "python")]
use tugtool::python;

#[test]
fn api_surface_compiles() {
    // Intentionally empty - this test verifies imports compile
}
```

**Tests:**
- [x] All existing tests pass
- [x] `tests/api_surface.rs` compiles with `--features full`

**Checkpoint:**
- [x] `cargo nextest run --workspace` - all tests pass
- [x] `cargo clippy -- -D warnings` - no warnings
- [x] `cargo fmt --check` - no formatting issues
- [x] `tests/api_surface.rs` exists and compiles with `cargo test -p tugtool --features full -- api_surface`

**Rollback:** N/A (no changes yet)

**Commit after all checkpoints pass.**

---

### Step 1: Create Hybrid Workspace Structure {#step-1}

**Commit:** `refactor: create cargo workspace structure with crates directory`

**References:** [D01] Workspace structure, [D07] Virtual workspace (end state), Table T01, List L01, (#target-structure)

**CRITICAL: Hybrid Workspace Approach**

This step creates a **hybrid workspace** where the root is BOTH a workspace AND a package.
This keeps the existing `src/` code compiling and all 639 tests running throughout migration.

**DO NOT** convert to a virtual workspace (removing `[package]`) until Step 6 after all code is migrated.

**Artifacts:**
- `crates/` directory with empty crate skeletons
- **Hybrid** Workspace Cargo.toml at root (has BOTH `[workspace]` AND `[package]` sections)
- Each crate has minimal Cargo.toml and empty lib.rs
- Existing `src/` code continues to compile and run tests

**Tasks:**
- [x] Create `crates/` directory
- [x] Create `crates/tugtool/` with minimal Cargo.toml (empty, for future main crate)
- [x] Create `crates/tugtool-core/` with minimal Cargo.toml
- [x] Create `crates/tugtool-python/` with minimal Cargo.toml
- [x] Create `crates/tugtool-rust/` with minimal Cargo.toml
- [x] Add `[workspace]` section to root Cargo.toml **WHILE KEEPING THE EXISTING `[package]` SECTION**
- [x] Add workspace-level settings (resolver, lints, profile)

**Cargo.toml structure (HYBRID - note both [workspace] AND [package]):**

```toml
# Root Cargo.toml - HYBRID WORKSPACE
# Has both [workspace] and [package] so src/ keeps compiling

[workspace]
resolver = "2"
members = [
    "crates/tugtool-core",
    "crates/tugtool-python",
    "crates/tugtool-rust",
    # NOTE: Do NOT include "crates/tugtool" yet - root IS the tugtool package during migration
]

[workspace.package]
version = "0.1.0"
edition = "2021"
authors = ["Ken Kocienda"]
license = "MIT"
repository = "https://github.com/tugtool/tugtool"

[workspace.lints.rust]
warnings = "deny"

[workspace.lints.clippy]
all = { level = "deny", priority = -1 }
collapsible_if = "allow"

# KEEP THE EXISTING [package] SECTION - this is what makes src/ compile!
[package]
name = "tugtool"
version.workspace = true
edition.workspace = true
authors.workspace = true
license.workspace = true
# ... keep all existing package configuration ...

# KEEP THE EXISTING [dependencies] - required for src/ to compile
[dependencies]
# ... all existing dependencies stay here ...

# KEEP THE EXISTING [[bin]], [features], etc.

[profile.release]
lto = "thin"
debug = "line-tables-only"

[profile.dev]
debug = 1
incremental = true
codegen-units = 256
lto = false
panic = "unwind"

[profile.test]
debug = 1
incremental = true
lto = false
```

**Tests:**
- [x] `cargo nextest run --workspace` - ALL 639 TESTS STILL PASS (critical!)
- [x] `cargo check -p tugtool-core` succeeds (empty crate compiles)

**Checkpoint:**
- [x] `cargo nextest run --workspace` - **all existing tests pass** (this is the critical checkpoint!)
- [x] `cargo clippy -- -D warnings` - no warnings
- [x] All four crate directories exist with Cargo.toml and src/lib.rs
- [x] Root Cargo.toml has both `[workspace]` AND `[package]` sections

**Rollback:**
- Remove `crates/` directory
- Restore original Cargo.toml from git

**Commit after all checkpoints pass.**

---

> **WARNING: Common Mistake**
>
> Do NOT remove the `[package]` section from root Cargo.toml during this step!
> Doing so creates a "virtual workspace" which orphans `src/` and breaks all tests.
> The conversion to virtual workspace happens in Step 6 AFTER all code is migrated.

---

### Step 2: Extract tugtool-core {#step-2}

This step is large and broken into substeps.

**CRITICAL: Two-Phase Migration Per Module**

For each module migration, you must:

1. **Copy** the module to the target crate
2. **Wire up imports** in the source crate to use the new location
3. **Verify tests pass** before proceeding

The root package (`src/lib.rs`) must be updated to re-export from `tugtool-core` so that:
- External code using `tugtool::patch::*` continues to work
- Internal code in `src/` can gradually migrate to `use tugtool_core::*`

After Step 2 completes:
- `tugtool-core` contains the migrated modules
- Root `src/lib.rs` re-exports from `tugtool-core`
- Original files in `src/` may be deleted OR kept as thin re-export wrappers (decide per substep)
- All tests continue to pass

#### Step 2.1: Move patch.rs and text.rs to tugtool-core {#step-2-1}

**Commit:** `refactor(core): move patch and text modules to tugtool-core`

**References:** [D02] Core crate, Table T03, Diagram Diag01, (#module-deps)

> **Why these modules move together:**
>
> `patch.rs` and `text.rs` have a mutual dependency that requires them to migrate as a unit:
> - `patch.rs` imports `crate::text::byte_offset_to_position` (used in `materialize()`)
> - `text.rs` imports `crate::patch::Span` (used in span utilities)
>
> Moving them separately would create either a broken build or messy inter-crate dependencies.
> By moving both together, the `crate::` imports resolve correctly within `tugtool-core`.

**Artifacts:**
- `crates/tugtool-core/src/patch.rs` with full implementation
- `crates/tugtool-core/src/text.rs` with full implementation
- Updated `crates/tugtool-core/Cargo.toml` with required dependencies
- Updated `crates/tugtool-core/src/lib.rs` with module exports
- Updated root `Cargo.toml` with `tugtool-core` dependency
- Updated root `src/lib.rs` to re-export from tugtool-core

**Tasks:**
- [x] Add `tugtool-core` as a dependency in root `Cargo.toml`:
      ```toml
      [dependencies]
      tugtool-core = { path = "crates/tugtool-core" }
      ```
- [x] Add dependencies to `crates/tugtool-core/Cargo.toml`: `serde`, `sha2`, `hex`
- [x] Copy `src/patch.rs` to `crates/tugtool-core/src/patch.rs`
- [x] Copy `src/text.rs` to `crates/tugtool-core/src/text.rs`
- [x] Update `crates/tugtool-core/src/lib.rs`:
      ```rust
      pub mod patch;
      pub mod text;
      ```
- [x] Verify `crate::` imports in both files resolve correctly (no changes needed - they now refer to tugtool-core)
- [x] Update root `src/lib.rs` to re-export:
      ```rust
      pub use tugtool_core::patch;
      pub use tugtool_core::text;
      ```
- [x] Delete `src/patch.rs` and `src/text.rs`
- [x] Verify BOTH core crate AND root package compile
- [x] Verify all tests pass

**Dependencies for tugtool-core/Cargo.toml:**
```toml
[dependencies]
serde = { version = "1.0", features = ["derive"] }
sha2 = "0.10"
hex = "0.4"

[dev-dependencies]
serde_json = "1.0"
```

**Tests:**
- [x] `cargo check -p tugtool-core` succeeds
- [x] `cargo nextest run --workspace` - all 639 tests pass (critical! use --workspace to include crate tests)

**Checkpoint:**
- [x] `cargo check -p tugtool-core` compiles without errors
- [x] `cargo nextest run --workspace` - **all 639 tests still pass** (use --workspace!)
- [x] `use tugtool::patch::Span` still works (API compatibility)
- [x] `use tugtool::text::byte_offset_to_position` still works (API compatibility)

**Rollback:**
- `git checkout -- crates/tugtool-core/ src/patch.rs src/text.rs src/lib.rs Cargo.toml`

**Commit after all checkpoints pass.**

---

#### Step 2.2: Move util.rs, diff.rs to tugtool-core {#step-2-2}

**Commit:** `refactor(core): move util and diff modules to tugtool-core`

**References:** [D02] Core crate, Table T03

**Artifacts:**
- `crates/tugtool-core/src/util.rs`
- `crates/tugtool-core/src/diff.rs`
- Updated root `src/lib.rs` re-exports

**Tasks:**
- [x] Copy `src/util.rs` to `crates/tugtool-core/src/util.rs`
- [x] Copy `src/diff.rs` to `crates/tugtool-core/src/diff.rs`
- [x] Add `pub mod util; pub mod diff;` to core lib.rs
- [x] Update diff.rs imports to use `crate::patch::OutputEdit`
- [x] Update root `src/lib.rs` to re-export: `pub use tugtool_core::{util, diff};`
- [x] Delete or convert `src/util.rs` and `src/diff.rs` to re-export wrappers
- [x] Verify BOTH crates compile and all tests pass

**Tests:**
- [x] `cargo check -p tugtool-core`
- [x] `cargo nextest run --workspace` - all 639 tests pass

**Checkpoint:**
- [x] Core crate compiles
- [x] `cargo nextest run --workspace` - **all 639 tests still pass**

**Rollback:**
- `git checkout -- crates/tugtool-core/ src/util.rs src/diff.rs src/lib.rs`

**Commit after all checkpoints pass.**

---

#### Step 2.3: Move facts/ to tugtool-core {#step-2-3}

**Commit:** `refactor(core): move facts module to tugtool-core`

**References:** [D02] Core crate, Table T03, Diagram Diag01

**Artifacts:**
- `crates/tugtool-core/src/facts/mod.rs`
- Updated core lib.rs
- Updated root `src/lib.rs` re-exports

**Tasks:**
- [x] Copy `src/facts/mod.rs` to `crates/tugtool-core/src/facts/mod.rs`
- [x] Add `pub mod facts;` to core lib.rs
- [x] Update imports in core: `use crate::patch::{ContentHash, FileId, Span}`
- [x] Update root `src/lib.rs` to re-export: `pub use tugtool_core::facts;`
- [x] Delete or convert `src/facts/` to re-export wrapper
- [x] Verify BOTH crates compile and all tests pass

**Tests:**
- [x] `cargo check -p tugtool-core`
- [x] `cargo nextest run --workspace` - all 639 tests pass

**Checkpoint:**
- [x] Core crate compiles
- [x] `cargo nextest run --workspace` - **all 639 tests still pass**

**Rollback:**
- `git checkout -- crates/tugtool-core/ src/facts/ src/lib.rs`

**Commit after all checkpoints pass.**

---

#### Step 2.4: Move error.rs and output.rs to tugtool-core {#step-2-4}

**Commit:** `refactor(core): move error and output modules to tugtool-core`

**References:** [D02] Core crate, Table T03

**Artifacts:**
- `crates/tugtool-core/src/error.rs`
- `crates/tugtool-core/src/output.rs`
- Updated root `src/lib.rs` re-exports

**Tasks:**
- [x] Copy `src/error.rs` to `crates/tugtool-core/src/error.rs`
- [x] Copy `src/output.rs` to `crates/tugtool-core/src/output.rs`
- [x] Add `pub mod error; pub mod output;` to core lib.rs
- [x] Add `thiserror` to core dependencies
- [x] Update output.rs imports for patch and facts (use `crate::` for core-internal refs)
- [x] Update root `src/lib.rs` to re-export: `pub use tugtool_core::{error, output};`
- [x] Delete or convert `src/error.rs` and `src/output.rs` to re-export wrappers
- [x] Verify BOTH crates compile and all tests pass

**Note:** During implementation, we created a new `types.rs` module in tugtool-core to hold shared types (`Location`, `SymbolInfo`) used by both error and output modules, avoiding circular dependencies. The `error_bridges.rs` module was created in the root crate to hold Python-specific error conversions (`From<RenameError>`, `From<WorkerError>`, `From<SessionError>`) that depend on language-specific types.

**Core dependencies update:**
```toml
thiserror = "2.0"
```

**Tests:**
- [x] `cargo check -p tugtool-core`
- [x] `cargo nextest run --workspace` - all 647 tests pass

**Checkpoint:**
- [x] Core crate compiles
- [x] `cargo nextest run --workspace` - **all 647 tests pass**

**Rollback:**
- `git checkout -- crates/tugtool-core/ src/error.rs src/output.rs src/lib.rs`

**Commit after all checkpoints pass.**

---

#### Step 2.5: Move workspace.rs and session.rs to tugtool-core {#step-2-5}

**Commit:** `refactor(core): move workspace and session modules to tugtool-core`

**References:** [D02] Core crate, Table T03

**Artifacts:**
- `crates/tugtool-core/src/workspace.rs`
- `crates/tugtool-core/src/session.rs`
- Updated root `src/lib.rs` re-exports

**Tasks:**
- [x] Copy `src/workspace.rs` to `crates/tugtool-core/src/workspace.rs`
- [x] Copy `src/session.rs` to `crates/tugtool-core/src/session.rs`
- [x] Add `pub mod workspace; pub mod session;` to core lib.rs
- [x] Add dependencies: `walkdir`, `chrono`
- [x] Update imports for workspace and session modules (use `crate::` for core-internal refs)
- [x] Update root `src/lib.rs` to re-export: `pub use tugtool_core::{workspace, session};`
- [x] Delete or convert `src/workspace.rs` and `src/session.rs` to re-export wrappers
- [x] Verify BOTH crates compile and all tests pass

**Note:** During implementation, we also:
- Added `libc` as a unix-only dependency for process checking functions
- Added `tempfile` as a dev-dependency for tests
- Moved `impl From<SessionError> for TugError` from `src/error_bridges.rs` to `crates/tugtool-core/src/error.rs` since both types are now in the same crate
- Removed SessionError bridge tests from `error_bridges.rs`

**Core dependencies update:**
```toml
chrono = { version = "0.4", default-features = false, features = ["std"] }
walkdir = "2"

[target.'cfg(unix)'.dependencies]
libc = "0.2"

[dev-dependencies]
tempfile = "3"
```

**Tests:**
- [x] `cargo check -p tugtool-core`
- [x] `cargo nextest run --workspace` - all 643 tests pass

**Checkpoint:**
- [x] Core crate compiles
- [x] `cargo nextest run --workspace` - **all 643 tests still pass**

**Rollback:**
- `git checkout -- crates/tugtool-core/ src/workspace.rs src/session.rs src/lib.rs`

**Commit after all checkpoints pass.**

---

#### Step 2.6: Move sandbox.rs to tugtool-core {#step-2-6}

**Commit:** `refactor(core): move sandbox module to tugtool-core`

**References:** [D02] Core crate, Table T03

**Artifacts:**
- `crates/tugtool-core/src/sandbox.rs`
- Updated root `src/lib.rs` re-exports

**Tasks:**
- [x] Copy `src/sandbox.rs` to `crates/tugtool-core/src/sandbox.rs`
- [x] Add `pub mod sandbox;` to core lib.rs
- [x] Add dependencies: `tempfile`, `tracing`, `wait-timeout`
- [x] Add target-specific dependency: `libc` (unix) - already added in Step 2.5
- [x] Update imports for sandbox module (use `crate::` for core-internal refs) - no changes needed, already uses crate::patch and crate::workspace
- [x] Update root `src/lib.rs` to re-export: `pub use tugtool_core::sandbox;`
- [x] Delete or convert `src/sandbox.rs` to re-export wrapper
- [x] Verify BOTH crates compile and all tests pass

**Core dependencies update:**
```toml
tempfile = "3"
tracing = "0.1"
wait-timeout = "0.2"

[target.'cfg(unix)'.dependencies]
libc = "0.2"
```

**Tests:**
- [x] `cargo check -p tugtool-core`
- [x] `cargo nextest run --workspace` - all 643 tests pass

**Checkpoint:**
- [x] Core crate compiles
- [x] `cargo nextest run --workspace` - **all 643 tests still pass** (including sandbox tests)

**Note:** During implementation, we also:
- Fixed several rustdoc warnings that were treated as errors due to `-D warnings`:
  - Escaped `[D05]` references in doc comments with `\[D05\]`
  - Escaped `List[int]` in type documentation with `List\[int\]`
  - Wrapped `<id>` in backticks to prevent HTML tag interpretation
  - Wrapped `<name>` in backticks to prevent HTML tag interpretation

**Rollback:**
- `git checkout -- crates/tugtool-core/ src/sandbox.rs src/lib.rs`

**Commit after all checkpoints pass.**

---

#### Step 2 Summary {#step-2-summary}

After completing Steps 2.1-2.6, you will have:
- Complete `tugtool-core` crate with all shared infrastructure
- All core modules migrated: patch, facts, error, output, session, workspace, sandbox, text, diff, util
- Root `src/lib.rs` re-exports everything from `tugtool-core`
- Original module files in `src/` either deleted or converted to re-export wrappers
- **All 639 tests still passing** (critical!)
- Clean dependency boundaries

**Final Step 2 Checkpoint:**
- [x] `cargo nextest run --workspace` - **all 643 tests pass** (not just core tests!)
- [x] `cargo test -p tugtool-core` - core tests pass independently
- [x] `cargo clippy -p tugtool-core -- -D warnings` - no warnings
- [x] `cargo clippy -- -D warnings` - no warnings on root package
- [x] Core crate can be used as dependency (verify with `cargo doc -p tugtool-core`)
- [x] `tests/api_surface.rs` still compiles (API contract preserved)

---

### Step 3: Extract tugtool-python {#step-3}

#### Step 3.1: Create tugtool-python crate skeleton {#step-3-1}

**Commit:** `refactor(python): create tugtool-python crate with dependency on core`

**References:** [D03] Optional languages, Table T01, (#crate-boundaries)

**Artifacts:**
- `crates/tugtool-python/Cargo.toml` with core dependency
- Basic lib.rs structure

**Tasks:**
- [x] Configure Cargo.toml with tugtool-core dependency
- [x] Set up lib.rs module structure matching python/ layout
- [x] Verify crate compiles (empty modules)

**Python crate Cargo.toml:**
```toml
[package]
name = "tugtool-python"
version.workspace = true
edition.workspace = true
authors.workspace = true
license.workspace = true

[dependencies]
tugtool-core = { path = "../tugtool-core" }
serde = { version = "1.0", features = ["derive"] }
serde_json = "1.0"
thiserror = "2.0"
tempfile = "3"
tracing = "0.1"

[lints]
workspace = true
```

**Checkpoint:**
- [x] `cargo check -p tugtool-python` succeeds

**Rollback:**
- `git checkout -- crates/tugtool-python/`

**Commit after all checkpoints pass.**

---

#### Step 3.2: Move Python modules to tugtool-python {#step-3-2}

**Commit:** `refactor(python): move all python modules to tugtool-python crate`

**References:** [D03] Optional languages, Table T03

**Artifacts:**
- All files from `src/python/` moved to `crates/tugtool-python/src/`
- Updated imports throughout
- Root `src/lib.rs` updated to re-export from `tugtool-python`
- Root `Cargo.toml` updated with `tugtool-python` dependency

**Tasks:**
- [ ] Add `tugtool-python` as a dependency in root `Cargo.toml`:
      ```toml
      [dependencies]
      tugtool-python = { path = "crates/tugtool-python" }
      ```
- [ ] Copy all files from `src/python/` to `crates/tugtool-python/src/`
- [ ] Update lib.rs in tugtool-python to export all public items
- [ ] Update imports: `use crate::` -> `use tugtool_core::`
- [ ] Fix any module path references
- [ ] Update root `src/lib.rs` to re-export: `pub use tugtool_python as python;`
- [ ] Delete `src/python/` directory (or convert mod.rs to re-export wrapper)
- [ ] Verify BOTH crates compile and all tests pass

**Import pattern changes:**
```rust
// Before (in python/ops/rename.rs)
use crate::facts::{FactsStore, ReferenceKind};
use crate::patch::{FileId, Span};

// After
use tugtool_core::facts::{FactsStore, ReferenceKind};
use tugtool_core::patch::{FileId, Span};
```

**Tests:**
- [ ] `cargo check -p tugtool-python`
- [ ] `cargo nextest run --workspace` - **all tests pass** (not just Python crate tests!)

**Checkpoint:**
- [ ] Python crate compiles
- [ ] `cargo nextest run --workspace` - **all tests still pass**
- [ ] `use tugtool::python::*` still works (API compatibility)

**Rollback:**
- `git checkout -- crates/tugtool-python/ src/python/ src/lib.rs Cargo.toml`

**Commit after all checkpoints pass.**

---

### Step 4: Create tugtool-rust placeholder {#step-4}

**Commit:** `refactor(rust): create tugtool-rust placeholder crate`

**References:** [D03] Optional languages, Table T01

**Artifacts:**
- `crates/tugtool-rust/Cargo.toml`
- `crates/tugtool-rust/src/lib.rs` with placeholder

**Tasks:**
- [x] Configure Cargo.toml with tugtool-core dependency
- [x] Create lib.rs with placeholder comment
- [x] Move `src/rust/mod.rs` content (if any) to lib.rs
- [x] Verify crate compiles

**Rust crate lib.rs:**
```rust
//! Rust language support for tugtool.
//!
//! This crate provides Rust-specific refactoring operations using rust-analyzer.
//!
//! **Status:** Placeholder - implementation planned for future phases.

use tugtool_core as _core;

/// Placeholder for Rust analyzer adapter.
pub struct RustAdapter;

impl RustAdapter {
    /// Create a new Rust adapter (placeholder).
    pub fn new() -> Self {
        RustAdapter
    }
}

impl Default for RustAdapter {
    fn default() -> Self {
        Self::new()
    }
}
```

**Checkpoint:**
- [x] `cargo check -p tugtool-rust` succeeds

**Rollback:**
- `git checkout -- crates/tugtool-rust/`

**Commit after all checkpoints pass.**

---

### Step 5: Refactor main tugtool crate {#step-5}

**CRITICAL: Transitioning the Binary**

This step moves CLI/MCP code to `crates/tugtool/`. At this point:
- Core infrastructure is in `tugtool-core`
- Python support is in `tugtool-python`
- The root still has `src/main.rs`, `src/cli.rs`, etc.

After this step:
- `crates/tugtool/` becomes the main binary crate
- Root `src/` only has re-export lib.rs (will be removed in Step 6)
- All tests still pass

**Important:** During this step, we temporarily have TWO places that can build the `tug` binary (root and `crates/tugtool`). This is resolved in Step 6 when we convert to virtual workspace.

#### Step 5.1: Move CLI files to main crate {#step-5-1}

**Commit:** `refactor: move main, cli, mcp, testcmd to tugtool crate`

**References:** [D04] Main crate, [D05] Testcmd, Table T03

**Artifacts:**
- `crates/tugtool/src/main.rs`
- `crates/tugtool/src/cli.rs`
- `crates/tugtool/src/mcp.rs`
- `crates/tugtool/src/testcmd.rs`
- Updated `crates/tugtool/Cargo.toml` with all dependencies

**Tasks:**
- [ ] Copy `src/main.rs` to `crates/tugtool/src/main.rs`
- [ ] Copy `src/cli.rs` to `crates/tugtool/src/cli.rs`
- [ ] Copy `src/mcp.rs` to `crates/tugtool/src/mcp.rs`
- [ ] Copy `src/testcmd.rs` to `crates/tugtool/src/testcmd.rs`
- [ ] Update `crates/tugtool/Cargo.toml` with dependencies and features (see below)
- [ ] Update imports in all moved files to use `tugtool_core::` and `tugtool_python::`
- [ ] Verify `crates/tugtool` compiles independently
- [ ] Verify root package still compiles (tests still run against root)

**Main crate Cargo.toml:**
```toml
[package]
name = "tugtool"
version.workspace = true
edition.workspace = true
authors.workspace = true
license.workspace = true
description = "AI-native code transformation engine for verified, deterministic refactors"
repository.workspace = true
readme = "../../README.md"
keywords = ["refactoring", "code-transformation", "ai", "mcp", "cli"]
categories = ["development-tools", "command-line-utilities"]

[[bin]]
name = "tug"
path = "src/main.rs"

[lib]
name = "tugtool"
path = "src/lib.rs"

[dependencies]
tugtool-core = { path = "../tugtool-core" }
tugtool-python = { path = "../tugtool-python", optional = true }
tugtool-rust = { path = "../tugtool-rust", optional = true }

# CLI
clap = { version = "4", features = ["derive"] }
serde_json = "1.0"
tokio = { version = "1", features = ["full"] }
tracing = "0.1"
tracing-subscriber = { version = "0.3", features = ["env-filter", "json"] }

# MCP (optional)
rmcp = { version = "0.12", features = ["server", "transport-io"], optional = true }
schemars = { version = "1", optional = true }

[features]
default = ["python", "mcp"]
python = ["dep:tugtool-python"]
rust = ["dep:tugtool-rust"]
mcp = ["dep:rmcp", "dep:schemars"]
full = ["python", "rust", "mcp"]

[lints]
workspace = true
```

**Tests:**
- [ ] `cargo check -p tugtool` (the crates/tugtool package)
- [ ] `cargo nextest run --workspace` - all tests still pass (against root package)

**Checkpoint:**
- [ ] `crates/tugtool` compiles: `cargo build -p tugtool`
- [ ] Root package still works: `cargo nextest run --workspace` - **all tests pass**
- [ ] Binary works from new location: `cargo run -p tugtool -- --help`

**Rollback:**
- `git checkout -- crates/tugtool/`

**Commit after all checkpoints pass.**

---

#### Step 5.2: Create lib.rs with re-exports {#step-5-2}

**Commit:** `refactor: add re-exports to tugtool lib.rs for API compatibility`

**References:** [D04] Main crate, Spec S02, (#public-api)

**Artifacts:**
- `crates/tugtool/src/lib.rs` with all re-exports

**Tasks:**
- [x] Create `crates/tugtool/src/lib.rs` with public re-exports from core
- [x] Add conditional re-exports for language crates
- [x] Add re-exports for cli, mcp, testcmd modules
- [x] Verify all previously-public types are accessible via `tugtool::*`
- [x] Update main.rs to use new module paths

**Checkpoint:**
- [x] `cargo check -p tugtool` (the crates/tugtool package)
- [x] `cargo doc -p tugtool` - documentation builds
- [x] `cargo nextest run --workspace` - **all tests still pass**

**Rollback:**
- `git checkout -- crates/tugtool/src/lib.rs`

**Commit after all checkpoints pass.**

---

#### Step 5.3: Update CLI imports and conditional compilation {#step-5-3}

**Commit:** `refactor: update CLI with conditional language support`

**References:** [D03] Optional languages, Table T02, Spec S01

**Artifacts:**
- Updated `cli.rs` with feature-gated language dispatch
- Updated `mcp.rs` with feature-gated tools

**Tasks:**
- [x] Add `#[cfg(feature = "python")]` guards to Python-specific CLI code
- [x] Add `#[cfg(feature = "rust")]` guards to Rust-specific CLI code
- [x] Update MCP tool registration with feature guards
- [x] Verify build with default features
- [x] Verify build with `--no-default-features`

**Conditional compilation pattern:**
```rust
// In cli.rs
#[cfg(feature = "python")]
use tugtool_python::ops::rename::PythonRenameOp;

pub fn run_rename(args: &RenameArgs) -> Result<(), TugError> {
    match args.language {
        #[cfg(feature = "python")]
        Language::Python => {
            // Python rename logic
        }
        #[cfg(feature = "rust")]
        Language::Rust => {
            // Rust rename logic (placeholder)
        }
        _ => {
            return Err(TugError::unsupported_language(args.language));
        }
    }
}
```

**Tests:**
- [x] `cargo build -p tugtool` (default features)
- [x] `cargo build -p tugtool --no-default-features`
- [x] `cargo build -p tugtool --features python`
- [x] `cargo build -p tugtool --features mcp` (MCP without Python - verifies no accidental Python imports)
- [x] `cargo build -p tugtool --features full`

**Checkpoint:**
- [x] All feature combinations compile (including `--features mcp` alone)
- [x] `cargo run -p tugtool -- --help` works
- [x] MCP-only build has no Python dependencies (verify with `--features mcp` compile)
  - Violation: any `use tugtool_python::` or dependency edge to `tugtool-python` without `#[cfg(feature = "python")]` guard

**Rollback:**
- `git checkout -- crates/tugtool/src/`

**Commit after all checkpoints pass.**

---

### Step 6: Clean up and finalize {#step-6}

**CRITICAL: This step converts from hybrid to virtual workspace**

At this point:
- All code has been migrated to `crates/`
- `crates/tugtool/` is the new main binary crate with all CLI/MCP code
- Root `src/` is no longer needed
- We can now safely convert to a virtual workspace

#### Step 6.1: Convert to virtual workspace and remove old src/ {#step-6-1}

**Commit:** `refactor: convert to virtual workspace, remove old src/`

**References:** [D07] Virtual workspace, Table T03, (#success-criteria)

**Artifacts:**
- Virtual workspace Cargo.toml (no `[package]` section)
- Old `src/` directory removed
- `crates/tugtool` added to workspace members

**Tasks:**
- [x] Add `"crates/tugtool"` to workspace members list
- [x] Remove `[package]` section from root Cargo.toml
- [x] Remove `[dependencies]` section from root Cargo.toml (dependencies are now in crates)
- [x] Remove `[[bin]]`, `[lib]`, `[features]` sections from root Cargo.toml
- [x] Delete `src/` directory entirely
- [x] Update `tests/` directory to use `crates/tugtool` as the test target (may need to move to `crates/tugtool/tests/`)
- [x] Update any hardcoded paths in tests

**Final root Cargo.toml (virtual workspace):**
```toml
[workspace]
resolver = "2"
members = [
    "crates/tugtool",        # NOW INCLUDED
    "crates/tugtool-core",
    "crates/tugtool-python",
    "crates/tugtool-rust",
]

[workspace.package]
version = "0.1.0"
edition = "2021"
authors = ["Ken Kocienda"]
license = "MIT"
repository = "https://github.com/tugtool/tugtool"

[workspace.lints.rust]
warnings = "deny"

[workspace.lints.clippy]
all = { level = "deny", priority = -1 }
collapsible_if = "allow"

# NO [package] section - this is now a virtual workspace
# NO [dependencies] section - dependencies are in individual crates

[profile.release]
lto = "thin"
debug = "line-tables-only"

[profile.dev]
debug = 1
incremental = true
codegen-units = 256
lto = false
panic = "unwind"

[profile.test]
debug = 1
incremental = true
lto = false
```

**Checkpoint:**
- [x] `cargo build` succeeds from workspace root
- [x] `cargo nextest run --workspace` - all tests pass
- [x] `src/` directory no longer exists
- [x] Root Cargo.toml has NO `[package]` section

**Rollback:**
- `git checkout HEAD~1 -- src/ Cargo.toml`

**Commit after all checkpoints pass.**

---

#### Step 6.2: Update documentation and CI {#step-6-2}

**Commit:** `docs: update CLAUDE.md, README, and CI for workspace structure`

**References:** (#documentation-plan), [D07] Virtual workspace

**Artifacts:**
- Updated CLAUDE.md with new architecture section
- Updated README.md if needed
- Updated CI workflows for workspace commands
- Updated Justfile (if present)

**Tasks:**
- [x] Update CLAUDE.md Architecture section with new structure
- [x] Update build commands to reference workspace
- [x] Document feature flags
- [x] Update any path references
- [x] Update `.github/workflows/*.yml` to use `-p tugtool` or `--workspace` as appropriate
- [x] Update `Justfile` commands (if present) for workspace structure
- [x] Verify `cargo install --path crates/tugtool` works (document in README)

**CLAUDE.md updates:**
```markdown
## Architecture

tugtool is organized as a Cargo workspace with the following crates:

crates/
+-- tugtool/        # Main binary and CLI
+-- tugtool-core/   # Shared infrastructure
+-- tugtool-python/ # Python language support
+-- tugtool-rust/   # Rust language support (planned)

### Build Commands

# Build all crates
cargo build

# Build specific crate
cargo build -p tugtool-core

# Build with specific features
cargo build --no-default-features --features python
```

**Checkpoint:**
- [x] CLAUDE.md reflects new structure
- [x] `cargo doc --workspace` succeeds

**Rollback:**
- `git checkout -- CLAUDE.md README.md`

**Commit after all checkpoints pass.**

---

#### Step 6.3: Verify full test suite and metrics {#step-6-3}

**Commit:** `test: verify workspace migration maintains test coverage`

**References:** (#success-criteria)

**Artifacts:**
- Test report showing all tests pass
- Build timing comparison

**Tasks:**
- [x] Run full test suite: `cargo nextest run --workspace`
- [x] Run clippy: `cargo clippy --workspace -- -D warnings`
- [x] Run fmt: `cargo fmt --all --check`
- [x] Compare build times with baseline from Step 0
- [x] Verify `cargo install --path crates/tugtool` works

**Checkpoint:**
- [x] `cargo nextest run --workspace` - all tests pass (643 tests)
- [x] `cargo clippy --workspace -- -D warnings` - no warnings
- [x] `cargo fmt --all --check` - no formatting issues
- [x] Build times similar or improved vs baseline (~8s clean build)

**Rollback:** N/A (verification step)

**Commit after all checkpoints pass.**

---

## 2.0.6 Deliverables and Checkpoints {#deliverables}

**Deliverable:** Tugtool restructured as Cargo workspace with 4 member crates (tugtool, tugtool-core, tugtool-python, tugtool-rust), feature flags for language selection, and preserved API compatibility.

### Phase Exit Criteria ("Done means...") {#exit-criteria}

- [x] Root Cargo.toml is a **virtual workspace** (no `[package]` section)
- [x] `src/` directory no longer exists
- [x] All 4 crates compile independently (`cargo check -p <crate>`)
- [x] Full test suite passes (`cargo nextest run --workspace`)
- [x] `cargo build -p tugtool --no-default-features` produces working binary
- [x] `cargo build -p tugtool --features python` includes Python support
- [x] CLAUDE.md updated with new structure
- [x] CLI and JSON output unchanged from pre-migration behavior
- [x] `tests/api_surface.rs` compiles (public API contract preserved)

**Acceptance tests:**
- [x] Integration test: Full rename operation works end-to-end
- [x] Integration test: MCP server starts and responds to tool calls
- [x] Golden test: JSON output schemas unchanged
- [x] API surface test: All public re-exports accessible

**CRITICAL: Test count verification**
- [x] Final test count: 643 tests (exceeds baseline of 639)

### Milestones (Within Phase) {#milestones}

**Milestone M00: Hybrid workspace established (Step 1)** {#m00-hybrid-workspace}
- [x] Root Cargo.toml has BOTH `[workspace]` AND `[package]` sections
- [x] All 639 tests still pass
- [x] Empty crate skeletons exist in `crates/`

**Milestone M01: Core crate complete (Step 2)** {#m01-core-complete}
- [x] tugtool-core contains all shared infrastructure
- [x] Root `src/lib.rs` re-exports from tugtool-core
- [x] **All 643 tests still pass** (critical!)
- [x] Core crate tests pass independently

**Milestone M02: Python crate complete (Step 3)** {#m02-python-complete}
- [x] tugtool-python contains all Python support
- [x] Root `src/lib.rs` re-exports from tugtool-python
- [x] **All 643 tests still pass** (critical!)
- [x] Python crate tests pass independently (188 tests)

**Milestone M03: Workspace integrated (Step 5)** {#m03-workspace-integrated}
- [x] Main tugtool crate in `crates/tugtool/` composes all pieces
- [x] Feature flags work correctly
- [x] **All 643 tests still pass** (critical!)

**Milestone M04: Virtual workspace complete (Step 6)** {#m04-virtual-workspace}
- [x] Root Cargo.toml has NO `[package]` section
- [x] `src/` directory removed
- [x] **All 643 tests still pass** (critical!)

### Roadmap / Follow-ons (Explicitly Not Required for Phase Close) {#roadmap}

- [ ] Implement actual Rust language support in tugtool-rust
- [ ] Consider extracting MCP to separate crate if build times warrant
- [ ] Add per-crate CI jobs for parallel testing
- [ ] Investigate dynamic plugin loading for languages

| Checkpoint | Verification |
|------------|--------------|
| Virtual workspace | Root Cargo.toml has no `[package]` section |
| src/ removed | `! -d src` (directory does not exist) |
| Workspace compiles | `cargo build --workspace` |
| All tests pass | `cargo nextest run --workspace` (must show 643 tests) |
| Features work | `cargo build -p tugtool --no-default-features --features python` |
| No regressions | Compare test counts and build times with baseline |
| API preserved | `tests/api_surface.rs` compiles |

**CRITICAL: If test count drops below baseline at any step, STOP and investigate before proceeding.**

**Commit after all checkpoints pass.**

Real Programmers

January 17, 2026 at 10:32 AM

AI coding assistants like Claude Code and Cursor have changed the way I work. My daily programming today looks nothing like it did even a couple years ago. Today, I hardly ever write individual lines of code. AI coding assistants has relieved me of this. It’s better at it than I am. I’m OK with that.

This is a big change for me, because over my thirty-year career, writing individual lines of code has my main pursuit, taking up most of my time and attention. It was the only way to build up the functions, methods, libraries, frameworks, and apps that delivered on my actual goal: producing great software that is useful and meaningful to people.

Now that I can delegate a lot of this work AI coding assistants, and that means I can focus more on thinking about exactly what I want to make, rather than tediously and laboriously trying to achieve my desired effects. I now spend more time thinking about the edifice as a whole, rather than on building it up brick by brick.

I love this. It’s freeing.

I know others don’t agree. Judging from what I see and read on the internet, many experienced software developers seem to think AI has nothing to offer—that it’s a fad, a trap, an option only for the lazy or uninitiated. Whatever. It’s not how real programming gets done.

This reminds me of The Story of Mel, a Real Programmer. As the story goes, there once was a programmer, named Mel. Mel had a computer with a drum memory, and it was possible to optimize every instruction the machine executed. He wrote his code in machine language.

Raw, unadorned, inscrutable hexadecimal numbers. Directly.

Mel reveled in this kind of work. He knew all the minutiae of how his machine functioned. He could calculate the speed of the rotating drum storing his machine instructions, and he could time the read head to pass over the address of his next instruction at precisely the right moment for optimal execution speed. He scoffed at programmers who were starting to use a new-fangled software aid called an optimizing assembler, a tool to assign these addresses to the machine instructions automatically. He refused to use it.

Mel said, “You never know where it’s going to put things”.

Plus ça change…

Today, optimizing assembers lie several levels beneath the notice of contemporary real programmers. Over time, we have simply come to accept the loss of detail Mel thought was essential to proper work—since it wasn’t actually essential to the task. It was merely essential to Mel’s view of himself as a programmer.

Who knows what happened to Mel when the exact details of his machine changed out from under him? It didn’t take long, for sure. That change came fast. What didn’t change was that the actual goal for Mel was the same then as it is today: making great software. The same kind of change that Mel ridiculed is—but surely fell victim to over time—is happening right now. Only the bugaboo isn’t optimizing assemblers; it’s AI.

Yes, AI coding assistants are non-deterministic. Sure, they easily get off track. Indeed, “You never know where it’s going to put things.”

Yet, we’ve crossed over into a new era anyway. Change is coming to all software development. AI is a box of magic tools, and today, I think the job of a real programmer is to figure out how to make the best use of this new potential. Build new processes. Make new tools. Integrate AI into a new working method. Succeed at this, and help to define a better future for how we create software. Ignore the new potential, and run the risk of being left behind.

The Site Meta-Project

January 16, 2026 at 1:24 PM

Starting a software project involves so many things that have to get done before getting down to the actual work of developing the code itself: defining the scope and audience, picking a development language, coming up with good names… and putting up a website.

My main gripe with contemporary web tech is that the state of the art is so complicated. Following the trends means piling up layers and layers of libraries and modules. Mounds of javascript for everything. Complex deployment flows. Nothing is elegant.

When I sat down to make the site you’re reading right now, I did what I typically do these days: I asked an LLM for help. Collaborating with AI to do tech work is a major part of the motivation for this project. My aim is to find out what AI is actually good for, figure what gaps there are to fill, and then decide if there’s something I can do about filling them. More on that in future posts.

For this site, I asked GPT-5.2 to help me come up with a plan for a simple website. A landing page, top nav, link to github, a blog. It wound up producing a plan for Astro+Keystatic+Coolify+Hetzner+GitHub Apps+Docker+CI+Webhooks… and more! It seemed like a lot of tech for a little job, but I decided to follow the steps I got. I hoped I might learn something along the way.

What I actually learned was: bad idea. I didn’t think enough myself. I put myself in the hands of the LLM and its advice without knowing completely what I was going to get when everything was done. So, I got bunch of services that didn’t really talk to each other in a way I liked. I threw it away. A day of work wasted.

That was yesterday. I went to sleep. I woke up at 2:30 AM with an idea for doing better. I sat on the couch for an hour or so and sketched it out. I then worked on it couple more hours after morning coffee. Between me and Claude/Opus and GPT 5.2, I came up with a plan, with this as the introduction.

Static Site + Blog Infrastructure Plan

Astro + MDX · rsync deploy · Linode · Cloudflare DNS

This document defines a boring, sane, long-lived infrastructure for a content-driven website and blog.

The guiding principle is simple:

Build locally. Inspect everything. Deploy files. Serve static content. No CI rebuilds. No Docker. No GitHub in the deploy path. No surprises.

I then banged out the work itself in about two hours. You’re looking at the result now. The site generates static pages with a single deploy command that takes my local files (which must not have any uncommitted git changes), compiles them down, and then pushes that to a site on linode. I can view posts in the browser as I am writing them by hitting save and have the server hot-reload the page. When I deploy, I get exactly on the deployed site what I saw locally. As it says, No surprises.

It feels like this is the way that blogs should work. Live editing and previews. WYSIWYG in its way, with local and deployed versions of the site the same.

It’s about as elegant a solution as I could think of, with the state of web tech today, and without shifting my focus (for too long) away from the real goal of this project, which is to figure out some software to help AIs with code refactoring.

TugColor Palette System

Overview

Basic system: 60 named colors

48 chromatic colors

Canonical lightness

11 achromatic colors

1 transparent

Extended system: 176 named colors

Chromatic adjacency (circular ring)

Achromatic adjacency (linear sequence)

Adjacency rules

Presets compose with adjacency

Color counts

Files:

The Tug Multi-Agent Coordination Problem

Phases and Implications

Phases and Skills

Phases and Requirements

Skill <=> Subagent Communication

Problems in the Current Code

Beads and Skill <=> Subagent Communication

Next Steps

Conversation

Diagnosis: What’s Actually Happening

Proposal: Replace Beads with Tugstate (v1)

The Concurrency Problem

Revised Proposal: Embedded SQLite

Merge: How do we avoid the mess?

Why SQLite, not Dolt?

Footnotes

tl;dr

Agent Orchestration Loops

Footnotes

Table of Contents

Workflow Overview

Definitions

Phase of Work

Plan File

”Investigate” Prompt

Planning Loop

[P1] Initiate a New Phase

[P2] Create the Plan File

[P3] Review with Claude Opus

[P4] Use code-planner for Major Rewrites

[P5] Cross-Reference with GPT-5.2 in Cursor

[P6] Iterate Until Ready

Implementation Loop

[I7] Assess Each Plan Step

[I8] Run /implement-plan

[I9] Monitor Implementation

[I10] Handle Struggles

[I11] Review with plan-step-reviewer

[I12] Update the Implementation Log

[I13] Prepare Commit Message

[I14] Commit and Continue

Key Design Principles

Multiple AI Perspectives

Iterative Refinement

Separation of Concerns

Plan as Single Source of Truth

Implementation Log as History

Files Reference

Plan Files

Agents

Commands

Generated Files

Footnotes

GitHub Links:

A built-out example of a planning file:

Files: