Commit branch attribution + soft reachability enrichment (spike-gated)

This brief is complete — dispatch is closed.

#45 done fresh flower
agent: codex 1 scratchpad
You are being dispatched from flower Brief #101: Commit branch attribution + soft reachability enrichment (spike-gated)

Recall pointer:
- Use recall_brief with id 101 for the full folder if you need provenance.

Target:
- project: flower (/Users/mikeferrara/Documents/code/flower)
- branch: choose an appropriate branch
- worktree: not specified
- kind: fresh

Current brief spec:
## Origin
Follow-up to Brief #69. Operator approved (2026-07-03) building the commit branch/reachability
enrichment, **gated on an early feasibility/cost spike** — the reconciliation must be cheap enough to
actually run on real repos (incl. large ones) or it isn't worth building.

## Background (from #69, verified against code 2026-07-03)
flower already indexes + embeds commits (message + changed-file list) but: commits have **no branch
attribution**, and **nothing flags them when a branch is deleted / merged / rebased**, so an
abandoned-experiment commit and a force-push-orphaned sha are indistinguishable from merged-to-master
work. Decision: **do NOT prune** (flower is a memory tool) — add soft enrichment instead.

## Phase 0 — feasibility & cost SPIKE (GATE — do this first, then STOP for operator go/no-go)
Measure the real cost of git-reachability reconciliation before building anything:
1. On a **range of indexed repos** spanning sizes — small / medium / large (e.g. flower itself,
   `tdr/thedarkroom`, `lounge`) — benchmark the candidate reconciliation ops: enumerating live refs, and
   testing whether each stored commit sha is still reachable from any live ref. Compare approaches:
   a single `git rev-list --all` membership set (likely far cheaper) vs per-sha `git branch --contains` /
   `merge-base --is-ancestor`.
2. Record wall-clock + CPU + memory per repo and how it scales with commit count / ref count.
3. **Deliverable:** a short cost/feasibility report on this brief (chosen approach, per-repo cost,
   projected cost at scale, recommended run cadence) + an explicit **go / no-go recommendation**.
   **Stop here for the operator decision — do not start Phase 1 without a go.**
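
The set-membership approach named in step 1 can be sketched as follows. This is a minimal illustration in Python, not flower's implementation: the helper names `reachable_shas` and `split_by_reachability`, and the default timeout value, are assumptions made for the example. It runs one `git rev-list --all` per repo and tests stored SHAs against the resulting set, so the number of git processes does not grow with the number of stored SHAs.

```python
import subprocess
from typing import Iterable


def reachable_shas(repo_path: str, timeout: float = 120.0) -> set[str]:
    """Return every commit SHA reachable from any ref, using a single git call.

    The timeout default is an arbitrary value chosen for this sketch.
    """
    out = subprocess.run(
        ["git", "-C", repo_path, "rev-list", "--all"],
        capture_output=True, text=True, check=True, timeout=timeout,
    ).stdout
    return set(out.split())


def split_by_reachability(
    stored: Iterable[str], reachable: set[str]
) -> tuple[list[str], list[str]]:
    """Partition stored SHAs into (still reachable, no longer reachable)."""
    live: list[str] = []
    orphaned: list[str] = []
    for sha in stored:
        # Set lookup is O(1) on average, so cost is linear in stored SHAs.
        (live if sha in reachable else orphaned).append(sha)
    return live, orphaned
```

A per-SHA alternative would instead spawn one `git branch --contains <sha>` process per stored commit, which is the cost the spike is meant to compare against.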

## Phase 1 — build (ONLY on operator go after Phase 0)
1. **Branch attribution at ingest:** record the branch(es)/ref a commit was first seen on (cheap —
   `CommitIngestService` already knows the source path/worktree branch). Store on the `commits` row/meta.
2. **Soft reachability reconciliation:** a periodic job (cadence per Phase 0) that soft-flags commits no
   longer reachable from any live ref as `superseded` / `unreachable` — **a status flag, NOT a delete**.
   Never removes memory.
3. **Recall surfacing:** recall can badge / de-emphasize / optionally filter flagged commits (e.g. a
   "live history only" toggle) in recall_search / recall_touching / recall_file_story.

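The recall-surfacing behaviour in item 3 (badge by default, hide only when the toggle is on) could look roughly like the sketch below. Everything named here is hypothetical: the `CommitHit` shape, the `surface` function, and the `live_history_only` parameter name are illustrations, and treating a commit with no flag yet as live is an assumption of this sketch.

```python
from typing import Optional, TypedDict


class CommitHit(TypedDict):
    sha: str
    # "reachable", "unreachable", or None when the commit was never checked.
    reachability: Optional[str]


def surface(hits: list[CommitHit], live_history_only: bool = False) -> list[dict]:
    """Badge flagged commits; drop them only when the toggle is on."""
    out: list[dict] = []
    for hit in hits:
        flagged = hit["reachability"] == "unreachable"
        if flagged and live_history_only:
            continue  # filtered from the result, never deleted from storage
        out.append({**hit, "badge": "unreachable" if flagged else None})
    return out
```

Because the flag is only read here, turning the toggle off again restores the full result set.
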
## Acceptance
- Phase 0 cost report + go/no-go posted to this brief **before** any build.
- (On go) branch attribution recorded at ingest; reconciliation soft-flags unreachable commits without
  deleting; recall can filter/badge them; reconciliation cost matches the Phase-0 budget.
- `php artisan test` green + pint. `Brief: #101` trailer.

## Provenance
Follow-up to #69. Spike-gate structure per the operator's 2026-07-03 answer ("early testing before
building the whole thing out ... don't build it out if it's too expensive"). Speced by flower-refine
(2026-07-03).

Recent/key trace events:
[1] participant_joined flower-refine: (no body)
[2] note_added flower-refine: Follow-up to Brief #69 (operator approved 2026-07-03). Decision: build the commit branch/reachability enrichment, but run an early feasibility/cost SPIKE first to measure reconciliation cost across a range of repos (incl. large ones like lounge) — do not build the full thing if it's too expensive to run, computationally or otherwise. Phase 0 = spike/cost gate (operator go/no-go); Phase 1 = build on go. Full spec set via brief_update_spec.
[3] plan_proposed flower-refine: (spec text identical to "Current brief spec" above)
[4] status_change flower-refine: (no body)
[5] link_added flower-refine: (no body)

Recommended linked context:
{
    "todos": [],
    "scratchpads": [
        {
            "id": 364,
            "solo_scratchpad_id": "1055",
            "name": "flower-refine — reset handoff (2026-07-03)",
            "archived": false,
            "revision": 1
        }
    ]
}

Execution notes:
- Treat the brief as the source of truth.
- Keep work scoped to this dispatch request.
- Use brief_append / brief_update_status when reporting material progress; as your final dispatched-worker step, call brief_dispatch_complete with dispatch_request_id (or brief_id) and actor_ref.
- Codex workers should verify mutating Flower tools with tool_search query `brief_append brief_dispatch_complete flower_feedback` (limit 20) when tool availability is in doubt; report raw SEE/LOAD vs NOT visible instead of silently using local fallbacks.
- Add a git commit trailer `Brief: #101` to every commit for this brief so flower can exact-link commits back to the brief.
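
As an illustration of how a `Brief: #101` trailer can be matched back to a brief id, here is a small sketch. It is not flower's linking code; the `brief_ids` helper and the exact pattern are assumptions for the example.

```python
import re

# A trailer line of the form "Brief: #<number>" on its own line.
TRAILER = re.compile(r"^Brief:\s*#(\d+)\s*$", re.MULTILINE)


def brief_ids(message: str) -> list[int]:
    """Return every brief id named in a commit message's Brief trailers."""
    return [int(num) for num in TRAILER.findall(message)]
```

With git itself, the trailer can be attached at commit time using `git commit --trailer "Brief: #101"` (Git 2.32 or newer) or by writing the line at the end of the commit message.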

provenance · append-only

Trace

  1. link added 22h ago
    agent · system:commit-trailer
  2. participant joined 22h ago
    system · system:commit-trailer
  3. merged 23h ago

    Merged flower/101-commit-reachability into master on MAIN — merge commit **f5882e0** (worker commit 8a0dcb0; 17 files, +736). Clean merge. Migration `2026_07_04_120000_add_branch_and_reachability_to_commits_table` applied on real MySQL (73ms). Full suite green (1136 passed, 2 skips, 7070 assertions); pint clean; **Horizon reloaded** (new CommitReachabilityReconciler job + reconcile wired into ingest; master restarted → pid 89378, verified running). Phase 1 shipped: branch attribution at ingest + Approach-A soft reachability reconciliation (git rev-list --all set membership, soft-flag unreachable, never delete) + recall surfacing (`live_history_only` toggle on recall_search/touching/file_story, badges unreachable by default). Worker proc 1138 closed. Optional real-data verify (deferred to post-reset): run `flower:reconcile-commit-reachability` — the spike found sha cda2e25… unreachable in flower.

    agent · flower-orchestrator
  4. status change 23h ago
    agent · flower-101-worker
  5. dispatched 23h ago

    Dispatch request #45 marked done.

    agent · flower-101-worker
  6. note added 23h ago

    Phase 1 built + committed (8a0dcb0, `Brief: #101` trailer) on branch `flower/101-commit-reachability`. NOT merged; the orchestrator merges from MAIN.

    ## What shipped (Approach A only, per the operator GO)

    **1. Branch attribution at ingest.** `CommitIngestService` now injects `GitInspector` and resolves the current branch once per path (`git log` walks HEAD, so every ingested commit is reachable from it). Records it on the `commits.branch` column on INSERT only, so a force re-ingest never rewrites where a commit was first seen. Detached HEAD / pre-attribution rows stay null.

    **2. Soft reachability reconciliation.** New `CommitReachabilityReconciler` service: unions `git rev-list --all` reachable SHAs across a project's live roots (new fail-soft `GitInspector::reachableShas`, timeout-guarded via `config/flower.php` `commits.reachability.rev_list_timeout`=120s), then batch-flips stored commits absent from the set to `reachability=unreachable`, present ones to `reachable`. **Writes only on state change; NEVER deletes.** Fail-soft: a missing/errored/timed-out root leaves flags untouched and records a per-repo stale marker on `project.meta['commit_reachability']` (`status` ok/skipped/error + counts + reconciled_at). Chunked reads (chunkById 1000) + batched updates (500) keep memory/write size bounded on large repos.

    **Cadence (per the spike):** new command `flower:reconcile-commit-reachability` (`--project=`) scheduled **nightly** (`dailyAt('03:40')`, `withoutOverlapping`) in `routes/console.php` = the safety net that catches branch deletions/rebases landing no commits; **plus opportunistic** per-project reconcile wired inline into `flower:ingest-commits`, gated on that project actually having inserted/updated commits this pass (so idle projects pay no rev-list).

    **3. Recall surfacing.** `commitSummary` now carries `branch` + `reachability` + an `unreachable` badge + `reachability_checked_at`; commit search hits get the same enrichment (new `isCommitHit`). A `live_history_only` boolean toggle on **recall_search / recall_touching / recall_file_story** hides `unreachable` commits. **Default unchanged: badge, don't hide.** Null reachability (not-yet-reconciled) is treated as live and never hidden.

    ## Migration

    `2026_07_04_120000_add_branch_and_reachability_to_commits_table`: nullable `branch`, `reachability`, `reachability_checked_at` + `reachability` index. Portable: sqlite test suite (RefreshDatabase) + **verified up applies + down shape on real MySQL** (`flower_design`, the worktree's own DB; MAIN untouched). Same down() pattern as the proven `add_author_and_meta` migration.

    ## Tests / quality

    Full suite **green: 1127 passed, 2 pre-existing skips, 0 failures**; pint clean. New coverage:

    - branch attribution + preserve-on-force + opportunistic reconcile (`IngestCommitsCommandTest`);
    - orphaned-branch → unreachable / resurrection flip-back / missing-root skip + stale marker (`CommitReachabilityReconcilerTest`);
    - badge + `live_history_only` filter on touching/file_story (`RecallServiceTest`).

    Verified against the real-world signal the spike found (a deleted-branch dangling sha becomes `unreachable`, mirroring flower's own `cda2e25…`).

    ## ⚠️ For the orchestrator on merge

    This introduces a **NEW scheduled command** (`flower:reconcile-commit-reachability`) AND wires reconciliation into `flower:ingest-commits`. Both run under the scheduler/Horizon-cached code path → **`php artisan flower:horizon-reload`** (graceful) after merge so the running master picks up the new job code. No schema-rename, so no daemon-schema-reload needed. No design ambiguity hit: cadence + toggle UX were both pinned by the spike/operator answer.
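
The "write only on state change, in bounded batches" behaviour described in the note above can be sketched like this. It is a Python illustration under stated assumptions, not the shipped PHP service: `plan_updates` and `batches` are hypothetical names, and the stored flags are modelled as a plain dict of SHA to current flag.

```python
from itertools import islice
from typing import Iterator, Optional


def plan_updates(
    stored: dict[str, Optional[str]], reachable: set[str]
) -> dict[str, list[str]]:
    """Return only the SHAs whose flag must change, grouped by new state."""
    changes: dict[str, list[str]] = {"reachable": [], "unreachable": []}
    for sha, current in stored.items():
        target = "reachable" if sha in reachable else "unreachable"
        if current != target:  # unchanged rows produce no write
            changes[target].append(sha)
    return changes


def batches(items: list[str], size: int = 500) -> Iterator[list[str]]:
    """Yield fixed-size slices so each update statement stays bounded."""
    it = iter(items)
    while chunk := list(islice(it, size)):
        yield chunk
```

A previously `unreachable` SHA that reappears in the reachable set lands in the `reachable` group, which corresponds to the resurrection flip-back case the tests cover. Nothing in this plan removes a row.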

    agent · flower-101-worker
  7. participant joined 23h ago
    system · flower-101-worker
  8. link added 1d ago
    agent · flower-orchestrator
  9. link added 1d ago
    agent · flower-orchestrator
  10. link added 1d ago
    agent · flower-refine
  11. link added 1d ago
    agent · flower-orchestrator
  12. operator answer 2d ago

    GO - build Phase 1 with Approach A

    operator · operator:mike
  13. participant joined 2d ago
    system · operator:mike
  14. agent question 2d ago

    Proceed to Phase 1 build using the measured Approach A (`git rev-list --all` reachable set + batched stored-SHA membership), with per-SHA git containment checks explicitly out of scope for bulk reconciliation?

    agent · flower-be-101spike
  15. plan proposed 2d ago

    ## Phase-0 cost/feasibility spike report: Brief #101

    Measured 2026-07-03 from `/Users/mikeferrara/Documents/code/worktrees/flower/foundation`. Scope was read-only: local Flower DB over TCP, local git commands against the requested repos, no feature code, no MAIN edits, no daemons, no `.env` changes.

    ### Inputs

    Stored SHAs came from `commits(project_id, sha)`:

    | project | path | stored SHAs | live reachable commits (`git rev-list --all --count`) | live refs (`for-each-ref`) |
    | --- | --- | ---: | ---: | ---: |
    | flower / project 16 | `/Users/mikeferrara/Documents/code/flower` | 406 | 410 | 145 |
    | lounge / project 35 | `/Users/mikeferrara/Documents/code/lounge` | 251 | 4,389 | 40 |
    | tdr-thedarkroom / project 61 | `/Users/mikeferrara/Documents/code/tdr/thedarkroom` | 10 | 7,277 | 208 |

    Timing used `/usr/bin/time -l` on warm local repos. Memory below is rough max RSS unless noted.

    ### Candidate operation costs

    Live ref enumeration is negligible:

    | repo | `git for-each-ref` wall | rough memory |
    | --- | ---: | ---: |
    | flower | 0.02s | ~5 MB max RSS |
    | lounge | 0.01s | ~5 MB max RSS |
    | tdr/thedarkroom | <0.01s | ~4 MB max RSS |

    `git rev-list --all --count` alone was also cheap:

    | repo | wall | CPU | rough memory |
    | --- | ---: | ---: | ---: |
    | flower, 410 commits | 0.02s | ~0.00s user + ~0.00s sys | ~7 MB max RSS |
    | lounge, 4,389 commits | 0.03s | ~0.00s user + ~0.00s sys | ~9 MB max RSS |
    | tdr/thedarkroom, 7,277 commits | 0.02s | ~0.00s user + ~0.00s sys | ~8 MB max RSS |

    Approach A: one `git rev-list --all`, build an in-process SHA hash set, then membership-test stored SHAs. This is the implementation-shaped measurement using `~/bin/php` + `array_fill_keys`:

    | repo | stored tested | live set size | matched | missing | wall | CPU | rough memory |
    | --- | ---: | ---: | ---: | ---: | ---: | ---: | ---: |
    | flower | 406 | 410 | 405 | 1 | 0.06s | ~0.01s user + ~0.01s sys | ~25 MB max RSS |
    | lounge | 251 | 4,389 | 251 | 0 | 0.06s | ~0.02s user + ~0.01s sys | ~25 MB max RSS |
    | tdr/thedarkroom | 10 | 7,277 | 10 | 0 | 0.04s | ~0.01s user + ~0.01s sys | ~25 MB max RSS |

    Projection check using the full live reachable list as the stored-SHA input, to simulate a fully indexed repo:

    | repo | stored tested | wall | rough memory |
    | --- | ---: | ---: | ---: |
    | flower | 410 | 0.07s | ~24 MB max RSS |
    | lounge | 4,389 | 0.06s | ~25 MB max RSS |
    | tdr/thedarkroom | 7,277 | 0.06s | ~25 MB max RSS |

    Approach B: per-SHA `git branch -a --contains <sha>`:

    | repo | stored tested | containing | missing | wall | CPU | rough memory per process |
    | --- | ---: | ---: | ---: | ---: | ---: | ---: |
    | tdr/thedarkroom | 10 | 10 | 0 | 0.22s | ~0.12s user + ~0.06s sys | ~9 MB max RSS |
    | lounge | 251 | 251 | 0 | 4.63s | ~2.86s user + ~1.25s sys | ~8 MB max RSS |
    | flower | 406 | 405 | 1 | 7.71s | ~4.95s user + ~1.97s sys | ~7 MB max RSS |

    Observed per-SHA branch cost is roughly 18-22ms/sha. At that rate, a fully indexed `tdr/thedarkroom`-sized repo (7,277 SHAs) would take about 2.2-2.7 minutes. A 100k-SHA repo would project to roughly 30-37 minutes. A per-ref `merge-base --is-ancestor` loop would be worse in shape if implemented as `stored_shas * refs`, so it should only be used for targeted diagnostics, not reconciliation.

    ### Correctness signal found during spike

    Flower already has one stored commit for project 16 that is not reachable from current live refs:

    - `cda2e25e43dbdabc85b0a39e9c6aa8b44ca1d125`: `Bake validated summarizer chunk defaults`
    - `git cat-file -t` still sees it as a commit object locally.
    - `git branch -a --contains` returns no branch.
    - `git merge-base --is-ancestor <sha> master` exits non-zero.

    That is exactly the kind of soft `unreachable`/`superseded` signal Phase 1 would surface without deleting memory.

    ### Recommendation

    GO, with Approach A only. Build reconciliation as: query stored SHAs for one project/root, run one `git rev-list --all`, load reachable SHAs into a set, batch-update stored commits whose SHA is absent. Ref enumeration is cheap enough to run alongside if Phase 1 also wants branch/ref metadata, but per-SHA `branch --contains` or per-ref `merge-base` should not be used for bulk reconciliation.

    Recommended cadence: nightly all-project reconciliation is safe. Also run opportunistically after commit ingest for the touched project/root if the implementation keeps one repo job at a time or uses low concurrency. Hourly is computationally feasible for the measured repos, but probably unnecessary unless recall needs very fresh live-history badges. Add a timeout/stale marker for unusually large repos, but these measurements do not show cost as a blocker.

    Projected cost at current measured scale: <0.1s wall per repo for Approach A, ~25 MB PHP max RSS. Even a 100k-SHA repo should be in the low-seconds / tens-of-MB range with the set approach, while per-SHA git calls scale into minutes and should be rejected.
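
The per-SHA projections in the report follow from simple arithmetic on the Approach B table. The sketch below reproduces that arithmetic; the function names are made up for the example, and the inputs are the wall-clock figures reported above.

```python
# (stored SHAs tested, wall-clock seconds) from the Approach B table.
MEASURED = {
    "tdr/thedarkroom": (10, 0.22),
    "lounge": (251, 4.63),
    "flower": (406, 7.71),
}


def per_sha_ms(sha_count: int, wall_seconds: float) -> float:
    """Average cost of one `git branch -a --contains` call, in milliseconds."""
    return wall_seconds / sha_count * 1000


def projected_minutes(sha_count: int, ms_per_sha: float) -> float:
    """Linear projection of total wall time for a repo with sha_count SHAs."""
    return sha_count * ms_per_sha / 1000 / 60


if __name__ == "__main__":
    for repo, (count, wall) in MEASURED.items():
        print(f"{repo}: {per_sha_ms(count, wall):.1f} ms/sha")
    for size in (7_277, 100_000):
        low, high = projected_minutes(size, 18), projected_minutes(size, 22)
        print(f"{size} SHAs: {low:.1f} to {high:.1f} minutes")
```

The projection assumes cost stays linear in the number of stored SHAs, which matches the three measured repos but was not tested at 100k scale.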

    agent · flower-be-101spike
  16. participant joined 2d ago
    system · flower-be-101spike
  17. link added 2d ago
    agent · flower-refine
  18. dispatched 2d ago

    Dispatch request #45 queued for flower.

    agent · flower-orchestrator
  19. status change 2d ago
    agent · flower-orchestrator
  20. participant joined 2d ago
    system · flower-orchestrator
  21. link added 2d ago
    agent · flower-refine
  22. status change 2d ago
    agent · flower-refine
  23. plan proposed 2d ago

    Initial spec proposed; text identical to the "Current brief spec" section above.

    agent · flower-refine
  24. note added 2d ago

    Follow-up to Brief #69 (operator approved 2026-07-03). Decision: build the commit branch/reachability enrichment, but run an early feasibility/cost SPIKE first to measure reconciliation cost across a range of repos (incl. large ones like lounge) — do not build the full thing if it's too expensive to run, computationally or otherwise. Phase 0 = spike/cost gate (operator go/no-go); Phase 1 = build on go. Full spec set via brief_update_spec.

    agent · flower-refine
  25. participant joined 2d ago
    system · flower-refine

epic · dependencies

Relationships

epic parent

depends on

No dependencies — dispatchable once planned.

agents · waves

Participants

  • flower-refine participant · active
  • flower-orchestrator participant · active
  • flower-be-101spike participant · active
  • operator:mike participant · active
  • flower-101-worker participant · active
  • system:commit-trailer participant · active

trace · graph

Links

  • Commit #4037 execution
  • Scratchpad #386 execution
  • Scratchpad #381 execution
  • Scratchpad #378 execution
  • Scratchpad #346 execution
  • Scratchpad #375 execution
  • Scratchpad #364 execution

scope

Projects

  • flower · primary

dogfood · read-only

Agent’s-eye view

The literal recall_brief payload an agent gets — same service path as the MCP tool.