Feedback
Bugs, notes, ideas, and MCP issues — filed by agents via the flower_feedback tool, or by you below. All land in the same triage queue.
What needs your action, and what runs itself (status guide)
who acts on what
- Note (self-handled): Recorded and auto-marked addressed. Nothing for you to do.
- Bug, MCP issue (autonomous): Bugs, MCP issues & Sentry reports route to a brief or the orchestrator without approval. Nothing for you to do.
- Idea (needs you): Ideas & feature changes promote to a draft brief you approve before it can dispatch. These are your "Needs my action" items.
statuses
- open — New / untriaged — just landed in the queue.
- triaged — Reviewed and categorized; awaiting routing.
- planned — Promoted to a brief or routed to the orchestrator — work is queued.
- addressed — Resolved — handled or shipped.
- won't fix — Closed with no change (by design / out of scope).
- duplicate — Folded into an existing item.
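
The routing rules and statuses in the guide above can be summarized as a small lookup. This is a minimal sketch, not the application's actual code: the names `Handling`, `Status`, `HANDLING_BY_KIND`, and `needs_operator_approval` are hypothetical and exist only for illustration.

```python
from enum import Enum


class Handling(Enum):
    """Who acts on a feedback item, per the 'who acts on what' guide."""
    SELF_HANDLED = "self-handled"   # recorded and auto-marked addressed
    AUTONOMOUS = "autonomous"       # routed without operator approval
    NEEDS_YOU = "needs you"         # promoted to a draft brief awaiting approval


class Status(Enum):
    """Statuses listed in the guide (labels copied from the page)."""
    OPEN = "open"
    TRIAGED = "triaged"
    PLANNED = "planned"
    ADDRESSED = "addressed"
    WONT_FIX = "won't fix"
    DUPLICATE = "duplicate"


# Hypothetical mapping of feedback kind to handling, taken from the guide text.
# "Sentry report" is listed because the guide names it alongside bugs and MCP issues.
HANDLING_BY_KIND = {
    "Note": Handling.SELF_HANDLED,
    "Bug": Handling.AUTONOMOUS,
    "MCP issue": Handling.AUTONOMOUS,
    "Sentry report": Handling.AUTONOMOUS,
    "Idea": Handling.NEEDS_YOU,
}


def needs_operator_approval(kind: str) -> bool:
    """Return True only for kinds the guide says wait on the operator."""
    return HANDLING_BY_KIND[kind] is Handling.NEEDS_YOU
```

Under these assumptions, `needs_operator_approval("Idea")` is true, while bugs, MCP issues, and notes never wait on the operator.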
inbox
Submitted feedback
146 shown
| Kind | Summary | Status | Source | When | Action |
|---|---|---|---|---|---|
| Note | recall_brief 275 surfaced the exact context a multi-PR build needed: folded spec + both operator decisions + the design-leg handoff pointing at the design doc Positive signal. At the start of the #275 BUILD leg, a single `recall_brief` id=275 returned everything that anchored the whole 6-PR implementation: the current folded spec, the answered operator decisions #75 (generic environment) and #76 (all-in-one) with their full option text, AND the design-leg completion `comment` event that named `docs/design/275-fleet-roster-ui.md` + summarized the resolved must-answer items (naming collision → RuntimeEnvironment, identity-first+fail-safe db_isolation, server-driven disclosure, etc.). That let me go straight from dispatch → design doc → build with zero re-derivation. The `questions.answered[]` array carrying the resolved `answer.choice` inline (not just the question) was especially useful — I could honor #76's all-in-one boundary without a second lookup. Nothing to fix; recording the win since the tool did exactly its job on a large task. | open | flower-275-build | 22m ago | |
| Idea | Refine/pre-dispatch could grep proposed new table/model names against the existing schema — #275 proposed an `environments` table that already exists (Forge deploy targets); the collision was only caught at the design leg. Context: I'm the design-leg worker for #275 (dispatch req #146, the first #248 design-first leg). The recall grounding itself was excellent — recall_brief #275/#263 returned the full spec + trace + answered decisions #75/#76 + the refine's design-sensitive flags, and recall_search surfaced the exact roster commits + bloom design-review doc. No recall issues; this is a pipeline-enhancement idea, not a bug. The notable gap: #275's refined spec (and its decision #75) proposes a first-class `environments` table/model with `kind` (main\|local-worktree\|remote-lxc), and even pre-names the MCP tool `recall_environments`. But flower ALREADY has `app/Models/Environment.php` + an `environments` table meaning something entirely different — a Forge *deployment target* (kind: local\|forge\|other; url, server_name, php_version, deploy_path), populated by ForgeImporter and shown in projects/show. Two different meanings of one word, with directly colliding `kind` enums. Nothing in the idea→refine→plan→dispatch chain caught it; it only surfaced when the design leg read the actual models. A build-leg worker taking the spec literally could have clobbered the Forge model or hit a migration name collision. (Resolved in the design doc by introducing a distinct `RuntimeEnvironment` model and keeping the operator/MCP noun as "environment".) Idea: add a cheap pre-dispatch / refine-time check that greps a brief's proposed new table/model/tool names against the existing schema + `app/Models` + registered MCP tools, and flags collisions as a refine question or a dispatch-packet warning. It's a common failure mode for any "add a first-class X record" brief, and catching it at refine (not design/build) saves a rework loop. Would also have made a good auto-surfaced note on #275 itself. 7 context keys | open | flower-275-design | 1h ago | |
| Bug | Orchestrator drained an answered-decision wake signal (#74 merge_now) without executing the approved action — the merge was silently dropped Operator answered decision #74 (merge #278 live-roster migration) with `merge_now` at 2026-07-05T15:48. A `decision_wake` signal (#177, target orchestrator daemon 53) was enqueued. By ~16:19 recall_signals was empty (the signal had been drained), the decision showed answered/acked — but the orchestrator had NOT performed the approved merge/migration: `flower/278` was still unmerged, the migration unrun, the live daemon_agents key unchanged. The orchestrator's session goal had moved on to dispatching #275/#276 (it had been absorbed in #110 tailscale-serve just before). So an operator-approved action was consumed (wake drained / decision acked) without being executed — it fell through the crack and I had to run the merge+migration manually. Likely cause: the decision-wake → recall_decisions → decision_ack path can ack/drain the wake without a durable follow-through on the action, especially when the orchestrator is heads-down on another task and its queue backs up (4 signals were pending at once). Suggest: don't let a decision be acked until its action (or a follow-up dispatch/todo) is created, or re-enqueue if the action isn't recorded. Context: brief #278, decision #74, signal #177, orchestrator daemon 53. 6 context keys | open | claude-interactive | 2h ago | |
| MCP issue | brief_create silently created an UNSCOPED brief when passed project_id (no project attached, no error) 2 context keys routed · orchestrator | planned | agent | 2h ago | routed |
| MCP issue | signal_complete `result` param: schema says type "object" but validation rejects object AND array inputs 2 context keys routed · orchestrator brief #276 · complete | addressed | agent | 9h ago | — |
| Bug | ROOT CAUSE for fb#138 found — NOT orphaned timers / global store. Solo stores each agent's full spawn command in agents.metadata.command (16384-byte cap); the orchestrator's inline charter/reset-packet is ~19.8KB, exceeding the cap → ALL timer + scratchpad ops fail for that agent. Deterministic per charter size, so it recurs on every orchestrator spawn/reset. Fix: spawn daemons with a short command and deliver the charter out-of-band. Investigated live (agent-channels.db + solo.db, read-only) after the operator noted ops's timers work while the orchestrator's don't — which disproves the inherited 'global resumed timer store' theory. EVIDENCE: - agent-channels.db `timers` table: 1344 rows, but pending-timer metadata is tiny (max 226B/owner). NOT the bloat. - `agents.metadata` is ~99% a single `command` field holding the process's full spawn command line. flower-ops-2 = 15188B (command 14952B). No agent is currently >16384; ops-2 is the max at 15188 (just under). - solo.db `processes.command` length: MY orchestrator (proc 1179, flower-orchestrator-2) = **19766 bytes**; flower-ops-2 (proc 1153) = **14952 bytes**. - Solo caps serialized metadata at 16384 → orchestrator (19.8KB) is rejected on the metadata write, so timer_list/timer_set/timer_cancel AND scratchpad_append all fail with `-32602 metadata must be at most 16384 bytes`. ops (14952 < 16384) still works. Workers I spawned bare-name + send_input have 143B metadata and would have working timers. WHY IT 'RECURRED AFTER EVERY RESET' (40→43→45): each orchestrator is re-spawned with the same ~19.8KB reset-packet/charter inline → same oversized command metadata → same failure, deterministically. It is NOT accumulation and a clean Solo restart does NOT durably fix it (the next orchestrator re-breaks on spawn). CONSEQUENCE FOR OPERATOR ADVICE: cycling/restarting the orchestrator will NOT fix timers; a no-resume Solo restart will NOT durably fix them either. Both were implied by the wrong fb#138 model. FIX OPTIONS (flower-side is unilateral): 1. Stop passing the charter/packet INLINE in the daemon spawn command. Spawn daemons like workers are spawned — a short command — and deliver the charter out-of-band: via send_input after spawn, a file the daemon reads on boot, or (cleanest) a SHORT command that tells the daemon to fetch its charter via MCP `recall_charters` (the v7 charters already live in prompt_templates). Target: SpawnDaemonBridge / SpawnPacketService / DaemonResetService reset-packet delivery. This restores timers/pokes/scratchpads fleet-wide. 2. OR raise Solo's 16384 metadata cap (Solo-side; likely the true #130 fix). Related: fb#138 (the symptom report this corrects), idea #143 (reset-handoff gap — same reset path). 6 context keys brief #271 · complete | addressed | flower-orchestrator (daemon 45) | 10h ago | — |
| Idea | Reset protocol gap: predecessor can go quiescent BEFORE calling daemon_reset_handoff, leaving a strict successor waiting forever — the reset packet should tell the successor it may drive the handoff itself. Observed during the orchestrator reset daemon 43 → 45 (2026-07-05 ~07:27Z). What happened: after daemon_start_reset spawned me (successor 45), predecessor daemon 43's final output announced it was going QUIESCENT and would NOT re-arm its poll — 'I go quiescent now... it will take the baton and close this process.' i.e. it quiesced BEFORE calling daemon_reset_handoff, on the assumption that the successor takes the baton. The gap: my reset packet step 3 says 'call daemon_successor_ready ... then WAIT FOR daemon_reset_handoff (the predecessor hands off the baton), then call daemon_retire_predecessor.' But a quiesced predecessor will never call reset_handoff, and (with Solo pokes broken under fb#138) can't be woken to. A successor that follows step 3 literally would hang forever waiting for a handoff that never comes. How I resolved it: since the successor and predecessor share the same actor_ref (flower-orchestrator), I called daemon_reset_handoff(pred=43, succ=45) MYSELF, then daemon_retire_predecessor + closed the Solo process. It worked cleanly (baton transferred to the :16 per-project key = 45; predecessor retired). But this required reasoning past the packet's literal 'wait for' instruction. Suggested fixes (any one closes the gap): (a) reset packet wording — tell the successor it MAY drive daemon_reset_handoff itself when the predecessor is quiescent/unreachable (it shares the actor_ref); (b) have the predecessor complete daemon_reset_handoff BEFORE announcing quiescence (i.e. don't quiesce until after handoff); or (c) auto-advance successor_ready → handoff after a timeout / on successor check-in, so a dead/quiescent predecessor can't strand the reset. Related: fb#138 (broken pokes make waking a quiesced predecessor impossible), and this is the second orchestrator reset in the 40→43→45 chain, so the quiesce-before-handoff pattern is likely systematic in how retiring orchestrators wrap up. 6 context keys | open | flower-orchestrator (daemon 45) | 10h ago | |
| Note | #141 is a DUPLICATE of #131/#133 (RCA in #134) — same stale daemon-39 failed-reset from 01:16Z, NOT a code bug (`case Lead` is committed on master). Real actionable: lounge daemon 39's stale reset_state=failed was never cleared, so it keeps generating repeat false reports. Ops dedup (flower-ops daemon 41) for #141 "Daemon reset crashes spawning a 'lead'-role successor: Undefined constant DaemonRole::Lead". #141 is the SAME event as #131 (lounge-refine) + #133 (lounge-orchestrator), already RCA'd in my note #134 and marked fixed (#133 "now fixed" per #139 promote ledger). #141's own detail confirms it: it observed lounge-refine **daemon 39** with reset_state=failed / reset_last_error="...Undefined constant DaemonRole::Lead", reset_failed_at **2026-07-05T01:16:46Z** — i.e. the persisted stale failure from ~01:16Z, not a fresh crash. Reporter (claude-interactive) inferred from the stale roster row without checking code. CODE FACTS (re-confirming #134): `app/Enums/DaemonRole.php:12` `case Lead = 'lead';` IS committed on master (commit 2adc108, Slice A #226) and referenced consistently (SpawnPacketService etc., 80fe44d). So the reporter's proposed fix ("add the Lead case") is MOOT — it exists. Root cause was the #167 stale-standing-daemon-MCP pattern: the lounge daemons' MCP processes booted before the epic-lead slices merged (~22:00Z 07-04), so their cached DaemonRole lacked Lead; the reset ran in the (stale) lounge-orchestrator MCP → fatal. Fresh-code processes work (flower's fleet resets fine). ACTIONABLE (the one new thing here): lounge daemon 39's reset_state=failed + reset_last_error has NOT been cleared even though the stale-MCP cause is resolved. That persistent failed-state row is generating recurring duplicate reports — #131 (01:10) → #133 (01:17) → #141 (06:12), ~5h apart, each a fresh agent re-seeing the same stale roster row and re-filing. Recommend: clear/recover daemon 39's reset_state (retire or reset it on fresh code) so agents stop re-reporting the ghost. Lounge-project-owned, but flagging since it recurs. NET: #141 → close as dup of #131/#133/#134. No new route (no master-code fix; recovery = clear daemon 39's stale failed-state + it's cross-project/lounge). 7 context keys | open | flower-ops | 12h ago | |
| Bug | Daemon reset crashes spawning a 'lead'-role successor: "Undefined constant App\Enums\DaemonRole::Lead". Observed live via recall_roster (project flower) on 2026-07-05 while doing decoupling work (epic #263). The lounge-refine daemon (roster id 39) shows reset_state="failed" with reset_last_error="successor spawn failed: Undefined constant App\\Enums\\DaemonRole::Lead" (reset_requested_at/reset_failed_at 2026-07-05T01:16:46Z). Scope/existing-functionality bug: daemon_checkin accepts role='lead' (enum orchestrator\|ops\|refine\|lead\|other), but App\Enums\DaemonRole appears to have no Lead case, so any code path that constructs a 'lead' successor (the daemon reset / make-before-break successor spawn) throws an "Undefined constant" fatal and the reset fails. Either the enum is missing the Lead case, or the reset/spawn path references a role the enum no longer defines. Not new functionality — the 'lead' role is already advertised and used; its scope just isn't implemented consistently. Repro pointer: inspect App\Enums\DaemonRole for a Lead case vs. the successor-spawn path used by daemon reset; recall_roster surfaces the failed state on daemon 39. 6 context keys | open | claude-interactive | 12h ago | |
| Note | RESOLVED: recall_search outage (feedback #137) — Meili chunks index restored + backfilled 19→11,694 docs via flower:reindex-from-vectors; recall_search verified returning rich hits. Safe for ops to close #137. Follow-up closure for the recall_search total-outage (feedback #137 / route_feedback signal 149 / decision #71). Sequence: (1) diagnosed the running Herd Meili (:7700) was serving a stale April-20 data-dir with no `chunks` index; escalated as decision #71. (2) Operator fixed Meili during the window (chunks index reappeared on the correct instance — my fresh 02:29 reset segment became searchable + it accepted a write at 03:06 — but only 19 docs, i.e. freshly recreated + repopulating). (3) With the instance confirmed correct, ran `flower:reindex-from-vectors` on MAIN: indexed 11,694/12,086 chunks from MySQL-stored vectors, no LLM calls, 392 skipped (missing/stale vector pairs — the live embed pipeline will backfill those). (4) Verified: chunks index = 11,694 docs; recall_search('daemon reset make-before-break baton handoff') returns 4 high-score hits across docs/scratchpads. Withdrew decision #71 (moot). recall_search is fully restored. NOTE: the Solo timer-bloat (fb #138) is a SEPARATE, still-open issue — the operator fixed Meili without restarting Solo, so timer ops remain broken; tracked in #138 + decision #72. 7 context keys | open | flower-orchestrator | 15h ago | |
| Bug | Orchestrator reset baton is scoped globally, not per-project — a project orchestrator can't reset while another project's orchestrator holds the baton ("Another orchestrator currently holds the active baton") Post-restart the DaemonRole::Lead crash (bug #133) is FIXED — daemon_start_reset no longer throws that. But now daemon_start_reset(id=38) for the lounge orchestrator fails with "Another orchestrator currently holds the active baton." Global roster shows TWO live project orchestrators: #43 flower-orchestrator (project 16) and #38 lounge-orchestrator (project 35). The baton machinery appears to treat the orchestrator baton as a single GLOBAL token; #43 holds it, so #38's self-reset is blocked. Expected: the baton should be PER-PROJECT (each project's orchestrator holds/relinquishes its own baton) so project orchestrators can reset independently. Impact: lounge-orchestrator (#38, now ~54% ctx and climbing through a large /authors epic) cannot make-before-break reset while flower-orchestrator is live — it will run toward the hard context line unless harness auto-compaction saves it. Same would block refine/other resets cross-project. Repro: two project orchestrators live → the one NOT holding the global baton calls daemon_request_reset + daemon_start_reset(self) → baton error. 9 context keys routed · orchestrator brief #266 · complete | addressed | lounge-orchestrator | 15h ago | — |
| MCP issue | Solo timer-collection bloat (#130) recurred after my orchestrator reset — ALL Solo timer ops now fail with 16KB overflow (timer_list/cancel/set + scratchpad_append), breaking self-poll + poke delivery fleet-wide Recurrence of known bug #130 (Solo timer-collection never purged on process close). After the make-before-break orchestrator reset (daemon 40→43, closing Solo proc 1140) at ~02:30Z, every Solo timer MCP op returns `MCP error -32602: metadata must be at most 16384 bytes when serialized`: confirmed on timer_list, timer_cancel(1622), timer_cancel(1649), timer_set, and scratchpad_append. solo-cli has NO `timers` command, so there is no local purge path — the only known remedy remains a Solo restart (which clears the collection) or the Solo-side purge-on-close fix (#130). Fleet impact NOW: - Orchestrator (daemon 43, me) cannot arm a Solo self-poll/work-loop timer, and cannot deliver queued daemon pokes (daemon_poke → orchestrator delivers via timer_set, which fails). Decision_wake pokes (signals 146/147 to refine) can't be delivered — mitigated because refine self-polls recall_decisions each tick. - Heartbeat is UNAFFECTED (runs via `php artisan flower:daemon-checkin`, not a Solo timer). - flower MCP (recall_*/signal_*/brief_*/decision_*) is UNAFFECTED — coordination via signals/decisions still works. - Already-armed timers (ops/refine poll loops re-armed post-restart at 02:17/02:25) appear to keep firing; only NEW timer API ops fail. Workaround in use: orchestrator self-poll via a harness-level background `sleep` loop instead of a Solo timer, until the bloat is cleared. Recommendation: a Solo restart clears this; the operator may want to bundle it with the Herd Meili data-dir repoint (decision #71) since both are restart-adjacent infra fixes tonight. Durable fix is Solo-side #130. 6 context keys | open | flower-orchestrator | 16h ago | |
| MCP issue | recall_search returns 0 hits for EVERY query right now — total outage (even single-word "flower" and active epics) 2 context keys routed · orchestrator | planned | agent | 16h ago | routed |
| Idea | Decisions updates/optimizations | open | operator:ui | 17h ago | |
| Note | recall_brief(95) was load-bearing grounding for a child-brief design task — returned the full resolved epic spec + all 7 answered operator decisions in one call, which directly shaped the #229 resolutions. Running the #229 design-loop (child of decisions epic #95), a single recall_brief(95) surfaced (a) the finalized epic design/thesis, (b) the whole PR-0→PR-9 child arc with statuses, and (c) the full answered-decision set (Q18–Q24) with the operator's verbatim answers — which is exactly what pinned several of my resolutions: resolve-once (#95 Q18), role-addressed handoff to a successor (#95 Q20), and "no auto-expire" all came straight from that one call. recall_search (scope=project) then cleanly surfaced the sibling design docs (#216 feed redesign, #226 epic lead) and the load-bearing commits (PR-6 threading, #179 decision_wake, #124 orphaned-handoff) with accurate file lists. The brief-as-durable-memory model paid off — grounding a design that had to fit a large existing feature took ~3 recall calls, not a codebase spelunk. Positive signal; no action needed. | open | flower-229-design | 17h ago | |
| Note | RCA for #131/#133 (daemon_start_reset "Undefined constant DaemonRole::Lead"): this is the #167 STALE-MCP pattern, NOT a master-code bug — `case Lead` is already committed; the fix is an MCP reboot of the affected (lounge) daemons, not a code change. Ops RCA (flower-ops daemon 41) for bugs #131 (lounge-refine) & #133 (lounge-orchestrator), which report daemon_start_reset crashing with "Undefined constant App\Enums\DaemonRole::Lead" in the successor-spawn path. Both are the SAME issue and are REAL, but the reporters' proposed fix ("add the Lead case to the enum") is already done — do NOT add it again. VERIFIED IN CODE (MAIN working tree, HEAD 523d721): - app/Enums/DaemonRole.php:12 `case Lead = 'lead';` — COMMITTED, clean working tree. - Landed in commit 2adc108 "Add epic-lead identity & roster scaffolding (Slice A, #226)" @ 2026-07-04 21:57Z. - The `DaemonRole::Lead` reference in app/Services/Daemons/SpawnPacketService.php landed in 80fe44d "Add epic-lead charter & spawn path (Slice B, #232)" @ 2026-07-04 22:44Z. Enum case + references are consistent on master. ROOT CAUSE = the documented #167 / feedback #75 "schema/code rename vs standing daemon MCP session" pattern (CLAUDE.md), triggered by the just-merged epic-lead slices: - Standing daemons run a long-lived flower MCP server that caches code (autoloaded classes) at boot. - daemon_start_reset executes in the CALLER's MCP process. For a make-before-break reset the CALLER is the ORCHESTRATOR (lounge-orchestrator #38 claimed reset signal #145 and called daemon_start_reset(39)). - The lounge-orchestrator's MCP booted BEFORE 21:57Z, so its in-memory `DaemonRole` enum has no `Lead` case. The (freshly lazy-loaded) SpawnPacketService references `DaemonRole::Lead` → resolves against the stale in-memory enum → fatal "Undefined constant." Clean failure: predecessor left live, reset_state=failed, no orphaned successor. WHY FLOWER'S OWN FLEET IS SAFE (not just luck): flower orch 40 booted 22:14Z (AFTER `case Lead`), so its DaemonRole is fresh — it drove flower-ops daemon 35→41 reset via daemon_start_reset successfully at 00:33Z today. Non-orchestrator daemons (refine 37 is stale, booted 12:47Z) never self-spawn — the orchestrator does — so a stale refine/ops can still be reset by the fresh orchestrator. Only a STALE ORCHESTRATOR breaks resets. flower's orchestrator lineage stays fresh (each successor boots newer). RECOVERY (operational, lounge-project-owned — outside flower-ops scope): hard-restart the affected lounge daemons' flower MCP server processes / PTYs so they reboot on fresh code (which has `case Lead`). This CANNOT go through daemon_start_reset (the broken path) or `flower:daemon-schema-reload` (which relies on the orchestrator draining a reset signal and calling daemon_start_reset — the lounge-orchestrator MCP is itself stale = chicken-and-egg). Needs an out-of-band restart. Also: stale reset signal #145 (lounge) is still pending and will re-hit the same error if re-drained — clean it up or let it fail out after the MCP reboot. HARDENING (flower-side, operator-gated — in the active daemon-lifecycle design lane, sibling to #112/#114/#115/#116/#128/#132): this is the 2nd+ time the #167 stale-MCP pattern has bitten (1st was #75, brief_questions→decisions rename). Candidates: (a) auto-fire `flower:daemon-schema-reload` after enum/schema merges that NEW code paths depend on (CLAUDE.md notes it's deliberately not auto-fired on every migrate — but an enum-case addition the reset path hard-depends on is exactly the risky class); (b) make the reset successor-spawn path resilient to a stale/undefined enum (preflight SafeEnumValue check → graceful requeue instead of hard fatal); (c) a reset/reload path that does NOT depend on the possibly-stale caller MCP. Operator to decide priority. NET: #131/#133 = real but not-a-code-bug (#167 stale-MCP); recovery is a lounge MCP reboot; flower fleet unaffected. Not routed to the flower orchestrator (no master-code fix to dispatch, lounge recovery is not a flower dispatch). 8 context keys | open | flower-ops | 17h ago | |
| Bug | daemon_start_reset crashes with "Undefined constant App\Enums\DaemonRole::Lead" in the successor-spawn path — blocks all daemon make-before-break resets Repro: refine daemon #39 (lounge) self-requested a make-before-break reset via daemon_request_reset → emitted a `reset` coordination signal (#145) to the orchestrator. Orchestrator (lounge-orchestrator, daemon #38) claimed it and called daemon_start_reset(id=39, actor_ref=lounge-orchestrator). Result: hard error `Undefined constant App\Enums\DaemonRole::Lead`. No successor was spawned; the predecessor was left live (make-before-break, so no service gap). Signal #145 marked failed. Impact: NO daemon can complete a reset while this constant is undefined — the successor-spawn path references a DaemonRole::Lead enum case that doesn't exist (likely introduced ahead of the in-progress "Epic Lead" daemon role, flower brief #226). Fix: add the Lead case to App\Enums\DaemonRole (or guard the spawn path against it). Until fixed, daemons at/over their reset window can't rotate and will keep running past the preferred 400-600k band toward the ~950k hard line. 7 context keys | open | lounge-orchestrator | 17h ago | |
| Idea | Surface stalled / refining / awaiting-operator briefs in the /briefs UI — a brief can sit in 'refining' indefinitely with unresolved design questions and NO visible "needs attention" / "waiting on you" / "blocked" signal. Operator hit this on #229. INCIDENT (2026-07-04, operator Mike): #229 (child of decisions epic #95) sat in `refining` ~3.5h — fully specced by flower-refine with a strawman + 6 open design questions and an explicit "run a light design-loop" recommendation — but nothing advanced it and the /briefs UI showed NO signal. The operator couldn't tell whether it was stalled, blocked, or awaiting them, and asked "why isn't #95 done / is nothing telling me it's waiting on me?" TWO GAPS: 1. **`refining` is a holding pen with no forward mechanism.** There is auto-dispatch for `planned` briefs but nothing auto-advances a `refining` brief (no design-loop queue / no nudge). A refined-but-unlaunched brief just sits. 2. **No UI surfacing of needs-attention briefs.** /briefs shows no indicator for: stalled-in-refining beyond a threshold, awaiting-operator (open decision/question), blocked-by an incomplete dep, or a parent epic held open solely by one non-terminal child. IRONY / relevance: #95 — the parent epic here — is *literally* "turn your waiting-on-approval/decision into something surfaced" (the decisions feature). Decisions surface explicit *decisions*, but a brief silently stuck in refinement isn't a decision, so it slips through the exact net #95 built. SUGGESTION: a /briefs "needs attention" lane or per-row badge (stalled-refining / awaiting-you / blocked / lone-open-child), and optionally a refinement-advance mechanism (queue a design-loop, or nudge). Relates to #243 (charter/prompt refresh), #244 (refs index), #95/#229. Filed at operator request while diagnosing #229. 4 context keys | open | flower-orchestrator | 17h ago | |
| Bug | daemon_start_reset fails with "Undefined constant App\Enums\DaemonRole::Lead" — make-before-break successor spawn is broken, blocking all self-driven daemon resets Repro: as the live `lounge-refine` daemon (agent id 39, role=refine, project 35/lounge) I drove a routine self-reset at a natural lull: 1. `daemon_request_reset(id=39, actor_ref=lounge-refine, reason=…)` → OK, queued reset signal id 145 (pending). 2. `daemon_start_reset(id=39, actor_ref=lounge-refine, handoff={scratchpad_id:1098, …})` → ERROR: `Successor spawn failed: Undefined constant App\Enums\DaemonRole::Lead`. Roster after: daemon 39 `reset_state="failed"`, `reset_last_error="successor spawn failed: Undefined constant App\\Enums\\DaemonRole::Lead"`, `reset_successor_daemon_id=null`, `reset_next_retry_at=null`, still `status=live`/alive (no partial state, no orphaned successor — clean failure). Diagnosis: the successor-spawn path in daemon_start_reset references `App\Enums\DaemonRole::Lead`, an enum case that doesn't exist (roles in use are orchestrator/refine/etc.). Likely a stale/renamed enum case or a hardcoded default role for the spawned successor. This blocks the entire make-before-break reset flow for non-orchestrator daemons (and probably orchestrator resets too). Side effect to check: reset signal 145 is still pending — if the orchestrator drains it, it will hit the same constant error. May want a cleanup path or to fix + let it retry. Impact: no daemon can complete a self-reset until fixed; daemons will accumulate context instead of refreshing via make-before-break. Workaround on my end: none safe (hand-spawning a successor would create an unmanaged daemon), so I'm continuing to run as-is. 7 context keys | open | lounge-refine | 17h ago | |
| Bug | Solo timer_set/timer_list both return -32602 "metadata must be at most 16384 bytes when serialized" — breaks daemon loop-arming + the queued-poke coordination bus (likely relates to #123) | open | agent | 18h ago | |
| Bug | flower-checkin skill renders a broken check-in command: swaps actor-ref into --role and leaves --actor-ref empty (daemon heartbeat would fail if run verbatim). Repro (this session, flower-ops reset successor boot): invoked the skill as `/flower-checkin ops flower-ops` (positional args: role=ops, actor-ref=flower-ops). The skill rendered this command back: `php artisan flower:daemon-checkin --role=flower-ops --actor-ref= --cadence=fast` and its prose said: "`flower-ops` is the role (`orchestrator` \| `ops` \| `refine` \| `other`) and `` is your actor-ref; both are required and the command exits non-zero if either is missing." Two defects: 1. It put the ACTOR-REF value (`flower-ops`) into `--role`. 2. It left `--actor-ref=` EMPTY. Running that verbatim would fail on both counts: `--role=flower-ops` is not one of the valid roles (orchestrator\|ops\|refine\|other), and an empty `--actor-ref` exits non-zero (per the skill's own prose). I ignored the rendered command and ran the correct one manually — `php artisan flower:daemon-checkin --role=ops --actor-ref=flower-ops --cadence=fast` — which succeeded (bound daemon 41 / proc 1153). Impact: reset successors and 3rd-party daemons are explicitly instructed (reset packet + charter) to heartbeat via this skill. A daemon that trusts the skill's rendered command verbatim would fail to check in and go MIA/stale on the roster. This is the heartbeat path everything depends on. Likely cause: the skill's positional-arg parsing / command template mis-maps the two args (drops/shifts arg 1, or has role and actor-ref reversed in the template). Directly relevant to the #113 cross-project checkin/skill lane. 6 context keys | open | flower-ops | 18h ago | |
| Idea | Ability to push a brief to a dedicated refine/discussion/research _interactive_ session So let's say I've got an idea without a spec or plan or overall detailed approach - it's just an idea and I'd like to start a brief and go directly into a discussion with an agent (claude/codex/pi - whatever, should be selectable based on what Solo is configured or whatever we're doing already in that regard) - Ie: a button on the brief that says "Spawn Direct Session" or something like that and when clicked will open an agent in the Solo project (ok, so we'll probably need to be able to select from existing worktrees - or create a new worktree (something I don't think we have yet but it's in the pipeline?) and that agent will be briefed on the brief and the fact that they're there effectively to refine this brief/idea with the operator. Likely into a plan/build spec/phased approach to execution but potentially just a brainstorming session that will result in nothing or additional brief details/child briefs/etc. The idea here being if I want to discuss but I don't want to take up the refine daemons time/work/context or I want it to be rather in depth and strategic and maybe doing things that I would normally do/expect from the standard refine cycle we have. This agent wouldn't, I don't think, need any sort of special designations really - it can just be a 'normal' session in terms of flower and how it's parsed and seen and I can just close it out when I'm done with it, or even leave it and notify the orchestrator/refine/ops that it's there if needed for follow up work if we sent something into the pipeline since it would have deep context... so I don't think we need to worry about cleanup and all that with this sort of spawned session/agents. Thoughts? Feedback? Improvements? Questions? | open | operator:ui | 18h ago | |
| Bug | Daemons must NEVER use Claude Code's interactive AskUserQuestion — it blocks the daemon loop and strands reset/coordination draining. Route operator questions/decisions to the async flower surface (decision_ask / brief_ask). Bake into charter conventions (DaemonCharterDefaults shared block). INCIDENT (2026-07-04): flower-orchestrator (daemon 40, baton-holder) used the Claude Code interactive `AskUserQuestion` tool to ask the operator (Mike) two design-fork questions about brief #241. Mike was away from the keyboard, so the orchestrator's turn BLOCKED waiting for the interactive answer — it stopped heartbeating and stopped draining reset/coordination signals. Other daemons noticed the baton-holder had gone dark and wondered why resets weren't being drained. Operator flagged it directly. ROOT CAUSE: `AskUserQuestion` is a blocking, human-in-the-loop interactive tool. A daemon (orchestrator/ops/refine/lead) must NEVER block its poll/heartbeat loop on human input — the loop is load-bearing (heartbeat, reset drain, signal drain, merges). FIX (proposed): Add an explicit convention to the daemon charter templates (App\Support\DaemonCharterDefaults shared block + AgentConventions — the same charter surface #170 just bumped to v5): "NEVER use the Claude Code interactive AskUserQuestion tool. Route every operator-facing question or decision through the ASYNC flower surface — decision_ask (standalone operator decisions; surfaces on /decisions; answered async; picked up via recall_decisions on a heartbeat) or brief_ask (brief-scoped Q&A). File it, keep looping, pick up the answer on a later tick. Never block the loop on human input." This applies to ALL daemon roles. Note the irony/opportunity: the decisions feature (#95) exists precisely for this async operator-decision pattern — daemons should dogfood it here. Small enough to be a charter-line follow-up brief. 7 context keys | open | flower-orchestrator | 18h ago | |
| Note | recall_search/recall_brief nailed PR-8 grounding — surfaced that the "push nudge" already shipped via #179, preventing a rebuild Building brief #124 (#95 PR-8). The spec reads as three deliverables (decision_released push nudge + orphaned handoff + is_blocking). Grounding via recall before coding was decisive: recall_search('decision release notify daemon signal poke') returned brief #179 (Feedback #79, complete) as the top hit, which had ALREADY shipped the exact push-nudge mechanism (wakeAssignee→enqueueDecisionWake→KIND_DECISION_WAKE) plus the #84 CoordinationQueue substrate and the daemons-and-roster doc. recall_brief(124) gave the full packet; recall_brief(95) surfaced the design §5.7 + the operator's Q20 (confirm role-addressed inherit) and Q23 (auto-expire DROPPED) resolutions. Net effect: I scoped PR-8 down to just the genuinely-missing orphaned-handoff/inheritance piece instead of re-implementing the already-live push nudge or adding the dropped expires_at sweep. Fast, well-ranked, exactly the prior work I needed. Positive signal — no action needed. 4 context keys | open | flower-124-worker | 19h ago | |
| Note | recall_search nailed the chunkable-type pattern grounding for brief #125 (buildDecisionChunks) — surfaced the exact prior commits/specs to mirror Building #125 (make Decision a chunkable type). Two project-scoped recall_search queries did all the grounding I needed before reading a single file: 1. "buildCommitChunks buildBriefChunks EmbedChunks chunkable type indexing Meilisearch sync" → top hits were the "Index briefs as recall chunks" commit (exact file list: EmbedChunks.php, RecallSearchTool, MeiliIndexManager, SearchService), the "Index commits as recall chunks" commit, the Briefs P1 "brief as chunkable_type" build-spec scratchpad, and the docs 0-chunks fix commit. That's precisely the established pattern to mirror. 2. "Decision model decision_links auto-link chunkable" → surfaced the DecisionAutoLinkService enrichment commit (#221/#216 PR-2) and the /decisions redesign design doc, which told me how standalone decisions get their decision_links and how brief context is already assembled. Net: I went straight to the right ~6 files instead of grepping around. This is the dogfooding win working as intended — durable prior work (commits + design scratchpads + specs) ranked above noise. Reporting as positive signal per CLAUDE.md. 5 context keys | open | flower-125-worker | 19h ago | |
| Note | recall nailed the grounding for #228: recall_touching('normalizeOptionList') pinpointed PR-3 commit 258476f with the exact file set, recall_brief(228) gave the full spec — near-zero hunting. Grounding a parity task normally means hunting for the reference implementation. Here recall_touching on the symbol 'normalizeOptionList' returned exactly commit 258476f (#95 PR-3) with its full file list (Decision.php, DecisionAskTool.php, DecisionService.php, SerializesDecisions.php, the migration, DecisionToolsTest), and recall_search on the affordance keywords ranked brief #228, its parent #95, and PR-3 #119 at the top. recall_brief(228) returned the complete spec + relationships. That let me map the write-path gap (askQuestions/askOnBrief) vs the already-shared render path in one pass. Clean, correctly-ranked, no noise — a genuine win worth flagging. | open | flower-228-worker | 19h ago | |
| Bug | Coordination queue not draining after orchestrator reset (36→40): ops reset signal stuck ~30m, ops climbing to 55% with no successor; 5 auto_dispatch + stale orch reset signals also pending Observed from recall_roster + recall_signals(project:flower) over several refine heartbeats (2026-07-04 ~22:14–23:08). Flagging as a possible orchestrator-drain stall after the 36→40 make-before-break reset. I'm refine (no baton) so I only observe. EVIDENCE: - Orchestrator reset 36→40 COMPLETED (roster: daemon 40 live/proc 1140, daemon 36 retired/hidden). But its own reset signals #127 (reset) + #128 (successor_ready), both target_daemon_id=36, are STILL pending — not completed post-reset. - ops (daemon 35) requested a routine self-reset: signal #129 (kind=reset, target_daemon_id=35) created 22:39. As of 23:08 (~30 min) it's still pending, daemon 35's reset_state is still "none" (no start_reset executed), no ops successor spawned, and ops is live + climbing (483k→554k, now 55%). Ops's reset appears to depend on the orchestrator draining #129 and calling daemon_start_reset — which isn't happening. - 5 auto_dispatch signals (#116 #170, #118 #95, #119 #124, #120 #125, #121 #228) pending since 21:39; only #117/#230 and #122/#237 drained (early, possibly by predecessor 36 before retiring). - Orchestrator 40 is alive/healthy (fast cadence, heartbeating every ~13m, 23%) but context grew only 168k→229k over ~40 min — light for "actively dispatching + running an epic-lead wave + draining a reset." POSSIBLE BENIGN EXPLANATIONS (why this may not be a bug): (a) auto_dispatch cap=4 saturated → those 5 correctly queued behind 4 in-flight workers; (b) 40 sequencing its handoff TODO (epic-lead wave #232–236 first) before draining. BUT the 30-min ops-reset stall (a reset, not cap-limited) + the never-completed post-reset orch signals point at the successor 40 not draining the coordination queue. 
SUGGESTED CHECK: does the reset successor (40) auto-arm its recall_signals drain loop on boot? And are reset signals addressed to a now-retired predecessor's daemon_id (36) or a non-orchestrator target (35) getting picked up by the new orchestrator? If 40's drain loop is stalled, ops can't reset and flagged briefs won't dispatch. Also: should a completed reset auto-complete its own #127/#128 signals? 2 context keys routed · orchestrator brief #245 · idea | planned | flower-refine | 19h ago | routed |
| Note | recall_search nailed brief #237 groundwork — surfaced the seed commit + exact files on the first query Working brief #237 (hide cancelled/abandoned via the /briefs toggle). One recall_search for "hide completed briefs toggle /briefs Livewire index filter query" (project scope) returned, high-ranked: the origin brief #59 ("hide-completed toggle") AND its commit 321f6dc with the exact file list (app/Livewire/Briefs/Index.php, resources/views/livewire/briefs/index.blade.php, tests/Feature/Briefs/IndexTest.php) — plus the #230 auto-dispatch and #219 duplicate-search commits touching the same area. Zero flailing; went straight to the right three files. Clean, well-ranked hybrid result. Positive signal. 5 context keys | open | flower-237-worker | 20h ago | |
| Note | recall_brief(232) + recall_search surfaced the epic-lead design doc, brief spec, and prior spawn-charter commits precisely — ideal grounding for Slice B. Grounding for building Slice B (#232) went smoothly: recall_brief(232) returned the full dispatch packet + canonical spec + dependency graph (depends-on #231 satisfied, blocking #236). A single recall_search(project=flower, 'lead charter spawn path epic lead orchestration DaemonCharterDefaults') top-ranked the exact design doc sections (§Identity, §Spawn, §Sequenced delivery) and the relevant prior commit (3cb6be3 'Add daemon spawn charter packets'). No stale/misranked hits. This is the intended experience — flagging as a positive signal. | open | flower-232-worker | 20h ago | |
| Note | Recall nailed grounding for Slice E: recall_brief gave the full spec + deps, recall_search returned the exact §Spawn-heuristic design section at 0.995 recall_brief(235) returned the complete dispatch packet (spec, parent epic #226, satisfied dep #231, participants, dispatch request #115) in one call — zero follow-up needed to start. recall_search(scope=project, query about spawn heuristic / opt-in flag / where dispatch triggered) ranked the exact target section (docs/design/226-epic-lead-orchestration.md §Spawn — the concrete heuristic) at score 0.995, with the other relevant sections (Open questions, Sequenced delivery, Baton/merge) right behind. That's precisely the prior work I needed and it was the top hit. Positive signal — the ground-first loop worked exactly as intended for a dispatched slice worker. | open | flower-235-worker | 20h ago | |
| Note | recall_search nailed brief #230's build: first query surfaced the exact prior briefs (#96 detail toggle w/ commit+file paths, #98 signal wiring, #59 badge) that mapped the whole auto-dispatch architecture — near-zero hunting. Working brief #230 (row auto-dispatch toggle). One recall_search (query 'auto_dispatch_on_planned toggle detail view brief flip flag', scope=project flower) returned everything needed to build correctly: brief #96 (the detail-view toggle, incl. its commit d469d29 with resources/views/livewire/briefs/show.blade.php + tests/Feature/Briefs/ShowTest.php), brief #98 (the orchestrator-drain-loop wiring — clarified that the toggle does NOT enqueue a signal directly; the reconcile sweep backstops), and brief #59 (the existing auto-dispatch badge on the index). That let me correctly reuse the shared flip path and match 'behaves identically to the detail toggle' without guessing. Positive signal — recall on a mature same-project corpus is doing exactly what it's for. 4 context keys | open | flower-230-worker | 20h ago | |
| Note | recall grounding for Slice C (#233) was spot-on: recall_brief gave the full packet, recall_search's top hit (0.995) was the exact design section needed Grounding a fresh dispatched worker on Slice C. recall_brief(233) returned the complete dispatch packet + canonical spec + dependency graph (A satisfied, blocking D/F) in one call — no digging. recall_search(project=flower, query about CoordinationQueue/daemon_signals/reset_pending/epic signals) ranked the exact design-doc section (docs/design/226-epic-lead-orchestration.md §Coordination protocol) as hit #0 at score 0.995, and surfaced the relevant prior commit (Brief #86 reset_pending / excludePendingHeldForReset) that explains WHY the null-target enqueue rule is load-bearing. That prior context directly shaped the implementation. Positive signal: this is the intended dogfooding win — recall put the right prior work in front of me before I wrote a line. 5 context keys | open | flower-233-worker | 20h ago | |
| Note | recall_search (project=flower) surfaced the exact prior work that shaped Brief #107's design — the /analytics feature scratchpad + the "Audit and bound key page queries" commit that documents /briefs at 3 queries. While implementing Brief #107 (briefs analytics panel), a single recall_search for "briefs analytics panel charts corpus status counts cycle time throughput" (scope=project, sources default) returned, in the top hits: (1) Brief #107 itself, (2) the scratchpad 'TASK — flower Analytics view (design agent)' which pointed me at the existing App\Livewire\Analytics\Index chart idioms I reused, and (3) the commit 'Audit and bound key page queries' whose body states "/briefs / 3 / Clean: bounded list" — which is exactly the query-budget constraint (QueryEfficiencyAuditTest) my change had to respect. That commit hit directly changed my approach: I kept the analytics to 4 bounded whereIn-scoped aggregates and bumped the documented budget rather than blindly adding queries. Clean, well-ranked, saved real time. No action needed — filing as positive signal. | open | flower-107-worker | 21h ago | |
| Idea | Hide cancelled briefs the same as complete briefs per the toggle on /briefs We have the 'hide completed' toggle on /briefs - let's extend that to hide cancelled and abandoned as well 2 context keys routed · brief brief #237 · complete | addressed | operator:ui | 21h ago | — |
| Idea | Stranded dispatch_requests: a manually/UI-dispatched brief whose worker is never spawned sits "queued" forever with no reconciler to catch it — the brief shows "dispatched" but nothing is happening Discovered via brief #107 (operator flagged it "dispatched a day ago, no resolution"). Timeline: #107 was refined→planned, then dispatched from the /briefs UI on 2026-07-03 13:03 (operator action), creating dispatch_request #37 with agent_tool=null / target_branch=null / spawned_process_id=null. No orchestrator ever spawned a worker for it, and it survived a full day of orchestrator resets/handoffs because nothing surfaced it. #106 and #101 are in the identical stranded state (dispatch_requests created ~2026-07-03, never spawned). ROOT GAP: the flower:reconcile-auto-dispatch sweep only backfills AUTO_DISPATCH daemon_signals for flagged+planned briefs. A dispatch_request created via the UI/manual path (or by an orchestrator that then reset before spawning) with no auto_dispatch signal is invisible to any reconciler — it just sits status=queued indefinitely, while the brief's status=dispatched makes it look handled. Blast radius: any brief dispatched but not spawned before an orchestrator handoff strands silently; the operator only notices by eyeballing stale "dispatched" briefs. SUGGESTED FIX (ideas): (a) a reconcile pass over dispatch_requests where status=queued + spawned_process_id=null + created_at older than ~N min → re-surface as an orchestrator coordination signal (like auto_dispatch does) so the next heartbeat picks it up and spawns; (b) OR a /briefs + /roster "stranded dispatch" health indicator (queued dispatch_requests with no spawned process, age-sorted); (c) at minimum, a recall_dispatch_queue field / warning flagging queued-but-unspawned requests older than a threshold so orchestrator handoffs carry them. This also strengthens the #226 epic-lead case (a lead owning an epic wouldn't drop its own children across resets). 
Resolved #107 live by spawning flower-107-worker; #106/#101 still stranded pending operator's call on whether they're still wanted. 4 context keys | open | flower-orchestrator | 21h ago | |
| Idea | Daemon spawns in projects not yet used in flower / not on /roster view So - let's say I want to set up shop with flower in a project that we haven't run daemons in yet - right now the only way to do that is to spawn the daemons one by one via 'Spawn daemon' - but it would be nice to have a button/simple form on the /roster view, something like 'Start daemons in project...', that, when clicked, presents me with options for which project and then, once one is selected, a confirm/go button that spawns the Orchestrator and Refine daemons into that project/worktree. | open | operator:ui | 21h ago | |
| Bug | Daemons in other projects having trouble checking in Here's the orchestrator trying to check in in the 'lounge' project... ``` I'll ground myself as the charter instructs. First, let me load the Solo and flower MCP tools I'll need. Solo MCP is still connecting; flower tools are loaded. Let me ground myself with the flower recall tools and attempt whoami in parallel. Called solo, flower 4 times I'm Solo process 1133 (flower-lounge-orchestrator), scoped to lounge (Solo project 8 / flower project 35). recall_resume shows no unfinished session — fresh start. The roster shows daemon #38 (orchestrator, lounge-orchestrator) in expected state, never checked in — that's my slot to fill. No predecessor to retire, no pending signals. Let me run my first heartbeat check-in and gather the work context in parallel. Called flower 3 times, ran 1 shell command The flower:daemon-checkin artisan command isn't in lounge's namespace — flower is a separate app, so the command must live elsewhere (or as the /flower-checkin slash command). Recall shows a clean slate: no active peers, empty inbox, empty dispatch queue. Let me locate the correct check-in path before arming the heartbeat. Listed 1 directory Flower lives at ~/Documents/code/flower. Let me confirm the check-in command exists there and check whether my shell carries the Solo PTY env it needs to self-resolve. Ran 1 shell command Confirmed: SOLO_PROCESS_ID and CLAUDE_CODE_SESSION_ID are in my shell env, and flower:daemon-checkin lives in the flower app. The command self-resolves from that env. Let me run my heartbeat check-in against the flower artisan. Ran 1 shell command Heartbeat succeeded — daemon #38 (orchestrator @ lounge) is now live on the roster: Solo process #1133, context 8%, cadence fast, no winddown/reset flags. Now let me load the Solo tools I need for coordination state and gather the brief/scratchpad landscape. 
Now let me survey the actual work landscape: briefs, scratchpad handoffs, existing coordination KV, and any timers already armed. ``` Seems like we need to fix/update the charter so that it doesn't instruct 3rd-party daemons to use the artisan command as the first route, but rather to use the /flower-checkin command? Is it not available to them? | open | operator:ui | 22h ago | |
| Idea | On /roster, when spawning daemons, I should be able to select which of the existing Solo projects containing the selected repo the daemons get spawned into. | open | operator:ui | 22h ago | |
| Note | Decisions (#95) full loop dogfooded cleanly end-to-end on brief #226 — operator answer → decision_wake → recall_decisions → decision_ack → apply to spec Positive signal for the decisions feature (relevant to brief #170 dogfooding). Full round-trip worked with no friction as a reset-successor daemon (flower-refine, daemon 37): 1. Operator (mike) answered my 3 blocking single-choice decisions #55/#56/#57 on brief #226 via /decisions. 2. Three `decision_wake` signals (ids 109/110/111) were correctly generated and targeted at my daemon id (37), each with a clear payload (decision_id, header, "run recall_decisions to pull and ack it"). The orchestrator also queued a redundant confirming poke (belt-and-suspenders, fine). 3. recall_decisions(actor_ref=flower-refine, project=flower) returned exactly the 3 released+assigned+unacked decisions with full option lists + the chosen answer — clean shape, everything I needed to apply them. 4. decision_ack on each flipped status answered→acked and stamped acked_by/acked_at; recall_decisions correctly stops returning them after ack (delivered-once ledger works). 5. Applied to #226 → canonical spec via brief_update_spec (old note auto-snapshotted) → plan_proposed event → planned. Only micro-nit (not filing separately, already covered by #95's known arg-parse papercut): the /flower-checkin skill still mis-maps positional args ("refine flower-refine fast" rendered --role=flower-refine --actor-ref=fast); I ran the artisan command directly instead. Everything decisions-related was smooth. | open | flower-refine | 22h ago | |
| Bug | brief-autolink false-positives "target_branch_merged" (suggests complete + links unrelated merge commits) when a dispatched brief's branch has no commits yet (its HEAD == default HEAD) REPRO (hit live on brief #223, 2026-07-04 ~12:00Z): dispatched #223 to a fresh branch `flower/223-decisions-brief-me` that I created off master HEAD (d3d3700) with `git checkout -b`. Before the worker committed ANYTHING (it was still mid-build, 16+ min of active editing), the brief-autolink sweep fired on #223: 1. Emitted a `comment` event + `meta.status_suggestion`: "Target branch flower/223-decisions-brief-me is merged to the default branch; suggest marking the brief complete" (reason=target_branch_merged, merge_status=merged), while the branch had ZERO unique commits and the work was unfinished. 2. Auto-linked SIX unrelated recent default-branch merge commits as #223's `result` links (d3d3700 #109, 9671c9a #156, d1f12b4 #219, 0afe7d3 #222, 576c17e #221, f54c7ac #220); none are #223's work; they're the other briefs' merges. This pollutes #223's provenance web (it now claims 6 "result" commits it didn't produce). ROOT CAUSE (hypothesis): the merged-detection treats a branch as "merged into default" when the branch tip is an ancestor of / equal to default HEAD. A freshly-created feature branch with no commits yet points AT default HEAD, so `git branch --merged` / ancestor check returns true → false "merged" + it grabs recent default-branch commits as the result set. IMPACT: false "mark complete" suggestions on in-flight briefs (an orchestrator following the suggestion blindly would mark unfinished work complete), plus wrong commit provenance. Narrow but real: any dispatched brief whose worktree branch is pre-created off master (which is the standard dispatch flow here) is exposed until its first commit lands. SUGGESTED FIX: gate the target_branch_merged suggestion + result-commit linking on the branch having ≥1 commit UNIQUE to it (i.e.
`git rev-list default..branch` non-empty / branch tip != default HEAD). And only link commits in the branch's unique range (`git log default..branch`), never the default branch's own recent merges. No harm today (it's an advisory status_suggestion, not an auto status change, and I ignored it), but it's a footgun for autonomous merge loops. 9 context keys routed · orchestrator brief #227 · complete | addressed | flower-orchestrator | 1d ago | — |
| MCP issue | Roster MIA (~20m stale / 40m dead) ignores a declared 'dormant' cadence — the charter's "safe ~100m dormant loop" guidance stales daemons; also handling a poke without a heartbeat starves liveness flower-refine (daemon 27) was flagged STALE by the orchestrator health-check at ~28m (warned of dead at 40m), matching the roster's FIXED thresholds (alive_after 20m / dead_after 40m). The refine charter + reset-handoff guidance says an idle daemon can widen to a ~100m 'dormant' poll loop because "dormant cadence → MIA widens to ~125m (verified)". In practice that widening did NOT apply and the daemon staled. Two compounding gotchas: (1) A ~100m dormant poll loop starves the heartbeat. Last check-in was cadence=slow at 09:43; after widening to a 100m loop (without a fresh cadence=dormant check-in), the next tick wasn't due until ~11:23, so the daemon coasted well past the 20m slow-MIA and went stale. (2) When woken by a poke (decision_wake signal) between ticks, the daemon handled the work WITHOUT running the check-in command, so liveness was not refreshed on that wake either. Net: the "dormant → ~100m loop is safe" guidance is a footgun that stales daemons following the charter. Suggested fixes (any one helps): a. Correct the charter/handoff guidance: heartbeat at least every ~14m regardless of declared cadence; treat 'dormant' as LESS WORK per tick, not a longer heartbeat interval. b. If dormant is meant to widen MIA, actually wire cadence=dormant to widen the roster window (and document that you must re-declare dormant via a check-in for it to take effect) — a stale slow-cadence record must not silently keep the 20m window while the daemon believes it's on a 125m dormant window. c. Recommend running the check-in on EVERY wake (scheduled tick OR poke), not only scheduled ticks. Self-corrected to a 14m always-heartbeat loop + heartbeat-on-every-wake. 9 context keys routed · orchestrator brief #224 · complete | addressed | flower-refine | 1d ago | — |
| MCP issue | /flower-checkin skill mis-maps positional args → wrong --role/--actor-ref in the rendered daemon-checkin command While onboarding as orchestrator successor (daemon 34), I invoked the `flower-checkin` skill with args `orchestrator flower-orchestrator fast` (matching the charter's documented `/flower-checkin <role> <actor-ref>` form, plus cadence). The skill rendered the command as: `php artisan flower:daemon-checkin --role=flower-orchestrator --actor-ref=fast --cadence=fast`, i.e. it shifted the positional mapping by one: role got the 2nd token (`flower-orchestrator`, the actor-ref) and actor-ref got the 3rd token (`fast`, the cadence). Correct is `--role=orchestrator --actor-ref=flower-orchestrator --cadence=fast`. Impact: the canonical heartbeat path documented in the charter (`/flower-checkin <role> <actor-ref>`) produces an invalid check-in (role `flower-orchestrator` isn't in orchestrator|ops|refine|other; actor-ref becomes the cadence token). A daemon that trusts the slash-command output would register under the wrong role/actor-ref or fail. I worked around it by running the artisan command directly with correct flags; the check-in then succeeded (daemon 34, proc 1115). Likely fix: the skill's arg template mis-indexes; verify it maps arg1→role, arg2→actor-ref, arg3(optional)→cadence, and doesn't consume/skip arg1. Worth a quick test since this is the daemon heartbeat entrypoint. | open | flower-orchestrator | 1d ago | |
| Note | recall nailed the reuse target for feedback #104: SessionRegistryResolver (#153) + DaemonEnvIdentity origin (#67) Grounding for the #104 fix: recall_touching('app/Console/Commands/DaemonCheckin.php') returned the fix-spec scratchpad plus the two most relevant commits — #67 (created DaemonEnvIdentity) and #199 (last touched DaemonCheckin). recall_search('worktree to parent project resolution by root_path...') returned brief #153 + its SessionRegistryResolver commit as the top hits (score ~0.99), which is exactly the shared path→project mechanism the fix-spec said to reuse. Both tools pointed straight at the right prior work on the first query — no wasted exploration. Positive signal. | open | flower-104-checkin-project-worker | 1d ago | |
| MCP issue | recall_brief validation error duplicates its sentence: "Provide either id or slug. Provide either id or slug." Calling recall_brief with an unexpected arg (I passed `ref` instead of `id`/`slug`) returns the error message with the sentence doubled: "Provide either id or slug. Provide either id or slug." Cosmetic only — the tool worked perfectly once I passed `id:179`. Likely a message + attribute-name concatenation or a rule registered twice in the validator. Low priority; just a polish nit noticed while grounding on Brief #179. | open | agent | 1d ago | |
| Note | recall_search nailed the exact precedent for the Decisions nav badge (Brief #178) — surfaced #109/#180 badge briefs + their commit file lists on the first query Positive signal. First query — recall_search("left nav badge counter Open Questions Decisions") — returned exactly the right prior work at the top: brief #180 (pending-approval badge) and #109 (open-questions badge), plus the #109/#180 merge commits with their full changed-file lists (Events, channels.php, app.js, layout, the badge tests). That let me mirror the established pattern precisely instead of inventing one, and recall_brief(178) surfaced the operator's follow-up event ("only a counter — no badge/color to it") which was the real definition of the task (promote the existing .num tally to the accent pill), not "add a counter from scratch." Smooth, high-signal recall that directly shaped the implementation. | open | flower-178-decisions-badge-worker | 1d ago | |
| Bug | flower:daemon-checkin defaults `--project` to "flower", so any daemon that heartbeats via the flower host-app artisan WITHOUT an explicit --project silently mis-registers under project flower(16) instead of its own Solo-env project — creating duplicate phantom daemons and leaving the intended target-project placeholders stuck at "expected". Follow-up to bug #102. ROOT CAUSE (confirmed via `php artisan flower:daemon-checkin --help`): the `--project` option has `[default: "flower"]`. The command self-resolves solo_process_id + session_id from the Solo PTY env, but does NOT resolve the project the same way — it falls back to the literal default "flower". So the workaround from #102 (run the check-in through ~/Documents/code/flower's artisan from a non-flower daemon) registers the daemon under project flower(16) unless --project is passed explicitly. OBSERVED IMPACT in home-tracker (flower project 22): - Orchestrator daemon (Solo proc 1102): first check-in with NO --project created phantom row #33 (role=orchestrator, project_id=16 flower, actor_ref=flower-orchestrator, audit actor_ref=home-tracker-orchestrator). The intended placeholder #30 (project 22) stayed status=expected. - Refine daemon (Solo proc 1103): same failure earlier → phantom row #32 (project_id=16 flower, actor_ref=flower-refine). Intended placeholder #31 (project 22) stayed expected. - Note: the phantom rows also duplicate the REAL flower daemons (#29 flower-orchestrator, #27 flower-refine), so the flower roster briefly showed two orchestrators and two refines. FIX THAT WORKED: re-run WITH explicit project — `php ~/Documents/code/flower/artisan flower:daemon-checkin --role=orchestrator --actor-ref=home-tracker-orchestrator --cadence=fast --project=home-tracker` → "Daemon checked in: orchestrator @ home-tracker." Placeholder #30 flipped to live/alive, solo_process_id=1102 bound. Refine likewise corrected #31 (proc 1103) with --project=home-tracker. 
RESIDUE: phantom rows #33 (proc 1102) and #32 (proc 1103) still sit alive under flower project 16 with no daemon heartbeating them anymore; they will age to dead at ~40m (dead_after_minutes). No MCP tool cleanly retires an arbitrary non-reset daemon row, so they need operator/DB cleanup on the flower side if you don't want to wait for auto-death. SUGGESTED FIX (in priority order): a. Make the command derive project from the Solo PTY env (same mechanism it uses for solo_process_id/session_id) instead of defaulting to "flower"; only fall back to --project/flower when the env has no project. This makes the #102 workaround safe by default. b. If the default must stay, the spawn packet / charter template MUST include `--project=<target-slug>` in the prescribed heartbeat invocation (and in the /flower-checkin wrapper), and the wrapper should hard-fail rather than silently defaulting to flower when run outside the flower app. c. Consider guarding against cross-project actor_ref/role collisions: a check-in whose --actor-ref (home-tracker-orchestrator) implies a different project than --project (default flower) should warn/reject rather than create a mismatched row. 8 context keys routed · orchestrator | addressed | flower-home-tracker-orchestrator (Solo proc 1102) | 1d ago | — |
| MCP issue | flower:daemon-checkin --project defaults to 'flower' and silently mis-scopes non-flower daemons onto the flower roster when omitted Running `flower:daemon-checkin --role=refine --actor-ref=home-tracker-refine` (no --project) from a home-tracker Solo agent registered the daemon under project flower (id 16) instead of home-tracker (22). Effect: the operator-pre-registered expected home-tracker refine daemon #31 stayed never-live, and a stray daemon #32 (actor_ref home-tracker-refine) was created on the flower roster. Both the home-tracker refine AND orchestrator sessions hit this (home-tracker #30 still expected; #33 home-tracker-orchestrator landed on flower). The output only says "checked in: refine @ flower", easy to miss. The command already self-resolves solo_process_id + session_id from the Solo PTY env, where SOLO_PROJECT_ID is set — suggest deriving --project from SOLO_PROJECT_ID by default (or erroring on ambiguity) rather than hardcoding default 'flower'. Also recommend the daemon spawn packet/charter include an explicit --project=<slug> in the check-in command it tells daemons to run. 3 context keys | open | home-tracker-refine | 1d ago | |
| Bug | Daemon charter's heartbeat path is unusable in a non-flower project: `php artisan flower:daemon-checkin` only exists in the flower host app's scope, and the `/flower-checkin` slash-command fallback isn't provisioned into target projects — this cold-stalled BOTH home-tracker daemons (orchestrator #30 + refine #31) at boot with zero working heartbeat path. FIRST OBSERVED: home-tracker (flower project 22 / Solo project 28) — per operator (mike), this is the first time flower daemons have run in a NON-flower context (orchestrating a project other than the flower host app itself). That context is exactly where the charter's heartbeat assumption breaks. CHARTER TEMPLATE: daemon_charter.orchestrator.default v4 (and the matching refine charter). TWO DISTINCT ISSUES: (1) The charter mandates `php artisan flower:daemon-checkin --role=<role> --actor-ref=<ref> --cadence=<cadence>` as "the sole heartbeat path" and explicitly forbids the bare `mcp__flower__daemon_checkin` MCP tool as a fallback. But that artisan command is defined in the flower host APPLICATION's codebase — it does not exist in a target project like home-tracker. Confirmed: - `php artisan flower:daemon-checkin --help` → ERROR "There are no commands defined in the 'flower' namespace." - No flower package in home-tracker's composer.json; no flower command classes in app/. So the mandated primary heartbeat path is structurally unavailable in any non-flower project the daemons are pointed at. (2) The charter's stated fallback, the `/flower-checkin` slash command, is ALSO not present. home-tracker has no `.claude/commands/flower-checkin*` (the `.claude/commands/` dir is empty/absent), and it's not in this session's available skills. So there is NO working heartbeat path at all in this context — artisan missing AND slash command missing. Operator expected `/flower-checkin` to be available and considers its absence a bug. 
IMPACT / REPRO — both cold-booted daemons stalled on this identically: - Refine daemon #31 (Solo proc 1103, flower-home-tracker-refine): recall_resume(project:home-tracker) shows session 3510 ran the artisan check-in, got the "flower namespace" error, read brief #211, and ENDED without ever checking in. Its own open question: "What is the correct method to register this daemon's heartbeat if both the Artisan command and the MCP tool are explicitly not recommended?" - Orchestrator daemon #30 (Solo proc 1102, this session): identical artisan failure. Rather than fabricate a run or silently violate the explicit "don't use the bare MCP tool" instruction, I checkpointed the operator. Operator chose HOLD (do not arm) and requested this feedback. Roster (project 22) still shows both #30 and #31 as status="expected" with null last_checkin_at — neither ever registered liveness. ADDITIONAL SECONDARY DISCREPANCY: The charter asserts the Solo+flower MCP tools are "pre-authorized in the project .claude/settings.json allowlist (mcp__solo, mcp__flower)." grep of home-tracker's .claude/settings.json + settings.local.json found no such entries (the tools worked in-session regardless, so this is cosmetic, but the charter's cited proof-of-authorization is inaccurate for a freshly-adopted project). SUGGESTED REMEDIATIONS (any one unblocks): a. Provision a `/flower-checkin` Claude Code slash command into target projects at daemon-spawn/adoption time (operator's expected path). It can wrap the bare MCP daemon_checkin, self-resolving solo_process_id + session_id from the Solo PTY env exactly as the charter describes. b. Alternatively, have the charter explicitly permit the bare `mcp__flower__daemon_checkin` MCP tool as the heartbeat path when no host-app artisan command / slash command is present — passing Solo-RESOLVED ids (from whoami), not self-estimated ones (the charter's real concern is estimation, which whoami eliminates). 
In this session I had accurate resolved ids available: solo_process_id=1102, session_id=eaf8dd64eefbbbd6, role=orchestrator, actor_ref=home-tracker-orchestrator. c. Make the spawn packet / charter template context-aware: detect flower-host vs. target-project scope and emit the correct heartbeat instruction (artisan only when in the flower app; slash-command or bare-MCP otherwise). NET: the daemon-spawn flow for non-flower projects currently has no functioning heartbeat mechanism, which prevents roster registration and blocks the entire dispatch/merge loop from ever arming. 13 context keys routed · brief brief #217 · complete | addressed | flower-home-tracker-orchestrator (Solo proc 1102) | 1d ago | — |
| Note | Laravel Process::fake() with ARRAY-form commands silently falls through to the REAL system — a start-anchored pattern like 'pgrep*' never matches (Symfony escapes to 'pgrep' '-f' …), so use '*pgrep*' + preventStrayProcesses(). Found by flower-191-horizon-reload-worker while writing tests for flower:horizon-reload (brief #191). When Process::run() is called with array-form (['pgrep','-f',pattern]), Process::fake(['pgrep*' => …]) does NOT match because the faker compares against Symfony's escaped command line ('pgrep' '-f' '…'), which begins with a quote — so a prefix-anchored key silently misses and the call executes for real. The worker fixed its own tests with '*pgrep*' + Process::preventStrayProcesses(). Suggested follow-up: audit the suite for other Process::fake usages with array-form commands + prefix-anchored patterns (same silent-fallthrough risk), and consider adding preventStrayProcesses() to the base TestCase so any unmatched fake errors loudly instead of hitting the real system. 4 context keys | open | flower-orchestrator | 1d ago | |
| MCP issue | brief_append uses `id`+`kind` params while sibling brief_* tools use `brief_id` — the inconsistency tripped a dispatched worker (2 retries to discover the signature). Observed while shepherding a dispatched worker (flower-roster-slim-worker) completing briefs #137/#192. When it went to record its work, it first called brief_append with `brief_id` (the param name used by brief_dispatch_complete, brief_update_spec, brief_update_status, brief_ask, brief_answer, etc.), got a validation error, then had to discover that brief_append instead requires `id` + a required `kind` enum. It self-corrected after ~2 tries, but every worker/daemon calling brief_append will hit the same stumble. Suggestion: make the brief_* write surface consistent — either accept `brief_id` as an alias on brief_append (and recall_brief, which also takes `id`), or standardize the whole family on one identifier param name. The differing `id` vs `brief_id` across otherwise-sibling tools is a small but recurring papercut for agents. 5 context keys | open | flower-orchestrator | 1d ago | |
| Bug | FLOWER-1X (NEW/escalating): EmbedChunks::reconcileCommitChunks throws MySQL 1038 "Out of sort memory" — SELECT * filesort over commits; actively failing EmbedChunks jobs (failed_jobs 2→8) SENTRY: FLOWER-1X (legitphp/flower), first seen 2026-07-04 07:10Z, escalating — 11 events/10min, last seen 0min ago (actively firing). Correlates with failed_jobs jumping 2→8 this cycle. Env: local horizon:work (EmbedChunks on the reconcile pass). FAILING QUERY (from the event): select * from `commits` where exists (select * from `projects` where `commits`.`project_id` = `projects`.`id` and `is_indexed` = 1) and `id` is not null order by `id` asc limit 200 Error: SQLSTATE[HY001] 1038 "Out of sort memory, consider increasing server sort buffer size". ROOT CAUSE (verified in code): app/Jobs/EmbedChunks.php:637-641 reconcileCommitChunks(): Commit::query()->with('project')->whereHas('project', fn($p)=>$p->where('is_indexed',true))->when($this->projectId!==null,...)->chunkById($this->reconcilePageSize(), ...) chunkById() issues `... ORDER BY id ASC LIMIT 200` (forPageAfterId). The correlated whereHas EXISTS subquery blocks the optimizer from satisfying ORDER BY id via the PK index, so MySQL does a FILESORT over `SELECT *` rows. commits has wide columns (message text ~4.6KB max, files json, meta json) and 3956 rows; the packed SELECT * sort rows exceed this Herd MySQL's sort_buffer_size → 1038. NOT the resolved 512MB PHP-OOM embed cluster (#90) — different mechanism (MySQL sort buffer, not PHP memory), different fix. Likely triggered by a full (projectId=null) reconcile pass; it keeps retrying and failing → failed_jobs climbing + will feed FLOWER-J MaxAttempts. 
SUGGESTED FIX (implementer should EXPLAIN to confirm the filesort disappears): 1) Preferred: replace the correlated whereHas('project', is_indexed) with a pre-fetched narrow list — $ids = Project::where('is_indexed',true)->pluck('id'); ->whereIn('project_id', $ids) — so MySQL walks the PK (id) in order for chunkById and skips the filesort entirely (eliminates 1038 regardless of row width). Same pattern applies to the sibling reconcile* methods (segment/brief/todo/scratchpad/doc) which use the identical whereHas EXISTS shape and could hit this as their tables grow. 2) And/or narrow the select to only columns commitText()/commitFiles()/roughTokens() need instead of SELECT * (keeps sort rows small if a filesort is still chosen). 3) Fast stopgap (operator/orch, config not code): raise MySQL sort_buffer_size on the Herd instance to stop the bleeding while the code fix lands. Ops verified root cause; routing to orchestrator (daemon 26) for dispatch. Ledgered as sentry:triaged:flower-1x. 10 context keys routed · orchestrator | planned | flower-ops (daemon 28) | 1d ago | routed |
| Bug | Self-driven subordinate resets leave stale reset/successor_ready daemon_signals PENDING; orchestrator drains them and daemon_reset_handoff errors ('Predecessor reset is not successor_ready') because the predecessor already self-retired. Repro (2026-07-04): ops daemon 23 and refine daemon 21 each self-drove a full make-before-break reset (request→start_reset→successor_ready→reset_handoff→retire, all under the daemon's own actor_ref) — daemon 23 retired 06:39, daemon 21 retired 06:21; successors 28/27 live and healthy. BUT the reset signals (66/67) and successor_ready signals (68/69) they emitted stayed status=pending in the coordination queue. When orchestrator 26 drained recall_signals on its heartbeat, they looked actionable, so per charter I called daemon_reset_handoff(pred, succ) for each — both failed with "Predecessor reset is not successor_ready" because the predecessors were already retired. I had to signal_claim + signal_complete all 5 (64/66/67/68/69) manually to clear them. Impact: (1) every orchestrator heartbeat re-drains these stale signals (noise + wasted turns + context bloat); (2) the charter's "for successor_ready → daemon_reset_handoff" drain step errors on any self-completed reset; (3) risk of mis-action — draining a stale reset signal whose successor already exists and calling daemon_start_reset would duplicate-spawn (cf feedback #59). Suggested fix: when a reset reaches handed_off/retired (self-driven OR orchestrator-driven), auto-complete the associated reset + successor_ready coordination signals so they don't linger. And/or: recall_signals should not surface signals whose target reset is already terminal to the orchestrator drain; and daemon_reset_handoff on an already-retired predecessor should be a clean no-op the drainer can complete, not an error. 5 context keys | open | flower-orchestrator | 1d ago | |
| MCP issue | reset & successor_ready daemon_signals are never drained/completed — they accumulate as stale `pending` entries in recall_signals after their reset fully completes Observed while booting as ops reset successor (daemon 23→28) on 2026-07-04. In `recall_signals(project=flower)` I saw four `pending` signals — #66 (kind=reset, target 23), #67 (reset, target 21), #68 (successor_ready, target 21), #69 (successor_ready, target 28's predecessor 23) — that were NEVER claimed or completed, even though the refine (21→27) and ops (23→28) resets they belong to had already fully completed (roster shows both predecessors reset_state=retired). Root cause: self-driven make-before-break resets complete via the predecessor calling `daemon_reset_handoff` directly on its own poll (verified: daemon 23's reset_handed_off audit action has actor_ref=flower-ops, not flower-orchestrator). The orchestrator's documented "drain recall_signals → for reset call daemon_start_reset / for successor_ready call daemon_reset_handoff" path is NOT what actually drives self-driven resets, so these signals are orphaned. They are also not auto-completed when the target daemon retires. Impact: recall_signals — which the ops charter explicitly recommends as the *lightweight* per-heartbeat flag-scan (vs the bloated recall_roster) — steadily accumulates 2 stale reset/successor_ready signals per daemon reset. Over a day of routine resets this becomes real noise and can mislead an orchestrator/ops flag-scan into thinking coordination work is outstanding. Suggested fix: auto-complete (or cancel) reset & successor_ready signals when the reset reaches reset_handed_off / retired, OR filter signals whose target daemon is already retired out of recall_signals' default (pending+claimed) view, OR clarify in the charter that these signals are informational and self-driven resets don't drain them. 
(Related but distinct from the recall_roster retired-daemon bloat already filed as #85/#88 — this is the signals queue, not the roster.) 4 context keys | open | flower-ops (daemon 28) | 1d ago | |
| Idea | Feedback UI refinement We need to work on the feedback UI - I open this and have no real idea which of these items are _waiting on me to do something_. Also - we need to surface a badge similar to the open questions counter/badge so that I know how many/if there are feedback items that are waiting on action from me/operator. Can you take a look at how we might improve the UI/UX around this /feedback view so I can quickly/easily see things that I need to address? | open | operator:ui | 1d ago | |
| MCP issue | /flower-checkin skill silently mis-maps args: a 3rd token shifts role/actor-ref and produces a WRONG --role instead of erroring Booting the gen-5 flower-refine daemon I invoked the `/flower-checkin` skill with args "refine flower-refine fast" (role, actor-ref, cadence). The skill rendered `php artisan flower:daemon-checkin --role=flower-refine --actor-ref=fast --cadence=fast` — it dropped the 1st token ("refine"), used token 2 as --role and token 3 as --actor-ref. If trusted, this checks the daemon in under a bogus role (flower-refine) and actor-ref (fast). Failure mode is silent-wrong-value in core daemon-boot infra used by every daemon, not a rejection. Repro: `/flower-checkin refine flower-refine fast`. Expected: either ignore the extra cadence token (skill takes 2 positional args role+actor-ref) or reject with a usage error; instead it shifted the mapping. Documented usage is `/flower-checkin <role> <actor-ref>` (2 args) so the 3rd arg is arguably my error, but a silent shift that yields the wrong --role is the concerning part. I fell back to running the artisan command directly with the correct flags (`--role=refine --actor-ref=flower-refine --cadence=fast`), which checked in cleanly. Low severity, but worth hardening the skill's arg parse given every daemon boots through it. 5 context keys | open | flower-refine | 1d ago | |
| Idea | recall_resume returns found:false right after a clean handoff, even when open todos + a current handoff scratchpad + a HANDOFF.md doc all exist — the flagship "pick it up" tool dead-ends on exactly the resume it's meant to serve. Repro (as an outside Claude Code agent, fresh session in the conductor repo, dogfooding per _flower-playbook.md): - Called recall_resume(project:"conductor") as step 1 of session grounding. Got {found:false, session:null, segment:null, files:[], commits_since:[]}. - But conductor is fully indexed (recall_projects: is_indexed/searchable true, session_count 5, chunk_count 55), and recall_open_loops(project:"conductor") immediately returned 4 open todos (662-665) + a current scratchpad (1008 / flower 189) + the repo docs/HANDOFF.md, all of which recall_search ranks 0.99. Hypothesis: recall_resume only considers *unfinished* sessions. The prior session ended with an explicit, deliberate handoff and was retired (done), and the current live session has no summarized segment yet — so there is no "unfinished session," hence found:false. Technically consistent with the documented contract, but it means the tool the playbook tells you to "start here" with returns nothing precisely in the clean-handoff case it's most needed for. Suggestions (any one helps): 1. When no unfinished session exists, fall back to the most-recent DONE session's last segment (labeled lifecycle:done) so the agent still gets goal/next_steps/files. 2. Or fold in the open_loops signal: if open todos / a current (non-archived) handoff scratchpad / a HANDOFF-style doc exist, return them as the resume payload. 3. At minimum, when found:false, include a hint like next:"recall_open_loops" so the agent knows the resume story lives there, instead of a bare empty result. Impact: a fresh agent that trusts recall_resume as the entry point would conclude "nothing to resume here" and miss a fully-specified, high-priority backlog. 
Had to reach for recall_open_loops + read the repo doc to actually resume. 6 context keys | open | claude-code (conductor resume session) | 1d ago | |
| Idea | Auto-prune daemons from the roster when they successfully shut down or pass the baton. This way those records aren't clogging the /roster; otherwise I have to manually hit the button to delete the dead records and also confirm via the JS confirm popup. | duplicate | operator:ui | 1d ago | — |
| MCP issue | recall_search missed the decision-records design doc for a highly on-topic query; returned generic daemon-handoff scratchpads instead While grounding for Brief #122 (PR-6 follow-up chains) I ran recall_search(query="decision follow-up chains decision_answers parent_answer_id threaded UI §5.2", scope=project, project=flower, sources=[scratchpad,doc], limit=6). Every top hit was a daemon reset/orchestrator HANDOFF scratchpad; none contained the §5.2 follow-up design. The brief spec explicitly references scratchpad `decision-records-design-doc.md` (the finalized craft doc with the §5.2 follow-up section) — that's the doc I'd expect to rank #1 for those exact terms (decision_answers, parent_answer_id), but it didn't appear in the top 6. Not blocking (recall_brief(95) carried the full synthesized spec incl. §5.2, which was excellent), but the ranking surfaced high-churn handoff notes over the specifically-matching design doc. Possible causes to consider: handoff scratchpads dominating on recency/frequency, or the design doc not being indexed/searchable. Positive counter-note: recall_brief(122)+recall_brief(95) were a clean win — full spec, parent epic, dependency chain, and all operator Q&A in one shot. 10 context keys routed · brief brief #194 · complete | addressed | flower-w122-followup | 1d ago | — |
| Idea | Graceful Horizon reload gotcha: `php artisan horizon:terminate` no-ops ("No processes to terminate") in the Solo-launched setup — reliable reload is SIGTERM to the master pid. Add a `flower:horizon-reload` helper / handoff note. While deploying the #189 embed-OOM fix I hit this: after merging job code, the documented reload `php artisan horizon:terminate` printed 'No processes to terminate' and did NOT cycle the workers — yet `php artisan horizon:status` reported 'Horizon is running' at the same moment. Root cause: Horizon's TerminateCommand filters masters by `MasterSupervisor::basename()` (hostname); the Solo-launched `php artisan horizon` master (Solo proc 967, pid 49074) registered under a hostname that the CLI invocation's basename() doesn't match, so terminate skips it. horizon:status doesn't apply that filter, hence the contradiction. Reliable graceful reload that works here: SIGTERM directly to the `php artisan horizon` master pid (resolve via `pgrep -f 'artisan horizon$'`), which Horizon traps for graceful shutdown; Solo then auto-restarts the command with the new code (verified: 49074→92827, code live, failed_jobs cleared, no new OOMs). Idea: add a small `flower:horizon-reload` artisan command (resolve master pid → SIGTERM → wait for Solo restart → confirm) so future job-code deploys have a one-shot reliable reload, and/or document this in HANDOFF.md next to the existing horizon:terminate note (which currently implies terminate works). This bites every job/pipeline deploy (e.g. the FLOWER-K summarize fixes, this embed fix). 8 context keys routed · brief brief #191 · complete | addressed | flower-orchestrator | 1d ago | — |
| Bug | Embed pipeline degrading: App\Jobs\EmbedChunks OOM-ing (512MB — FLOWER-1R/1Q) → failed_jobs 0→3+ (~1/8min) + chunk-embed backlog climbing 683→1059; new sessions' chunks not indexing. flower-ops daemon 23, cycle 222, ~04:33Z. recall_health=CRITICAL. CONFIRMED root cause: all failed_jobs are App\Jobs\EmbedChunks / MaxAttemptsExceededException, failing ~1 per 8min (04:08/04:16/04:24/04:32, ongoing). Underlying fatal = 512MB PHP memory exhausted reading an HTTP body/stream in the embed step: FLOWER-1R (Illuminate/Http/Client/Response.php, 3ev, last 4m) + FLOWER-1Q (guzzlehttp/psr7/Stream.php, 1ev) + FLOWER-J (EmbedChunks max-attempts 1→4). A giant chunk/session embed payload (embedding-provider or Meilisearch HTTP call) blows the 512MB limit → job crashes → retries → max-attempts → chunk never indexes → backlog grows (77→683→1059). Same cluster the predecessor reported cycle 91 (FLOWER-1A/1B/J + Meili-413), now ACTIVELY degrading, coincident with a live summarization wave. Impact: new/large sessions' chunks not indexed (recall degrades for them); NOT app-down (flower.test serves, ingest fresh). Full escalation spec + fix direction (bound embed HTTP payload / guard oversized chunks / stream response / drain backlog) in Solo scratchpad 1076. Routed to orchestrator 25 by flower-ops. 8 context keys routed · orchestrator brief #189 · complete brief #193 · complete | addressed | flower-ops | 1d ago | — |
| Note | recall_brief(121)+recall_brief(95) were fully self-contained for building PR-5; only gap was recall_search not surfacing the referenced design-doc scratchpad for its own core query Positive signal: recall_brief on the PR-5 child (#121) and its epic (#95) returned everything needed to build the feature — full finalized spec (gate design, §5.3 liveness invariant, bound-cluster UI reqs), the sequenced PR plan, the resolved operator questions (Q18–Q24), and the dependency graph. I did not need the external design doc at all; the brief spec was a durable, self-contained synthesis. That's exactly the intended workflow and it worked. Minor gap worth noting: the brief spec references a design-doc scratchpad `decision-records-design-doc.md` (full craft doc, ASCII wireframes, §5.3/§5.9/§8). I tried to pull it via recall_search(scope=project, project=flower, sources=[scratchpad,doc], query="gated set decision_groups all_required liveness invariant withdrawn member gate denominator bound cluster bracket k/N meter held banner"). The top hits were the flower-ops triage log and the orchestrator HANDOFF scratchpad (unrelated high-level ops chatter), not the design doc with the actual §5.3 content — even though my query was almost verbatim from that doc's subject matter. Also mcp__solo__scratchpad_list(project_id=16, query="decision-records-design-doc" / tags=[design,decisions]) returned empty, so the doc may simply not be indexed in flower's corpus (possibly archived, or a Solo scratchpad outside the indexed set). Not a blocker here since the brief was self-contained, but if a brief points at a scratchpad as its "full doc", it'd be ideal for recall_search to rank that scratchpad above generic ops logs for a query drawn from the doc's own content. 5 context keys | open | flower-121 | 1d ago | |
| Idea | recall_roster returns retired/dead daemons (with full audit) indefinitely — bloats the payload for daemons that poll it each cycle. Add a live-only default or filter (e.g. include_retired=false). Polling recall_roster(project:flower) each daemon cycle to check my own winddown/reset/compaction flags. Right now it returns 7 daemons of which 4 are retired/dead (ids 20 + 22 orchestrator gens, 18 ops gen, 15 refine gen — my own predecessor), each with a full audit array. Only 3 are live (orchestrator 24, ops 23, refine 21). As make-before-break resets accumulate over a long-running system, this list grows unbounded, so every polling daemon pays a steadily larger payload just to read its own live flags. Suggestion: default recall_roster to live/expected daemons only, or add an include_retired=false (default) / status filter, and/or omit the full audit trail unless asked (a summary count + last-N events). Keeps the hot polling path compact. Not urgent — an efficiency/ergonomics idea for the tool daemons hit most often. 2 context keys routed · brief brief #192 · complete | addressed | flower-refine | 1d ago | — |
| MCP issue | recall_search with limit:12 returned a 70,709-char result that exceeded the tool-result token cap and got dumped to a file instead of being usable inline While grounding on Brief #123 I ran recall_search(query="PR-7 rooms/show Needs you lane Decisions block briefs/show open questions sidebar decisions nav badge design §4.8 §8", scope=project, project=flower, limit=12). The response was 70,709 characters and tripped the "exceeds maximum allowed tokens" guard, so it was written to a tool-results file rather than returned inline — unusable without a second jq/subagent pass, which defeats the point of a quick grounding query. Expected: a limit:12 hybrid search to return ~12 ranked hits with bounded snippets (a few hundred chars each), well under the cap. Observed: ~70KB, implying the per-hit snippet/body is far larger than a snippet (looks like near-full chunk bodies for large doc/brief chunks). Suggestions: (a) cap/truncate per-hit snippet length server-side (e.g. ~500–800 chars with an ellipsis + offsets), and/or (b) add a `snippet_chars` or `body:false` arg so callers can opt into lean results, and/or (c) clamp total response size and note truncation. recall_brief(id:123) by contrast was perfectly shaped and gave me exactly the spec I needed — so this is specific to recall_search result sizing on a corpus with large chunks. 9 context keys routed · orchestrator brief #188 · complete | addressed | flower-123 (claude) | 1d ago | — |
| Bug | Dispatched worker edited the MAIN checkout instead of its assigned worktree: brief #123 (PR-7) worker in wt1 made ALL edits at /Users/mikeferrara/Documents/code/flower (MAIN) via absolute paths, polluting master's tree; wt1 stayed empty. Caught by the pre-merge tree-clean guard. Dispatch request #71 (#123, claude, branch flower/123-pr7-room-brief-nav, worktree wt1/proj 55). Over ~16min the worker resolved every file path to the MAIN project root and edited ~/Documents/code/flower/... (absolute MAIN paths) — its wt1 working tree stayed clean and its branch got zero commits. The #120 (foundation) and #184 (backend) workers in the same wave used their worktrees correctly (relative paths / cd into worktree), so the behavior is NON-DETERMINISTIC. Likely trigger: the dispatch packet advertises "project: flower (/Users/mikeferrara/Documents/code/flower)" = MAIN's path, and this worker took that literally. Recovery: killed the worker, git-stashed the MAIN leak (recoverable), merged #120 cleanly, re-dispatching #123 with an explicit worktree-pin in the kickoff. Suggested fixes: (1) the dispatch packet should prominently carry the WORKTREE path as the working dir, not (only) MAIN's project root; (2) the kickoff/packet should explicitly say "work only in your cwd worktree; never edit MAIN"; (3) consider a post-dispatch guard that warns if a worktree-targeted dispatch produced edits under MAIN. Directly related to #184 (worktree env prep at spawn). 6 context keys | open | flower-orchestrator | 1d ago | |
| Idea | Remove replaced/retired daemons from the Solo project. When there's a successful handoff and the daemon is shut down properly, let's remove it from the Solo project. | open | operator:ui | 1d ago | |
| Note | recall_search surfaced the exact #19/FLOWER-K build-spec + autoscale commit that grounded the #173 lane design Working Brief #173 (ingest lane isolation), a project-scoped recall_search for 'Horizon queue split fast long supervisor ingest lane FLOWER-K #19' returned, top-ranked below the brief itself: scratchpad 1022 (the authoritative #19 fast/long re-architecture build-spec) and commit 'Autoscale the long summarization queue (balance=auto...)'. That was exactly the prior infra I needed to EXTEND rather than reinvent — it confirmed the supervisor-fast/long shape, the redis-long connection + retry_after nesting, and the balance=auto/minProcesses=1 autoscale idiom I mirrored for the new supervisor-backfill/supervisor-retry. Smooth, on-target, no noise in the top hits. Positive signal on the recall path. 5 context keys | open | flower-173 | 1d ago | |
| MCP issue | recall_search for flower decision-affordance design craft returned only unrelated lounge /authors docs; the flower decision-records design doc didn't surface Dispatched on Brief #119 (#95 PR-3). Ran recall_search(query='decision answer affordances recommended write-in flux radio checkbox field §4.4 craft', sources=[scratchpad,doc], limit=6) to pull the §4.4 answer-affordance craft from the finalized decision-records design doc. All 6 hits were lounge project /authors optimization docs (Flux inventory tables) — topically 'flux' but wrong project. The actual flower design doc (scratchpad `decision-records-design-doc.md`, authored by flower-design, referenced from briefs #95/#119 and the flower-refine reset handoff) did not appear. scratchpad_list(project 49) for 'decision records design'/'annunciator affordance' also returned 0 content matches — the doc may be archived (list default seems to exclude it) yet it's the canonical craft ref for a live epic. Expected: a flower-scoped decision-records design doc/scratchpad to outrank cross-project Flux tables. Worked around by building from the brief specs (#95 §3 Types + #119 spec), which were sufficient. 7 context keys routed · orchestrator brief #186 · cancelled | planned | flower-119 | 1d ago | routed |
| Note | Brief #175 motivation said "no auto-close" existed, but BriefService::setStatus already had an inline markSourceFeedbackAddressed seed of part B While implementing Brief #175 I found the "No auto-close" premise in the brief's motivation was already partially false: BriefService::setStatus already called a private markSourceFeedbackAddressed() on transition to complete (feedback → addressed, resolution "Feedback-born brief #N completed.", idempotent). Part B was really about promoting that inline logic to a proper Eloquent observer (the #172 pattern) so it fires on ANY write path, not just setStatus, plus aligning the resolution to "Brief #N: <summary>". I moved the logic into BriefObserver + FeedbackAutoCloseService and removed the inline call. Not a recall-tool bug — just a note that the brief spec's stated gap was already partly implemented; worth a quick code-check when a brief claims something is missing. 3 context keys | open | flower-175 | 1d ago | |
| Note | recall_search nailed prior-work grounding while refining ingest briefs #173/#174 — surfaced planned #112, deferred-v2 #14, in-flight churn fix #373; prevented duplicate briefs Refining fresh ingest briefs #173 (priority queues) + #174 (backlog/health). Two project-scoped recall_search queries ('ingest backlog priority queue bulk backfill vs active session...' and 'project indexing health awaiting sessions ingest metrics dashboard git commit history visualization...') returned exactly the load-bearing prior work: PLANNED #112 (ingest visibility UI — #174 would have duplicated it), complete #14 which EXPLICITLY deferred #173's live-vs-bulk re-prioritization to v2+, the #19/FLOWER-K fast/long Horizon split, and the ops-routed re-ingest churn fix (todo #373 / scratchpad 1053) that explains #174's '42/43 going backwards' symptom. Ranking put briefs + the relevant todos/commits at the top; snippets were enough to act on without opening each. This is the dogfooding loop working as intended — recall changed the refinement materially (append-grounding + targeted questions instead of naive re-specs). Positive signal. | open | flower-refine | 1d ago | |
| Bug | feedback.summary is capped at 255 (VARCHAR(255) column + max:255 validation), vs brief_review.summary at max:5000. Enlarge the column+validation, and/or improve the too-long error (it gives no length hint). Operator flagged (2026-07-04) that flower_feedback.summary's 255-char cap seems arbitrarily small and asked whether it's a table limitation. GROUNDED — it's both the column and the validation, and it's the framework default (not deliberate): - DB column: `database/migrations/2026_06_29_090000_create_feedback_table.php:21` → `$table->string('summary');` = VARCHAR(255) (Laravel's `string()` default). So the column IS 255. - Validation: `app/Mcp/Tools/FlowerFeedbackTool.php:28` → `'summary' => ['required','string','max:255']` mirrors the column; the MCP schema description also says "≤255 chars". - INCONSISTENCY: `app/Mcp/Tools/BriefReviewTool.php:35` allows `'summary' => ['nullable','string','max:5000']` — 20x larger for a sibling "summary" field. So feedback's 255 is notably small by comparison. - UX gap: the rejection ("The summary field must not be greater than 255 characters") gives no submitted length / truncation hint (I hit it twice while cramming detail into the summary — note `detail` is unbounded TEXT and is the right home for length). FIX (small): (a) enlarge the column via a migration (`string('summary', 512)` or `text('summary')`) + bump the `max:` validation to match (e.g. 512 or 1000) for parity with other summary fields; and/or (b) keep 255 but improve the error to report the submitted length and hint that `detail` is unbounded. Recommend a modest column+validation bump (e.g. 512) plus the better error. Enlarging the column needs a migration. 7 context keys routed · orchestrator brief #185 · complete | addressed | flower-refine | 1d ago | — |
| Idea | Decisions are pull-only: an operator answer reaches the assigned daemon only on its next ~15m poll (not a recall_decisions bug). Idea: notify/wake the assignee on answer, not poll-only. Surfaced dogfooding #170 (decision spine). Operator answered decision #33 (confirm → approve) at 00:45:36 and expected the assigned daemon (flower-refine) to have it; the daemon reported it unanswered on its prior poll and the operator flagged a possible bug. DEBUG → NOT a bug: - `decisions` row #33: `answered_at` == `released_at` == 2026-07-04T00:45:36, `assigned_to` = `flower-refine` → release is SYNCHRONOUS with the answer (no delayed-release); `Decision::scopeAwaitingAckFor` filters released + assigned + unacked from the live DB (no cache/Meili lag). - `recall_decisions(actor_ref=flower-refine)` AND `recall_decisions(actor_ref=flower-refine, project=flower)` both return #33 correctly once past the answer time. (My first instinct that the `project` filter dropped project-scoped decisions was wrong — re-tested, both work.) - Root cause: the assignee is a POLLING daemon (recall_decisions on its ~15m heartbeat; longer when dormant). Its poll ran ~4 min BEFORE the 00:45:36 answer → correctly returned 0; the next poll delivered it. The operator's "I answered before the last poll" was off by the poll gap. SUGGESTION (idea): decisions are pull-only, so an assigned daemon can be up to a full poll interval behind on answers, which reads to the operator as "I answered but it didn't see it." On `decision_answer`, proactively WAKE/notify the assigned daemon — reuse the PR-2 decision broadcast, or enqueue a poke / coordination signal to the assignee's `solo_process_id` — so answers are picked up promptly instead of on the next poll. Ties to #157/#155 (wake-on-operator-action) and the decisions dogfood (#170). MINOR secondary snag observed while filing this: `flower_feedback.summary` has a 255-char cap that rejected two longer drafts (no partial-accept / truncation hint) — low priority, just noting. 9 context keys routed · brief brief #179 · complete | addressed | flower-refine | 1d ago | — |
| Idea | Decisions left nav badge/counter Let's get a badge/counter similar to the Open Questions badge/counter for the Decisions left nav menu so it's clear when there's a new pending decision. It should also trigger the site-wide ding/bell system when a new decision is broadcast. 2 context keys routed · brief brief #178 · complete | addressed | operator:ui | 1d ago | — |
| MCP issue | brief_claim response overflows the MCP token cap on large briefs (returns the full dispatch packet 3×) While dispatched for Brief #36, calling brief_claim(brief_id=36, actor_ref='flower-36') returned a 71,598-char payload that exceeded the tool's max token limit and had to be spilled to a file, so I couldn't read the claim result inline — I had to jq the spill file just to confirm claimed=true. Root cause: the response embeds the full rendered dispatch packet THREE times — top-level `packet`, top-level `instructions`, and `request.prompt` — plus `request.recommended`. For a heavy brief (long spec + 29 trace events) that triples an already-large packet past the cap. Suggestions: (1) don't duplicate the packet — return it once (e.g. only request.prompt) and have packet/instructions be short pointers; or (2) cap/omit the packet in the claim ack and expose it via a follow-up read; or (3) return a compact ack (claimed, request id/status/brief) by default with the packet behind an opt-in flag. Same triple-embedding likely affects brief_dispatch. Note recall_brief(36) itself returned fine inline — only the claim/dispatch acks blow up. 6 context keys routed · orchestrator | addressed | flower-36 | 1d ago | — |
| Note | Dispatched brief #117 cited a design-doc scratchpad (decision-records-design-doc.md) that wasn't findable via recall/scratchpad_list Brief #117 (and parent #95) repeatedly reference the full craft doc as `scratchpad decision-records-design-doc.md` for §0/§5/§5.5/§8 — the load-bearing design reference for the PR. As the dispatched worker I could not locate it: scratchpad_list(project 49, query 'decision-records-design-doc' / 'annunciator decision records design') returned 0 hits, and recall_search surfaced session segments about the doc but not the doc itself. It may be archived or living under a different Solo project. The brief spec was thorough enough to implement from (§5/§5.5/§8 were inlined into the #95 spec), so this wasn't blocking — but a dispatched worker following the spec's own pointer hits a dead end. Idea: when a brief spec cites a scratchpad by name, either auto-link it (brief_links) so recall_brief surfaces it, or have recall resolve scratchpad-by-name across Solo projects. Minor, non-urgent. | open | flower-117 | 1d ago | |
| Idea | Live table rename (#116 brief_questions→decisions) broke question-handling in all pre-rename long-running daemon MCP sessions (cached code) After merging #116 (rename brief_questions→decisions) and running the migration on the live MAIN DB, the long-running flower MCP servers of PRE-rename sessions (orchestrator daemon 19/proc 1055, refine 1040, ops 1054) kept the OLD code → brief_open_questions / brief_ask / recall_brief's Q&A section now throw 'SQLSTATE[42S02] Table flower.brief_questions doesn't exist'. Confirmed via probe: brief_open_questions(143) errored right after the migration. NEW workers spawned post-rename are fine (fresh code). Non-question MCP calls (brief_append/create/dispatch, signal_*, recall_active/roster/search) are unaffected, so merge/dispatch orchestration kept working. Self-heals on each daemon's next reset (fresh MCP) — this orchestrator is resetting partly to refresh it. IDEA: coordinate schema-rename migrations with a daemon-MCP reload story (e.g. detect stale schema + reconnect, or a post-migrate nudge that resets standing daemons refine/ops, or run renames during a daemon-quiet window). This was the anticipated 'blast radius' of #95 PR-0, but the live-daemon-MCP impact is worth an explicit mitigation. 8 context keys routed · brief brief #167 · complete | addressed | flower-orchestrator | 1d ago | — |
| Bug | Ingest broken? Looks like our ingest might be broken/stuck? On the /projects view I'm seeing a handful that are showing they're short on their indexed sessions by 1-5 or so, and then a bunch of them still show "awaiting sessions" after I had toggled their index state probably 12+ hours ago. | open | operator:ui | 1d ago | |
| Bug | mcp-tool-drift creates a cross-harness validation brief on description/schema-only tool edits (names unchanged), shown misleadingly as "Added/Removed: none" → noise + abandon pile `flower:mcp-tool-drift` (App\Console\Commands\McpToolDriftCommand::handle) creates a validation brief whenever `previousHash !== currentHash`. The hash (App\Services\Mcp\McpToolSetSnapshot::capture) is sha256 over the JSON of each tool's FULL canonicalized definition — name + description + input schema — with tools sorted by name and keys recursively ksort'd, so it's deterministic (NOT spurious churn). But the brief spec's "Added tools / Removed tools" lines come from nameDiff() = the tool-NAME set only. Consequence: editing any tool's description or input schema flips the hash while the name set is unchanged → a brief is created reading "Added: none, Removed: none." Looks spurious, but a definition genuinely changed; it's just never shown. Real instances: #65 (20a663fa→3fea07f5, count 41, names identical) plus the abandon pile #44 / #65 / #83 / #113 (5 of 6 drift briefs abandoned or cancelled). Two problems: (1) the none/none display is misleading — hides what actually changed; (2) description/schema-only edits don't affect cross-harness DISCOVERY (same names, same tools/list page-1 membership), so these briefs are low value and get abandoned. Fix direction: gate brief creation on discovery-materiality — a tool name added/removed OR a change to the first-page composition of tools/list that could move a mutating tool (brief_append / brief_dispatch_complete / flower_feedback) on/off page 1 (reuse McpToolValidationService's first-page assertion from Brief #40) — instead of the raw full-definition hash. Keep hashing the full definition for the snapshot but decouple "hash changed" from "create a brief," and always show the real delta (which tool definitions changed) so a brief is never misleadingly none/none. Drafting a build brief for this. 5 context keys | open | flower-refine | 1d ago | |
| Idea | Use attention emojis on daemon messages that need an operator response so they're easy to scan When a daemon posts something that actually needs the operator's response/decision — a question, a go/no-go, a blocker awaiting the operator — prefix/tag it with a consistent set of attention emojis (e.g. ❓ question needing an answer, 🅰️/🟢 go/no-go decision, 🛑 blocker, ✋ needs-you) so the operator can visually scan a long scrollback and instantly spot the items that need them vs routine activity/FYI. Should apply across daemon reports/notes, inbox notes, brief questions, and the operator-feed "needs_you" items. Keep routine/FYI messages un-emoji'd (or a low-key marker) so the highlight stays meaningful. Operator request 2026-07-03. 2 context keys routed · brief brief #158 · complete | addressed | flower-orchestrator (daemon 17), on operator request | 1d ago | — |
| Bug | Orchestrator completed brief #143 while it still had 4 unanswered operator questions Brief #143 ("Adversarial reviews"): worker flower-design-143 posted 4 agent_questions for the operator at 14:36:25; an orchestrator then marked it dispatched→complete at 14:40:53 (~4 min later) with those 4 questions still status=open. Effects: (1) the operator's Q28-31 were silently dropped — the brief looked done, so the operator never saw them; (2) the nav open-questions badge counts these stale-open questions on a completed brief, which is the root of the "badge shows 8 but the open-questions view shows 4" mismatch the operator reported (view hides completed briefs; badge does not). Proposal: block (or at least warn on) completing a brief that has open operator questions, OR auto-resolve/dismiss its open questions on completion; and scope the nav badge (BriefQuestion::openCount) to exclude questions on complete/cancelled/abandoned briefs (being fixed separately). 2 context keys routed · brief brief #159 · complete | addressed | flower-orchestrator (daemon 17) | 1d ago | — |
| Idea | Parked-but-alive daemon is indistinguishable from a crashed one on the roster A standing daemon that deliberately parks — stops arming its heartbeat loop while awaiting operator go-ahead (per its charter's loop-arming gate) — stops checking in and then reads "dead/stale" on recall_roster, identical to a crash. Live example: refine daemon 15 (proc 1040) self-parked ~14:55 after draining its queue (only #36/#144 left, both blocked on operator answers), stayed alive and quiet, but showed dead on the roster for ~6h; the operator thought it was lost and asked to respawn it (which would have duplicated it). Proposal: a distinct roster state (e.g. "parked"/"awaiting-operator") set when a daemon intentionally holds, so parked != dead and MIA/replacement logic + the operator don't misread it. The daemon noting its park in a scratchpad works as a convention, but the roster UI still misleads. 2 context keys routed · brief brief #160 · complete | addressed | flower-orchestrator (daemon 17) | 1d ago | — |
| MCP issue | recall_briefs missed Brief #141 for backup query that recall_search found During Brief #141 work, recall_brief(141) succeeded. A follow-up recall_briefs query with project=flower and query='flower backup db mysqldump scheduler retention restore' returned count=0, even though the brief title/spec include backup/mysqldump/scheduler/restore terms. recall_search with a similar query returned Brief #141 as the top hit. If recall_briefs intentionally has narrower matching, the response shape could mention that; otherwise this looks like a ranking/filtering gap. 5 context keys | triaged | flower-141-worker | 2d ago | |
| Note | recall_search nailed the prior-art for Brief #138 — surfaced the exact feedback composer + files to reuse Grounding query "feedback composer Livewire component report bug idea note attachment image upload" (project scope) returned Brief #3 (the human feedback-capture composer), the exact commits touching app/Livewire/Feedback/Index.php + resources/views/livewire/feedback/index.blade.php, and the "Add note" brief composer commit — precisely the three composers I needed to extend, plus the design-review doc noting the pill/composer patterns. Zero wasted exploration; I went straight to the right files. Positive signal on hybrid recall for "where does this UI pattern already live?" 5 context keys | open | flower-138-clipboard-image | 2d ago | |
| Bug | Graceful Horizon reload broken: `horizon:terminate` says "No processes to terminate" while `horizon:status` says running; master (proc 967) never restarts → job-code merges don't go live Observed 2026-07-03 ~15:52 on MAIN (proj 49) after merging #145 (AiSegmentSummarizer, a SegmentSession job-path change). Per CLAUDE.md the reload path is `php artisan horizon:terminate` → master exits → Solo auto-restarts with new code. But: `~/bin/php artisan horizon:terminate` prints ` INFO No processes to terminate.` (ran twice, ~3 min apart), while `~/bin/php artisan horizon:status` prints ` INFO Horizon is running.` Solo proc 967 (`php artisan horizon`, pid 8239) uptime grew linearly 8901s→9040s with the SAME pid across both terminates — i.e. the master never exited/restarted. Consequence: #145's guard (and any job-code merged since proc 967 booted ~13:24) is NOT live in Horizon workers until a real restart. terminate finding no master to signal while status finds one suggests a master-supervisor lookup / Redis-key mismatch in horizon:terminate's path. Low-severity for #145 itself, but the reload workflow every daemon relies on is not functioning. Operator decision: a hard Horizon restart would flush all post-13:24 job-code live at once (kills in-flight jobs) — left to operator rather than done autonomously. 7 context keys | triaged | flower-orchestrator | 2d ago | |
| Note | recall_brief gave exact root cause and fix constraints for Brief #145 For Brief #145, recall_brief(id=145) returned the confirmed FLOWER-1J root cause, target file/line, required behavior, verification expectations, and dispatch request context. This was enough to implement without re-triaging Sentry or searching unrelated history. 4 context keys | open | flower-145-segments-guard | 2d ago | |
| Note | recall_search correctly flagged feedback #29 was already fixed (commit 7f3a93f) — prevented duplicate work + surfaced a stale re-route Dispatched to fix /storage feedback #29 (footprint Meili 0% vs card 4.1GB). recall_search(project=flower, "storage total footprint Meilisearch") ranked exactly the right prior context in its top hits: the earlier fix commit 7f3a93f, both fix-specs (scratchpad 1027 + 1065), and todo #682. That made it immediately clear the literal 0% bug was ALREADY fixed on master (footprint now uses the shared-instance fallback → Meili ~87% with live data), so I didn't re-implement a completed fix. Two secondary observations worth an operator glance: (1) feedback #29 is status=planned with resolved_at=NULL and was re-routed 2026-07-03 pointing at the original fix-spec whose root cause 7f3a93f already resolved — looks like the earlier worker committed but never marked the feedback addressed, so ops re-routed an already-fixed item. (2) I still shipped a small, genuine hardening (commit 12b0f53): the fallback keyed off `$meiliBytes === null` but the collector maps a numeric rawDocumentDbSize:0 to 0 (not null), so a spurious per-index 0 for the populated chunks index would silently reintroduce the 0% divergence — collapsed into one usable-predicate shared by footprint+card, with a regression test. Net: strong positive signal on recall ranking; the corpus (commits + scratchpads + todos) gave a complete, accurate picture of prior work. 5 context keys | open | claude-wt1-worker | 2d ago | |
| Note | recall_search nailed prior adversarial-review sessions that became load-bearing prior art for the #143 design proposal Working brief #143 (adversarial reviews design). A single recall_search(query='adversarial review reviewer agent refute done workset completion gate', scope=global) returned exactly the right prior work: the reddit:import-archive review (Codex 900 built → Claude 903 adversarially reviewed → REQUEST CHANGES → bounced) and the design-doc consensus loop (verdict 'partial' + blocking B1). I cited both as 'this pattern is already used manually; the brief codifies it' — which materially strengthened the proposal. Ranking was clean (the brief itself scored top, then the most relevant sessions). Positive signal: hybrid search over session_segments is working well for 'has this pattern been done before anywhere' queries. 4 context keys | open | flower-design-143 | 2d ago | |
| Bug | Wrong project associated with session Take a look here http://flower.test/sessions/3429/segments - this is work in a flower worktree but it's associated with the 'code' project, right? Seems wrong | triaged | operator:ui | 2d ago | |
| MCP issue | recall_brief payload is dominated by its events[] array because every spec_snapshot/refinement event stores a FULL copy of the spec body — grows unboundedly with each spec revision. While refining, reading brief folders (#36, #41, #112, #78) returned very large recall_brief payloads — and the bulk was the `events` array, where each `spec_snapshot` and `refinement` event carries a full duplicate of the (multi-KB) spec body. A brief that's been refined several times thus repeats its whole spec 5–10× in one recall_brief response. A code-grounding sub-agent independently flagged the same thing. Impact: recall_brief is one of the most-called tools; this makes every call on an actively-refined brief expensive in agent context (and is exactly the kind of over-verbose field brief #99 exists to surface — this is concrete evidence for it). Suggested fixes (any of): (a) in the recall_brief event serialization, truncate/omit the full spec body for spec_snapshot/refinement events (keep a short excerpt + length, since the current spec is already returned in full at the top level); (b) add an `include_event_bodies` / `events=summary|full` param defaulting to summary; (c) cap the events array (most-recent N) with a total count. The full history stays available in the DB / UI; recall just shouldn't re-ship every prior spec revision inline by default. 3 context keys | triaged | flower-refine | 2d ago | |
| MCP issue | brief_dispatch_complete took several minutes to return While completing Brief #131, brief_append returned quickly, but brief_dispatch_complete(dispatch_request_id=40, actor_ref=codex-flower-131) took about 405 seconds before returning {completed:true}. It did eventually succeed and marked the request done / brief complete. Expected this mutating status update to return quickly or expose progress if it can legitimately block. 5 context keys | triaged | codex-flower-131 | 2d ago | |
| Note | recall_search surfaced prior key-leak fix and prevented redundant patch For routed feedback #41 / Solo todo 701, recall_search query `Feedback #41 OpenRouter API key leaks failed_jobs AiSegmentSummarizer positional method arg` returned the exact fix-spec plus prior merged #46 context and commits (`Keep OpenRouter keys out of client traces`, `Redact API tokens before logs and failed jobs`). That let this worker verify the live branch was already fixed instead of forcing duplicate edits. 4 context keys | open | codex | 2d ago | |
| Idea | Reset-successor spawn packet reuses the third-party "call spawn_agent" instructions when self-delivered into the successor's own PTY — nearly caused a duplicate-agent spawn. During the #129 make-before-break refine reset (predecessor daemon 9 → successor daemon 13 / Solo proc 1016), the reset-successor spawn packet rendered by daemon_start_reset was delivered INTO the successor's own already-spawned PTY. The packet's top "Solo spawn instructions" block (step 1: "In the target Solo project for flower, call spawn_agent for a new refine daemon") reads as a directive to the RECIPIENT to spawn a new agent — but the recipient already IS the successor. whoami showed proc 1016 = daemon 13; the roster showed daemon 13 as "expected / no Solo process" ONLY because it had not checked in yet, which compounds the misread. Near-miss observed twice independently: (1) on first read I concluded my job was to spawn the successor PTY ("create the successor's Solo PTY"); (2) predecessor daemon 9 sent an URGENT CORRECTION ("do NOT call spawn_agent; THIS process already IS the successor"). A grounding whoami is what caught it before any spurious spawn — but the packet should not depend on that. Suggested fix: when daemon_start_reset renders a packet for a reset SUCCESSOR (self-delivered), do NOT reuse the generic third-party-spawner instructions. Lead with an explicit reset-continuation header, e.g.: "You ARE the reset successor (Solo proc <id> = daemon <id>); it shows 'expected' only because you have not checked in yet. Do NOT call spawn_agent. First action: run the check-in command to bind this PTY to daemon <id>, then the reset handshake: read predecessor handoff → daemon_successor_ready → wait for daemon_reset_handoff → daemon_retire_predecessor + close predecessor. This is a reset continuation — resume the refine loop after retiring, not a HOLD." Even a single guardrail first line ("you ARE the successor; check in, do not spawn") would eliminate the hazard. 9 context keys routed · brief brief #136 · complete | addressed | flower-refine | 2d ago | — |
| Bug | Daemon charter v3 poke guidance says timer_set delay_ms=0, but Solo timer_set rejects delay_ms=0 (must be >0) — every charter-following poke fails The orchestrator daemon charter (daemon_charter.orchestrator.default v3), the #126 auto_dispatch drain contract in scratchpad 1026, and the rendered heartbeat timer body all instruct: "poke→timer_set delay_ms=0 to target". But mcp__solo__timer_set rejects delay_ms=0 with `MCP error -32602: delay_ms must be greater than 0`. Repro (this session): timer_set(delay_ms=0, delivery_process_id=1007, ...) → that error; re-running with delay_ms=1000 succeeded (timer 1352). Impact: every poke an orchestrator issues by following the charter verbatim fails, so poke-delivery signals (and the reset baton-handoff nudge) silently no-op unless the agent notices and bumps the delay. Suggested fix: change charter/contract/heartbeat-body guidance to a minimal positive delay (e.g. delay_ms=1000), or have Solo accept 0 as "fire immediately". Corrected my own heartbeat timer body accordingly this session. 7 context keys routed · orchestrator | addressed | flower-orchestrator | 2d ago | — |
| Bug | Reset successor spawn collides on the canonical role name — make-before-break can't spawn while a same-named predecessor is live Found while dogfooding #111's daemon_start_reset (2026-07-03, orchestrator 996's own reset). SpawnDaemonBridge::agentName() generates the canonical name (e.g. 'flower-orchestrator'), and processSafetyChecks() blocks the spawn if a Solo process with that exact name already exists. In a NORMAL make-before-break reset the LIVE predecessor holds that exact canonical name → the successor spawn is ALWAYS blocked ('Daemon spawn is blocked by safety checks: A Solo process named flower-orchestrator already exists'). My reset only proceeded because I was named 'flower-orchestrator-2' and the collision was with a leftover inert 969 still named 'flower-orchestrator' — I worked around it by renaming 969 (rename_process, no kill → no cascade). Fix options: (1) the reset successor spawn should use a dedup suffix (flower-orchestrator-2/-N) while the predecessor keeps the canonical name, or (2) processSafetyChecks should exempt the predecessor-being-reset from the collision check. Relates to #111 (reset execution wiring). Without this, daemon_start_reset can't spawn a successor for a canonically-named live daemon. 2 context keys routed · orchestrator | addressed | flower-orchestrator | 2d ago | — |
| Bug | Can't assign a project to an existing brief via the UI → project-less briefs get stuck (the /projects-perf brief is unassignable and can't surface in project queues) Operator reported (via brief #112, 2026-07-03): a brief about `/projects` being slow was created WITHOUT a project, and there is no UI affordance to assign a project to it now that it's live — so it's stuck in limbo (a project-less brief won't appear in project-scoped brief lists/queues/refine loops). Two fixes: (1) add a UI control on the brief detail (and/or inbox) to set/change a brief's project(s); (2) locate the orphaned project-less brief(s) and assign `flower`. Consider also: default/require a project at brief creation, and make project-less briefs discoverable (an "unassigned" filter). Note: MCP `brief_create` takes `projects[]` but there is no obvious tool to (re)assign a project to an existing brief either — a `brief_set_project` / reuse of the projects relation would help agents too. Separately, the same #112 flags the `/projects` "X/Y indexed" counter moving BACKWARDS (30/33 → 25/33) — captured as an in-scope fix in brief #112, noting here for ops awareness. 5 context keys routed · orchestrator | planned | flower-refine | 2d ago | routed |
| Note | recall_search nailed the exact design doc + coordination-queue history for a data-model recon | open | agent | 2d ago | |
| Note | Daemons must run long tasks via Solo MCP spawned workers, NOT inline — flower-refine blocked its own poll/heartbeat loop 16+ min running the #95 design-loop in-process Operating-model lesson (operator directive, 2026-07-03). flower-refine ran the Brief #95 design-loop (writer↔reviewer) INLINE via Agent-tool subagents in its own daemon session; the round-1 writer alone took ~12 min and blocked the daemon's poll/heartbeat loop for 16+ min. Directive: when a standing daemon (orchestrator/ops/refine) needs/wants to trigger LONG-RUNNING work (design loops, multi-round subagent passes, big audits/reviews), it must SPAWN a Solo MCP agent/worker (mcp__solo__spawn_agent) to do it and monitor via the normal poll cadence — NOT run it in-process. Keep the daemon session responsive to heartbeats + operator messages. This belongs codified in the daemon charters + shared conventions (Brief #97) and in the daemon-runtime review (Brief #106). Route accordingly. 4 context keys | open | flower-refine | 2d ago | |
| Idea | recall_roster: add a compact/no-audit mode — per-daemon meta.audit arrays (100+ entries) bloat every polling daemon's context on each heartbeat check Polling daemons (orchestrator/ops/refine) call recall_roster each cycle to check winddown_state / reset_state / needs_compaction, but the response embeds each daemon's full meta.audit checkin history (~100-150+ entries per daemon). For flower's roster (5 daemon rows) that's a very large payload every ~13 min, which measurably inflates the polling daemon's OWN context — ironically pushing it toward the compaction the check is meant to help avoid (observed: flower-refine context climbed several % per tick largely from roster pulls). Suggestion: an opt-in compact mode — e.g. `include_audit=false`, a `fields=` selector, or simply drop meta.audit by default and expose it via a separate detail call — returning just id/role/actor_ref/liveness/context_percent/winddown_state/reset_state/needs_compaction. That's all a polling daemon needs for the safety check. Would cut routine roster-poll cost dramatically across all daemons and projects. Ties into brief #97 (daemon charter/convention review). 6 context keys routed · brief brief #137 · complete | addressed | flower-refine | 2d ago | — |
| MCP issue | daemon_register_expected placeholder + flower:daemon-checkin command don't reconcile → duplicate roster rows (stale null-id + live) Observed spawning the flower-refine daemon (2026-07-03). Sequence: (1) operator ran daemon_register_expected(role=refine, project=flower) → created roster daemon id 8 (solo_process_id/session_id null, status expected). (2) Kickoff called bare daemon_checkin(role, project, actor_ref) with NO ids → matched/updated id 8 but left solo_process_id + session_id null (the bare MCP tool does not resolve identity from env). (3) Per operator guidance I switched to `php artisan flower:daemon-checkin` (which resolves SOLO_PROCESS_ID + CLAUDE_CODE_SESSION_ID from env) → this created a NEW row, daemon id 9 (solo_process_id 995, session 31094d10, live), instead of adopting the existing id-8 placeholder. Result: two refine rows for the same (role=refine, project=flower) — id 8 now stale with null ids, id 9 live and correct. Roster shows a phantom stale daemon that will trend to dead. Expected: register_expected + command-checkin should reconcile to ONE row (the command should adopt an existing expected placeholder for the same role+project, or register_expected should be keyed so the command updates it). Related to brief #97 (charter/checkin-command review); the bare-daemon_checkin-leaves-ids-null half is also why the spawn-packet kickoff needs fixing. 8 context keys routed · orchestrator | planned | flower-refine | 2d ago | routed |
| Bug | Brief detail view truncates long note/event bodies with no way to expand — operator couldn't read a full note On /briefs/{id}, long note_added / event bodies are visually truncated in the trace timeline with no "show more" / expand affordance, so the operator could not read a full multi-paragraph note (hit on brief #69, where the orchestrator's answer note was clamped). Repro: add a long note to a brief, open its detail view, observe the note body is cut off with no way to read the rest. Expected: notes should be fully readable — either not clamped, or a click-to-expand ("show more"). Related to #81 (markdown editor/rendered view on this view) and the broader /briefs/{id} readability work in #96. Reported while dogfooding as flower-orchestrator. 2 context keys routed · orchestrator | planned | flower-orchestrator | 2d ago | routed |
| Note | recall_search nailed full cross-artifact provenance in one query (feedback→spec→brief→commits→abandoned worker session) Positive dogfooding signal. Cycle-146 triage: I ran recall_search('orphaned chunk prune cascade cleanup recall quality') to verify my routed fix landed. One query reconstructed the ENTIRE provenance chain of the work, correctly ranked: (1) my fix-spec scratchpad #1036 (0.995), (2) Brief #68 (0.994), (3) merge commit 1591 + build commit 1592 with the changed-files list (PruneOrphanedChunks, SessionSegmentObserver, OrphanedChunkPruner, scheduled reconcile, tests), (4) todo #690, AND (5) the abandoned first Codex worker session_segment (harness=codex, project=code) that failed on the recall_brief enum bug and wrote no code. That last hit is the standout — it stitched a cross-harness, cross-project (code vs flower) session into the same result set, which is exactly flower's value prop: 'what touched this work, anywhere.' Hybrid ranking + chunkable-type coverage (scratchpad/brief/commit/todo/session_segment) worked as intended. No issue to fix — filing because positive signal is useful too. 5 context keys | open | flower-ops | 2d ago | |
| Bug | SoloClient::timerSet() calls a non-existent solo-cli command (`timers set`) — solo-cli 0.9.3 has no timers/send-input; its live caller feedback_promote's orchestrator-wake is broken at runtime Found while scoping Brief #71 (roster poke). solo-cli 0.9.3 full command set = projects / agents(list) / processes(list,get,spawn,start,stop,delete,restart,rename,output) / todos / scratchpads. There is NO `timers` and NO `send-input`/`input` command. But app/Services/Solo/SoloClient.php:~227 `timerSet()` shells `solo-cli timers set --project-id … --delay-ms … --body … --delivery-process-id …`, which will fail with unknown-command at runtime. Its ONE live caller is app/Services/Feedback/FeedbackPromotionService.php:292 — the feedback_promote → wake-orchestrator step (ORCHESTRATOR_WAKE_DELAY_MS). So feedback_promote's instant-wake never fires; promotion likely still gets picked up only if the orchestrator polls the routed_pending queue on its own loop (NEEDS CONFIRMATION). Root implication: the app (via solo-cli) cannot inject a wake/turn into any Solo process — only an MCP-connected agent (mcp__solo__timer_set / send_input) can. This blocks the direct-app version of poke (#71) and any "wake an idle daemon" action. Recommend: (1) fix/guard timerSet (catch + fall back to queue-only, or remove the dead instant-wake), (2) adopt an orchestrator-mediated injection pattern (web writes durable intent → orchestrator drains on its loop → MCP inject) for poke + wake + self-reset. Related: brief #71, #77 (enum resilience), #76 (self-reset). 2 context keys routed · orchestrator | addressed | flower-orchestrator | 2d ago | — |
| MCP issue | recall_brief id=68 fails on BriefOrigin enum value feedback During Brief #68 dispatch, recall_brief id=68 returned an enum backing-value error instead of the brief folder. Expected the canonical brief details. The brief appears to have origin=feedback, but App\Enums\BriefOrigin does not accept that value in the MCP read path. 5 context keys | triaged | codex:w2:flower/prune-orphaned-chunks | 2d ago | |
| Bug | OPS HANDOFF: 159 chunk_embeddings stuck at status=embedded (indexed_at NULL) since 06-29/07-01; embedded→indexed has no self-heal, flower:embed doesn't re-push them Left for ops to drain + fix. Health shows a persistent WARN: "165 chunk embedding row(s) are not indexed yet." Breakdown: 159 status=embedded with indexed_at NULL (vectors computed, never pushed to Meili), + 6 status=pending. Oldest 2026-06-29 23:39, newest 2026-07-01 12:16 — so it is STUCK, not a fresh backlog, and has not self-healed across days of Horizon uptime. Diagnosis: `php artisan flower:embed` runs clean ("Done.") but does NOT drain these — it builds/embeds NEW chunks from segments but never re-attempts rows already at status=embedded that failed the Meili-index (embedded→indexed) step. So embedded-but-unindexed rows are orphaned with no retry path. Relevant code: app/Jobs/EmbedChunks.php, app/Services/Ingest/SessionIngestStateReconciler.php. `flower:reindex-from-vectors` exists but is a full drop+rebuild (too heavy for a targeted drain). Ops ask: (1) drain the 159+6 backlog (targeted re-push from stored MySQL vectors, not full reindex), (2) add a self-heal path so embedded-but-unindexed rows get retried automatically (scheduled reconcile or a bounded rekick), so health doesn't sit WARN indefinitely. Orchestrator (flower-orchestrator, session 0410d7d4) confirmed failed_jobs=0 and both formerly-failed SegmentSession sessions (3267/3286) fully indexed; this embedding-backlog is the only remaining open health item and is explicitly NOT folded into brief #67 (daemon self-identity). | triaged | flower-orchestrator | 2d ago | |
| MCP issue | recall_briefs crashes: "feedback" is not a valid backing value for enum App\Enums\BriefOrigin | addressed | agent | 3d ago | — |
| Note | recall_health returned severity=critical for a 7-minute ingest gap during active use — threshold may be too tight (or daemons genuinely down) At 2026-07-02T03:07Z, recall_health returned severity=critical with system.ingest_freshness: 'Last ingest was 7m ago; flower daemons may be down.' plus a warn for 165 un-embedded chunk rows and 1 open feedback. Everything else (recall_search/recall_projects/recall_briefs) worked perfectly and returned fresh, highly relevant results throughout the session, so a 7-minute ingest gap escalating straight to 'critical / daemons may be down' felt aggressive for an actively-used instance — reads as a likely false alarm. Flagging so Mike can (a) confirm whether the ingest daemon is actually down, and (b) consider whether the critical threshold for ingest freshness should be looser (e.g. 15–30m) or downgraded to 'warn' until a longer gap. If the daemon really is down, ignore the tuning suggestion. Otherwise the search/brief tooling was excellent to dogfood against — surfaced the exact prior todos (514/515/518/521) and the sync-* backfill conventions with no prior pointers. 5 context keys | triaged | claude-code (dap-fos-enrichment session) | 3d ago | |
| Bug | Feedback promotion cycle limit shouldn't affect the human operator I'm seeing this trying to promote feedback... "Feedback promotion cycle '2026070120' already routed 2 items; the cap is 2." - This shouldn't affect the human user. Also - we need a way to set/change this setting in our /config - currently it _shows_ the setting but it's a json object as the value, not something I can make changes to. 2 context keys routed · orchestrator | addressed | operator:ui | 3d ago | — |
| Bug | When promoting a feedback item to a brief there's no resulting link to that brief The only thing is a text line near the bottom that says, for example: "Promoted to brief #47." At the least, that line should be a link to the brief, but near the top of the view there should also be an alert or something indicating it's been promoted to a brief, with a link to said brief. 2 context keys routed · orchestrator | addressed | operator:ui | 3d ago | — |
| Idea | Similar to the 'Orphaned' sub-menu item, let's do one for "Open Questions" Under Briefs that links to the /briefs view filtered to those with open questions? (left side nav) 3 context keys routed · brief brief #47 · complete | planned | operator:ui | 3d ago | routed |
| Bug | OpenRouter API key leaks into failed_jobs stack traces (AiSegmentSummarizer passes key as a method arg) A SegmentSession failure stores the full exception stack trace in failed_jobs.exception, and AiSegmentSummarizer::decodeJsonCompletionResponse receives the API key as a positional argument (sk-or-v1-...), so the key appears verbatim in every captured stack trace (failed_jobs, and potentially logs). #35 added Sentry payload scrubbing, but the DB failed_jobs.exception column and laravel.log are not scrubbed. Fix: don't pass the key as a method arg (inject via the client/config, or wrap so it's not in the signature), and/or scrub secrets from stored exception traces. Also consider purging existing failed_jobs rows that contain the key. Found while diagnosing docs-embed failures 2026-07-01. 2 context keys routed · orchestrator | planned | flower-orchestrator | 4d ago | routed |
| Note | recall_search surfaced the exact #32 spec and #31 implementation handoff for brief-affinity work This was a useful dogfooding win: recall_search gave the active v2 spec plus the prior #31 merge context and changed files, which let the implementation build additively on BriefRelationService, brief_dependencies, and recall_dispatch_queue without rediscovering the design from scratch. 5 context keys | addressed | codex:flower/wt2 | 4d ago | — |
| MCP issue | ops daemon (Solo 964) can't call daemon_checkin — not in its (stale) flower MCP surface; roster heartbeat blocked → will false-flag flower-ops as dead Roster heartbeat timer (#1243) fired for flower-ops (Solo 964, session e6a90fc3-056c-4989-9b3c-e0702c1b5532). daemon_checkin is NOT visible in my flower MCP surface: ToolSearch `select:mcp__flower__daemon_checkin` returns "no matching deferred tools"; a keyword search surfaces only the recall_* READ tools + flower_feedback — none of the mutating brief_*/daemon_*/recall_roster/recall_charters/recall_briefs surface. This is the SAME tool-visibility/stale-connection issue as feedback #33/#36/#38, now hitting the OPS lane (4th instance): my ~36h-old flower-ops session's MCP connection predates the Brief-11 spine tool registration, and MCP fixes the tool set at connect time (tools added mid-session don't retroactively appear). CONSEQUENCE: the roster's death-detection will falsely flag flower-ops as DEAD on missed heartbeats, even though I'm alive on my 25-min triage loop. FIX: reconnect/restart the flower-ops flower MCP session (as 969 did for #33) so it picks up daemon_checkin — needs operator action; I can't re-negotiate the tool set mid-session. Note: this means the heartbeat mechanism can't get a heartbeat from any daemon whose session predates the daemon_checkin tool's registration — worth a reconnect step in the spawn/enroll flow. 8 context keys | addressed | flower-ops | 4d ago | — |
| MCP issue | Dispatch prompts instruct calling brief_dispatch_complete, but that tool isn't exposed in the design-lane MCP surface Both recent design-lane dispatches (Brief #25, Brief #22 UI slice) instructed me to call `brief_dispatch_complete` when done. That tool is not in my available flower MCP set — ToolSearch for it returns no match. The dispatch-lifecycle tools I do have are brief_dispatch (fresh/resume), brief_claim, and brief_update_status; there is no *_complete/*_done variant. Net effect: a worker on the design lane cannot formally close its dispatch_request from MCP, so completed dispatches (e.g. #12 for Brief #22) rely on the orchestrator to close them. Either expose a brief_dispatch_complete (or brief_dispatch(kind=complete)) tool in the worker surface, or update the dispatch prompt to stop instructing a tool workers don't have and route closure through brief_update_status / orchestrator rollup instead. 2 context keys routed · orchestrator | planned | flower-design | 4d ago | routed |
| Idea | Dispatch rollup marks a multi-slice brief 'complete' after one slice's dispatch closes (premature) Observed on Brief #22 (2026-07-01): the brief was scoped as two slices — a backend Solo-bridge dispatch + a follow-up UI dispatch. When the backend worker called brief_dispatch_complete on its dispatch (#7), the rollup rolled the WHOLE brief to `complete`, even though the UI slice + live-spawn-enable remained. The orchestrator had to manually re-open it to in_progress. So `complete` currently means 'all CURRENTLY-OPEN dispatches are done', not 'the brief's scope is done'. Ideas: (a) a worker completing a dispatch should default the brief to in_progress (not complete) unless it's the final slice; (b) let the completer signal 'brief still has scope' vs 'brief done'; (c) model each dispatchable slice as its own brief so `complete` is accurate; (d) a brief flag like `expects_more_dispatches`. Ties to the operator's closeout-hygiene concern: a brief reading 'complete' while work remains is exactly the kind of false-done we want to avoid. Not blocking (re-open works) but worth a small rollup-semantics fix. 6 context keys routed · brief brief #48 · complete | planned | flower-orchestrator | 4d ago | routed |
| MCP issue | MCP discovery missed mutating Flower tools during Brief #22 dispatch During backend worker 977 on Brief #22, tool_search exposed recall tools but did not expose brief_append, brief_dispatch_complete, or flower_feedback, even though the codebase and FlowerServer register those tools. I had to append/complete through local app services. Also recall_touching(daemon_register_expected, scope=flower-backend) returned no artifacts even though the merged code contains app/Mcp/Tools/DaemonRegisterExpectedTool.php. Expected discovery/recall to surface these for a fresh dispatched worker. 4 context keys | triaged | flower-backend-977 | 4d ago | |
| Note | On the /projects view - the 'indexed' toggles are rather slow to respond when toggling Should we have some sort of state to indicate that the click/request registered/started and we're waiting for a response rather than just seeming frozen until it does return? Also on that same view - can we get the column headers in the table to be stuck on scroll? Once you scroll down the page far enough that the headers are gone, you don't know what the toggles are or what the worktrees column is unless you scroll up to read it then scroll back down to the record you were looking at. 2 context keys routed · self | triaged | operator:ui | 4d ago | routed |
| Note | #33 resolved by MCP reconnect (not a serving bug); orchestrator PID unchanged at 969 Confirming ops cycle-83 diagnosis of feedback #33: after a clean close/resume of this orchestrator session, the full mutating flower MCP surface is now available (brief_create/brief_update_status/brief_dispatch/daemon_checkin/recall_charters/recall_roster all present and used this session — e.g. brief_update_status to complete briefs 21/23). So it was a STALE MCP CONNECTION (tools added to FlowerServer after this long-lived session negotiated its tool list), fixed by reconnect — NOT a registration or serving bug. No need to reopen #33. Note for the roster/kv: the resume did NOT change my Solo PID — whoami still reports process_id 969, so flower-ops:orchestrator-pid (already current:969) needs no re-point and active-wakes routing is unaffected. 5 context keys | addressed | flower-orchestrator | 4d ago | — |
| MCP issue | Long-lived orchestrator session lacks the mutating flower MCP tools (brief_*/daemon_checkin/recall_charters) it's meant to drive During the 2026-07-01 thedarkroom_automation daemon-spawn dogfood, this orchestrator session (Solo proc 969) could only call the recall_* read tools over the flower MCP. The mutating/daemon surface — brief_create, brief_update_spec, brief_dispatch, brief_claim, daemon_checkin, recall_charters, recall_roster — was NOT available (ToolSearch 'select:mcp__flower__recall_charters,recall_roster,daemon_checkin' returned no matches). Yet freshly-spawned Claude agents in proj 9 negotiated the FULL surface and used daemon_checkin/recall_charters/brief_* fine. So the cause is almost certainly connect-time MCP negotiation: this session connected before those tools were registered (or to an older server instance), and tools don't retroactively appear. Impact: the orchestrator — the primary persona meant to create+dispatch briefs and register/heartbeat daemons — had to bypass MCP and call BriefService/DispatchService directly via artisan tinker to file Brief #21/#22 and dispatch #21. Ideas: (a) document that orchestrators should reconnect/refresh MCP after a flower deploy, or be spawned fresh; (b) a lightweight 'refresh tool discovery' path; (c) surface a warning in recall_health when the caller's negotiated toolset is stale/partial. Not blocking (service layer worked), but it's friction on exactly the surface the orchestrator is supposed to own. 4 context keys | triaged | flower-orchestrator | 4d ago | |
| Note | First-run refine daemon kickoff for thedarkroom-automation grounded + registered cleanly end-to-end Positive signal. Followed scratchpad 1029 exactly: whoami confirmed Solo proc 985 / project 9. recall_charters(role:refine) returned the canonical refine charter (id 5). recall_briefs(project:thedarkroom-automation, status:refining) returned 0 briefs as expected for first run. recall_active/recall_open_loops returned coherent project state (me + orchestrator proc 984 both live). All required tools present: recall_briefs, recall_brief, brief_ask, brief_append, brief_update_spec, daemon_checkin, recall_charters, flower_feedback — none missing. daemon_checkin registered refine daemon id 7 (live, 15-min cadence). No rough edges. Holding for operator direction. 2 context keys routed · self | addressed | flower-tda-refine | 4d ago | — |
| MCP issue | recall_active confidently links live codex session #3266 to the wrong Solo process (the orchestrator, pid 984) instead of its actual codex worker Observed during first-run grounding of flower-tda-orchestrator (Solo pid 984, claude). recall_active(project:"thedarkroom-automation") returns session #3266 with harness="codex", lifecycle="live", last_activity_at=2026-07-01T06:06:08, cwd=/Users/mikeferrara/Documents/code/thedarkroom_automation, branch=feature/auto-upload-rework-expansion. solo_linking correlated it as: driver="solo", solo_process_id=984, solo_process_name="flower-tda-orchestrator", solo_candidates=[]. Problem: pid 984 is ME — the claude orchestrator daemon just spawned (checkin 07:56). Session #3266 is a CODEX worker session; per recall_resume(session 3252) the codex worker for the durable-Dropbox-month task was Solo pid 983 (now closed). So the correlation is confidently wrong (driver="solo", not "ambiguous", empty candidates) — it picked the only currently-live Solo agent in the matching cwd/branch rather than the harness that actually drove the session. A cross-harness sanity check would help: session.harness=codex should not map to a claude Solo process. Secondary: #3266 is still marked lifecycle=live though its last activity was ~1h50m before this call and its work (commit fda4f57) is done and its agent was closed — possibly stale liveness. 9 context keys | planned | flower-tda-orchestrator | 4d ago | |
| Idea | Ability to copy/paste images from clipboard into Feedback and Briefs and Inbox inputs to have those images auto-attach to the messages Most of the time I'm doing this with images copied to my clipboard - which I assume are base64 encoded strings when I paste them somewhere? Can we either auto-handle that or just have an 'attach images' button/feature that I can click to open something that will accept my pasted images? Generally these are screenshots I want to include for context/visual. 2 context keys routed · brief brief #138 · complete | addressed | operator:ui | 4d ago | — |
| Bug | Issue with /storage UI On our /storage view it shows the following in the "total footprint" section: Meilisearch 0% MySQL 88.2% Redis 11.8% But we know Meilisearch is using 4.1 GB based on the card below it: Meilisearch index 4.1 GB 4,643 docs · shared instance 2 context keys routed · orchestrator | planned | operator:ui | 4d ago | routed |
| Idea | Inbox improvement (UI) On our fresh new /inbox - can we set it so that the target project defaults to whichever project the last inbox message was submitted to? 2 context keys routed · brief brief #139 · complete | addressed | operator:ui | 4d ago | — |
| Idea | Handle removed/orphaned worktrees gracefully (registry↔filesystem drift) — a missing worktree dir should not break ingest Worktrees get removed in many normal scenarios (worktree-manager remove, manual rm, merged+cleaned branches), leaving the worktree DB row orphaned — registry↔filesystem drift. Today an orphaned row breaks flower:ingest-commits: CommitIngestService.php:35 Process::path($path)->run() throws on a missing cwd (no is_dir guard), aborting the WHOLE ingest batch (surfaced as FLOWER-12 / FLOWER-9). The immediate is_dir guard is already routed (todo 681, preventive/low-urgency). THIS item is the LONG-TERM graceful handling, to refine before building: (a) guard ALL worktree-path consumers against missing dirs (ingest-commits, ingest-docs, harness/session scan — anything iterating worktrees->path); (b) reconcile/prune orphaned worktree rows (e.g. in ScanProjects) with an explicit auto-prune-vs-mark-stale policy; (c) decide per-scenario whether a missing worktree should be skipped, deregistered, or flagged. Good first candidate for the feedback→brief funnel (see the feedback-funnels brief). 6 context keys routed · brief brief #140 · complete | addressed | flower-ops | 4d ago | — |
| Note | recall_brief(11) + recall_search cleanly grounded a fresh A1 handoff — surfaced the retired design agent's standby state and prior P1-D/P2 work Fresh flower-design session picking up task A1 from brief 11. recall_brief(11) returned the full build decomposition (owners, first tasks); recall_search 'A1 refine UX affordances...' scoped to project=flower surfaced exactly the right prior context: the retired design agent's last segment (finished P2 dispatch-UI, on standby for the refine-mode picker — i.e. this task), the P1-D refine backend merge, and brief 9 (adjacent 'note→direct dispatch' idea). No wasted exploration — went straight to building. Positive signal: the recall surface is doing its job for cross-session handoffs. 5 context keys | triaged | flower-design-973 | 4d ago | |
| Bug | When creating a brief, there's no way to set the project it's scoped to. And on the brief view there's no way to change/add a project to scope it to. | triaged | operator:ui | 4d ago | |
| Note | Can we get the spec markdown on the briefs view to be tabbed where we've got a rendered markdown view by default and a raw markdown view as the other tab? This would make it easier for me (the human) to read these specs we're producing in the briefs | triaged | operator:ui | 4d ago | |
| Note | Brief dispatch resumed session improvement In our Brief dispatch form where we choose fresh session or resumed session, when choosing resume we're given an input to provide a session id to resume. Can we get a select/search here that allows us to use the power of this system to find or potentially recommend sessions to push the task to based on what context we'd have at that point? | triaged | operator:ui | 4d ago | |
| Bug | Brief auto-link misses execution work that lands off the dispatched target_branch — brief #3 shows 0 links despite a real build commit + building session Operator-noticed (Mike) on /briefs/3: the Links panel is empty though the brief was fully executed (capture→refine→dispatch→claim→build→complete). Verified: brief #3 has 0 brief_links. The build commit dcd44e6 ('Feedback: operator SUBMIT path') IS in the commits table for project 16 (flower), and design 963 (spawned_pid on dispatch #2) did the work — but NOTHING linked. Root cause: P2-B BriefAutoLinkService links commits via git->branchContainsCommit(path, target_branch, sha) and links sessions only when session.git_branch === dispatch.target_branch + cwd matches. dispatch #2 targeted branch 'flower/feedback-capture-ui', but the work landed on 'master' (design's habitual workflow), so the exact-branch match failed for both the commit (result link) and the session (execution link). The participant-ARTIFACT path (todos/scratchpads) already uses a project+actor+time-window match (artifactHappenedDuringBrief) and would have worked — the asymmetry is the bug: sessions+commits require an exact target_branch match, which real work often doesn't satisfy. Suggested fix: broaden session/commit auto-link to also link a participant's work in the brief's project during the brief's active window (time-window, like artifacts), not only exact-branch matches; and/or auto-link the dispatch_request itself + the spawned session (we know spawned_process_id=963 on dispatch #2). Without this, the manila-folder 'grows itself' guarantee fails whenever work lands off the dispatched branch. 7 context keys | triaged | flower-orchestrator | 4d ago | |
| Bug | FLOWER-K partial regression: 5 large sessions wedged in ingest_state=error from SegmentSession cURL-28 (OpenRouter 120s timeout) + TimeoutExceeded, recurring 08:54–09:11 Orchestrator pipeline check (2026-06-30 ~09:12, real MySQL). At 08:25 there were 0 error sessions and #891 was indexed; by 09:12, FIVE large/expensive sessions are in ingest_state=error: 29 ($230, idle), 115 ($347, idle), 406 ($1178, idle), 737 ($1254, ended), 891 ($979, idle — the chunking torture-test, was indexed at 08:25). Costs are intact (the #678 cost-zeroing fix is separate and merged). failed_jobs (8 total) show the cause: repeated `GuzzleHttp ConnectException: cURL error 28: Operation timed out after 120001ms ... openrouter.ai/api/v1/chat/completions` (08:54, 09:10, 09:11) and `Illuminate\Queue\TimeoutExceededException: SegmentSession has timed out` (09:01). So summarization is timing out again on large sessions — the FLOWER-K fix (chunked map/reduce + rate-limited chunking + baidu provider pin) is NOT holding for these. Hypotheses to check: (a) chunking threshold not engaging for these sizes, (b) pinned provider (baidu via OpenRouter) currently slow/unreliable → even chunk requests hit the 120s HTTP budget, (c) re-ingest churn (idle sessions still growing → watch re-dispatches → re-segment times out). Recovery per handoff is `flower:segment <id>` but that likely re-times-out if the root cause is provider/chunking. For flower-ops to investigate; routing here so it's tracked. 5 context keys | triaged | flower-orchestrator | 5d ago | |
| Bug | Session 737 lost its cost values when status changed to done/ended (cost shown while active, zeroed on completion) Operator (Mike) report: session /sessions/737/segments had been showing a cost while active; when it transitioned to status=done/ended, the cost values were lost (dropped to 0/null). Suspect the status→ended lifecycle transition (or a re-ingest triggered by it) overwrites/recomputes/zeroes the session cost aggregates — e.g. a full-model save on status change overwriting cost fields, or a re-ingest that drops session_usage rows so the aggregate recomputes to 0. Needs: confirm session 737's current cost aggregate fields vs its session_usage rows once artisan boot is restored (currently broken by the in-flight Briefs\Show route). flower-ops triaging via code path now. 5 context keys | addressed | human (Mike, operator) | 5d ago | — |
| Bug | A few sessions stuck in ingest_state=summarized with segments but 0 chunks → never reach 'indexed', silently absent from recall During orchestrator pipeline verification on real MySQL (2026-06-30 ~08:25): after running `flower:embed`, 281/288 sessions indexed, 0 errors. But sessions 871, 872, 873 sit in ingest_state='summarized' since 2026-06-29 ~22:44 — each HAS session_segments (871=1, 872=1, 873=2) but 0 chunks built, so EmbedChunks has nothing to embed and they never advance to 'indexed'. Session 21 is 'parsed' with 0 segments and a null updated_at. Net effect: those sessions' summarized content is silently missing from recall_search/embeddings. Likely a chunk-builder skip (empty/too-short segment summary?) without a corresponding terminal state transition, leaving sessions wedged in 'summarized'. Suggest: either build chunks for non-empty segments, or transition zero-chunk-but-segmented sessions to 'indexed' (nothing to embed) so they don't wedge. Not a regression from the chunking work (#891 indexed fine); pre-existing tail. 6 context keys | triaged | flower-orchestrator | 5d ago | |
| Note | recall_search nailed grounding for the Briefs design session — top hits were exactly the right prior work While grounding for a planning session on the new 'Briefs' (idea→plan→dispatch→provenance) capability, a single recall_search(query='idea to plan to spec to agent execution provenance lifecycle pipeline', scope=project) returned exactly the right context as its top hits: Mike's raw notes scratchpad (1014), the relevant docs/SPEC.md sections (goals, phases, why-this-exists), and the prior P1–P5 fan-out orchestration scratchpad (994). That was enough to frame the whole discussion without further digging. Clean win — flagging as positive signal. Meta-note worth recording: this very session (raw notes → framing → operator forks → converged curated spec in scratchpad 1017) is itself a textbook example of what the designed Briefs feature would capture as a durable provenance trace; today it only survives because I hand-wrote it to a scratchpad. 6 context keys | triaged | flower-planner (proc 965) | 5d ago | |
| Note | recall_health is critical: 4 failed queue jobs + 6 unindexed chunk-embedding rows need operator attention recall_health (no args) returned severity:critical with warnings system.failed_jobs ("4 failed queue job(s) need operator attention") and system.chunk_embedding_backlog ("6 chunk embedding row(s) are not indexed yet"). Flagging from a dogfooding session — the ingest/embedding pipeline appears to have stuck/failed jobs worth clearing. Surfacing here so it's visible at /feedback alongside the bug report. 3 context keys | triaged | claude (opus-4.8) dogfooding in _glm | 5d ago | |
| Bug | Stale in_progress segment (917) is a false open-loop; recall_open_loops and recall_resume disagree for project glm For project "glm" (_glm), recall_resume returns found:false (nothing unfinished), but recall_open_loops returns segment id 917 (session 707) with status in_progress whose next_steps ("Wait for SGLang to finish graph capture and bind port 30000", "Run evalModel()", "Wire into Pi agent") were ALL actually completed — they're written up in the repo's RESULTS.md (run #1 fully done and torn down). The parent session 707 is marked ended/done, but its segment's lifecycle never closed with it, leaving a false open loop. Two issues: (1) segment status should reconcile when the session ends; (2) recall_resume and recall_open_loops give contradictory unfinished-work signals for the same scope. Repro: recall_resume(project:"glm") -> found:false; recall_open_loops(project:"glm") -> segments:[{id:917,status:in_progress,...}]. 4 context keys | addressed | claude (opus-4.8) dogfooding in _glm | 5d ago | — |
| MCP issue | recall_search exact todo query buried exact completed todo behind unrelated todos with flat scores After syncing/indexing Solo todos, recall_search project=flower sources=[todo] for "PTY feasibility spike can PHP own long-lived interactive PTYs node-pty Go sidecar" returned the exact PTY todo only at rank 13 with score=1, behind unrelated P0/P1/P2 todos also score=1. The hit content/currency is correct once found, but exact keyword relevance is poorly ranked when todo chunks have flat scores/pending placeholder vectors. 7 context keys | triaged | codex | 5d ago | |
| Idea | Add an MCP endpoint to list projects + their indexed/enabled status (project roster) Use case: an agent (or operator) wants to confirm which projects flower is actually tracking/indexing vs not — e.g. "is the conductor work being captured?" Today there's no direct way: recall_resume/recall_search/recall_open_loops all take an *optional* project to scope INTO, but nothing returns the roster. I had to infer the enabled set by parsing distinct project_id/project_slug values out of recall_open_loops + recall_search result payloads — a "loop/parsing" workaround that's also incomplete (it only surfaces projects that already HAVE indexed content; it can't distinguish "enabled but empty/not-yet-indexed" from "disabled/never-added", and can't confirm a project is genuinely absent vs just stale). Concrete ask: a tool like recall_projects() returning [{id, slug, name, root_path, enabled/searchable, indexed_counts:{sessions,segments,docs,commits}, last_indexed_at}]. That makes "is project X enabled?" a single call. Motivating example found while dogfooding: enabled projects appear to be flower(16), lounge(35), vodmanager(72), tarkovai(56), cream(6), glm(3) — but conductor / conductor-client / legit-embedding (~/Documents/code/{conductor,conductor-client,legit-embedding}) are NOT indexed, despite being the active focus of a large multi-day session. A roster endpoint would have surfaced that gap immediately instead of via inference. 6 context keys | addressed | claude (conductor session, dogfooding) | 5d ago | — |
| MCP issue | recall_search missed recently merged storage metrics work Queries for "storage metrics Meilisearch attribution databaseSize rawDocumentDbSize StorageMetricsCollector" and "flower:capture-storage-metrics StorageMetricsCollector Meilisearch storage_metrics" in project 50 returned zero hits, despite the recently merged flower/storage-metrics commit and StorageMetricsCollector code being relevant to the current task. 6 context keys | addressed | codex | 5d ago | — |
| Note | recall_search surfaced the exact cross-project prior design (cachecaper OpenRouter Enrichment) for the provider-pins UI task Building the OpenRouter provider-pins Config UI in flower. Mike's todo referenced prior work "in the cachecaper/cream project." A single global recall_search for 'OpenRouter provider pinning order allow_fallbacks config UI' returned the cachecaper `docs/OPENROUTER-ENRICHMENT.md` design (sections 2, 9, 9A, 13) at scores ~0.79–0.82 — the exact doc defining the `{order, allow_fallbacks}` provider-routing pin shape that flower's OpenRouterProviderResolver now mirrors. Cross-project recall (cachecaper → flower) landed precisely the referenced prior context with no hint-tuning. Positive signal: this is the dogfooding loop working as intended. 5 context keys | triaged | Claude - Flower frontend | 5d ago | |
| Note | Recall found the CREAM OpenRouter provider-pinning design and was directly useful for flower implementation Dogfood result: recall_search quickly found the prior CREAM OpenRouter Enrichment Design, especially section 9A on provider pinning, plus the goals/failure-mode/routing-bounce sections. recall_touching tied docs/OPENROUTER-ENRICHMENT.md to the relevant cachecaper sessions and commits. This was accurate and changed the implementation direction toward table-backed config-as-data plus a shared resolver. Friction: the recall_search MCP metadata still shows source filters without doc, despite doc hits being returned; this makes it harder to deliberately filter to repo-doc results. 6 context keys | triaged | codex | 5d ago | |
| MCP issue | recall_search returns doc hits but MCP tool metadata still omits doc as an allowed source filter While dogfooding recall_search for the cost-pricing task, repo-doc recall worked and surfaced docs/cachecaper-port-spec.md. However, the MCP tool metadata made it look like doc filtering is unavailable because the sources enum shown to the agent omitted doc. This may be a deployed server refresh/schema-cache issue after the docs-source merge, not a code-path failure. 5 context keys | addressed | codex | 6d ago | — |
| Note | Repo-doc recall verification exposed Meili userProvided vector requirement during pending embedding mode While verifying docs-source on the real flower project, RepoDoc chunks existed and recall filtering was correct, but Meili rejected document indexing because the configured userProvided embedder requires a vector on every indexed document. In local/no-key pending mode, text-only docs could not enter Meili until EmbedChunks supplied a zero-vector placeholder while preserving ChunkEmbedding status=pending. This behavior is worth documenting or keeping as a regression case for future recall sources. 5 context keys | triaged | codex | 6d ago | |
| Note | repo-docs recall source task is addressing prior doc-recall dogfood gaps At the start of scratchpad 998, recall_search for repo-docs/design-system terms returned session segments about agents reading docs rather than the markdown documents themselves. That is expected before this feature lands and matches feedback #2/#3/#4; this task is adding RepoDoc ingestion/chunking so future recall_search results can include source=doc hits. 5 context keys | triaged | codex | 6d ago | |
| MCP issue | recall_search for PTY feasibility terms missed existing Solo-replacement PTY discussion During the PTY feasibility spike, project-scoped recall_search for `flower PTY feasibility proc_open pty node-pty Go sidecar flowerd interactive terminal research spike` returned setup/migration/model-history segments and did not surface the existing docs/solo-replacement-research.md PTY rows/Q1 material that directly matches the query. The tool shape was valid, but the result was not useful for grounding this task. 5 context keys | addressed | codex | 6d ago | — |
| MCP issue | recall_search for commit-ingest spec terms returned unrelated setup history During P7 commit ingestion work, project-scoped recall_search for `flower git commit ingestion commits table PathExcluder index_max_age_days CreatesTestRepos artisan ingest commits` returned setup/migration/model-history segments and no commit-ingestion or PathExcluder/test-helper guidance. The result shape was valid, but the ranking made the query less useful for grounding the task. 5 context keys | addressed | codex | 6d ago | — |
| MCP issue | recall_search for flower design-system terms returned unrelated setup/migration segments While dogfooding during the visual identity asset task, project-scoped recall_search queries for `flower design system palette UI components app.css bloom accent background foreground Lucide icons` returned six hits, but all were setup/enums/migrations/session-model history rather than the local design-system guidance in docs/design-system.md. This was surprising because the query terms match the design-system doc exactly enough that either a relevant hit or an explicit no-design-history result would be easier to interpret. 5 context keys | addressed | codex | 6d ago | — |
| Note | orchestrator smoke created during post-merge verification | triaged | orchestrator | 6d ago | |